BUE DLO with synchronized replication - .sync files
Overview
In this scenario, the customer called with the following problem: fs 'vol0': Metadata stripe groups over 85 percent full
Problem Description: The customer is using BUE DLO (Desktop and Laptop Option) to back up the organization’s desktops and laptops to a pair of DXi2500s that perform synchronization replication to a DXi6520 (1.4.4). BUE DLO uses the DXis in much the same way as a NAS storage device.
Available Information:
The DXi6520’s disk usage was at 89%:
(/bin/df: /dev/cvfsctl1_vol0 7.9TB (total) 7TB (used) .88TB (available) -> 89% /snfs)
The available snfs inodes were down to 6%. For a successful DXi firmware upgrade, QTM recommends at least 50% free inodes:
cvadmin ->
snadmin (vol0) > show
Show stripe groups (File System "vol0")
Stripe Group 0 [A0_Metadata] Status:Up,MetaData,Journal,Exclusive
Total Blocks:1048431 (63.99 GB) Reserved:0 (0.00 B) Free:65590 (4.00 GB) (6%)
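The percentages above can be rechecked from the raw block counts. The following is a minimal sketch, assuming the "Total Blocks ... Free ..." line keeps the layout shown in this case; the names STRIPE_LINE, RECOMMENDED_FREE_PCT, and free_percent are hypothetical and are not part of cvadmin.

```python
import re

# Matches the block-count line printed under each stripe group by "show".
# Assumption: the layout is the same as in the output captured above.
STRIPE_LINE = re.compile(r"Total Blocks:(\d+).*?Free:(\d+)")

# Minimum free percentage quoted above for a firmware upgrade.
RECOMMENDED_FREE_PCT = 50.0


def free_percent(show_line: str) -> float:
    """Return free blocks as a percentage of total blocks for one stripe group line."""
    match = STRIPE_LINE.search(show_line)
    if match is None:
        raise ValueError("not a stripe group block-count line")
    total, free = int(match.group(1)), int(match.group(2))
    return 100.0 * free / total


line = "Total Blocks:1048431 (63.99 GB) Reserved:0 (0.00 B) Free:65590 (4.00 GB) (6%)"
pct = free_percent(line)
print(f"metadata free: {pct:.1f}%  meets recommendation: {pct >= RECOMMENDED_FREE_PCT}")

# The df figures give the same kind of ratio for capacity: 7 TB used of 7.9 TB.
disk_used_pct = 100 * 7.0 / 7.9
print(f"disk used: {disk_used_pct:.0f}%")
```

Working from the block counts rather than the rounded "(6%)" field makes it easier to see how far the metadata stripe group is from the 50% recommendation.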
Tech Support’s Initial Thoughts:
ASPS & SES began examining the file system in preparation for de-fragmentation. The idea was to find a way to clean up unused cvfs vol0 inodes.
The analysis and a breakthrough:
In preparation for de-fragmentation, a listing of files was compiled. It was discovered that the system had a very large number of .sync files in the share’s folder. A quick ‘find’ command turned up 102 .sync folders following the .sync, .sync1, .sync11 sequence, to the point where the last folder name carried a total of 101 trailing ones. Further examination of the NAS share found that under each of the .sync folders was a complete copy of all users’ files (essentially a full backup).
Figure 1: The Windows Explorer representation of a portion of the /Q/shares/ADOKTLT NAS share.
DXi developers relayed that: “The .sync1…. directory was created on the target during the recovery action of the synchronization. It should have been removed as part of the synchronization cleanup once the recovery completed.” As it turns out, each day during the synchronization replication process, if a .sync folder already exists, a new .sync folder is created with an appended ‘1’. Thus day 1 creates the .sync folder (the staging area for replication traffic), day 2 creates the .sync1 folder, day 3 creates the .sync11 folder, and so on.
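The naming pattern can be sketched in a few lines. This is only an illustration, assuming exactly one leftover folder per day as described; sync_folder_name is a hypothetical helper, not DXi code.

```python
def sync_folder_name(day: int) -> str:
    """Return the folder name left behind on the given day (day 1 is '.sync')."""
    if day < 1:
        raise ValueError("day must be 1 or greater")
    # Each day after the first appends one more '1' to the name.
    return ".sync" + "1" * (day - 1)


# 102 leftover folders, as found on the share in this case.
names = [sync_folder_name(day) for day in range(1, 103)]
print(names[:3])
print(len(names), "folders; last name has", names[-1].count("1"), "trailing ones")
```

Under that assumption, 102 folders correspond to 102 days of uncleaned synchronizations, with the newest folder name ending in 101 ones.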
Besides the shortage of available inodes, the customer was seeing very high DXi capacity utilization (89%). Examination of the .sync folders showed that, in this case, they contained 7,602,970 files using 2.46 TB.
Figure 2. The Windows properties of the .sync folders show 7,602,970 files using 2.46 TB.
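The same totals can be gathered without Windows Explorer. Below is a minimal sketch, assuming shell access to the share path; sync_usage is a hypothetical helper, and the numbers it reports are logical file sizes, not post-deduplication disk usage.

```python
import glob
import os
from typing import Tuple


def sync_usage(share: str) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for all .sync* directories under share."""
    file_count = 0
    total_bytes = 0
    pattern = os.path.join(glob.escape(share), ".sync*")
    for sync_dir in glob.glob(pattern):
        if not os.path.isdir(sync_dir):
            continue
        for root, _dirs, files in os.walk(sync_dir):
            for name in files:
                try:
                    # lstat so that symbolic links are not followed.
                    total_bytes += os.lstat(os.path.join(root, name)).st_size
                except OSError:
                    # File vanished or is unreadable; skip it.
                    continue
                file_count += 1
    return file_count, total_bytes


# Share path taken from this case; a missing path simply yields (0, 0).
count, size = sync_usage("/Q/shares/ADOKTLT")
print(f"{count} files, {size / 1e12:.2f} TB in .sync folders")
```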
At this point the questions are:
· Why didn’t the replication process delete these folders (and how do we ensure that future replications clean up after themselves)?
· How can we delete the .sync folders?
The solution:
Removing the folders from the command line (rm -Rf /Q/shares/ADOKTLT/.sync*) deleted the majority of the files. DXi capacity dropped from 89% to 78% and vol0 free inodes increased from 6% to 61%.
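For a more cautious run, the same removal can be scripted with a dry-run default and a record of any folder that fails to delete. This is a sketch only, not a Quantum-supplied tool; remove_sync_dirs and its dry_run parameter are hypothetical names.

```python
import glob
import os
import shutil
from typing import List, Tuple


def remove_sync_dirs(share: str, dry_run: bool = True) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Remove every .sync* directory under share.

    Returns (removed, failed), where failed holds (path, errno) pairs.
    With dry_run=True nothing is deleted and removed lists the candidates.
    """
    removed = []
    failed = []
    pattern = os.path.join(glob.escape(share), ".sync*")
    for path in sorted(glob.glob(pattern)):
        if not os.path.isdir(path):
            continue
        if not dry_run:
            try:
                shutil.rmtree(path)
            except OSError as exc:
                # For example errno 39, "Directory not empty".
                failed.append((path, exc.errno))
                continue
        removed.append(path)
    return removed, failed


# Dry run first: list what would be removed without deleting anything.
candidates, _ = remove_sync_dirs("/Q/shares/ADOKTLT")
print(len(candidates), ".sync folders would be removed")
```

Calling the function again with dry_run=False performs the deletion; the failed list then shows which folders need a closer look.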
A Snag:
The folders of a number of desktop/laptop users remained, and the remove command reported “errno(39): Directory not empty”. Examination shows that BUE DLO puts a DLO identifier in front of every file name.
Figure 3. Screenshot showing a file’s DLO identifier in Windows Explorer: [007g7cmg3$LguQ000P96Du]mastercamX5-web.exe.
Figure 4. In the case of [007g7cmg3$LguQ000P96Du]mastercamX5-web.exe, the DXi sees the file as [007g7cm3] mastercamX5-web.exe.
It appears that the DXi has trouble handling the ‘$’ character in the file’s DLO identifier. QTM engineering is looking into this. In the meantime, removing the .sync folders continues to return the dreaded “Directory not empty” error for folders whose DLO identifier contains the ‘$’ character. Apparently the DXi has a process holding onto the folder. A workaround is to stop the DXi heartbeat and then repeat the remove command (rm -Rf /Q/shares/ADOKTLT/.sync*).
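To gauge how many names are affected, file names with a ‘$’ in the DLO identifier can be flagged from a listing. The sketch below assumes the identifier is always the bracketed prefix seen in Figure 3, which is inferred from that single example; dlo_identifier and has_dollar_identifier are hypothetical helpers.

```python
import re
from typing import Optional

# Assumed name layout: "[identifier]filename", optionally with a space after "]".
DLO_NAME = re.compile(r"^\[([^\]]*)\]\s*(.*)$")


def dlo_identifier(file_name: str) -> Optional[str]:
    """Return the bracketed DLO identifier, or None if the name has no such prefix."""
    match = DLO_NAME.match(file_name)
    return match.group(1) if match else None


def has_dollar_identifier(file_name: str) -> bool:
    """True when the DLO identifier contains the '$' character."""
    ident = dlo_identifier(file_name)
    return ident is not None and "$" in ident


for name in ("[007g7cmg3$LguQ000P96Du]mastercamX5-web.exe",
             "[007g7cm3] mastercamX5-web.exe",
             "plain-file.txt"):
    print(name, "->", has_dollar_identifier(name))
```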
The Plan:
· Upgrade to 2.0.1.1
· Check free inodes and disk capacity
· Watch for the return of .sync folders