[gpfsug-discuss] Best way to migrate data : Plan B: policy engine + rsync
Alexander Saupp
Alexander.Saupp at de.ibm.com
Tue Oct 23 06:51:54 BST 2018
Hi,
I agree, a tool with proper wrapping delivered in samples would be the
right approach.
No warranty, no support. Below is a prototype I documented two years ago
(before mmfind was available). The BP used an alternative approach, so it's
not tested at scale, but the principle was tested and works.
Reading through it now, I would re-test the scenario where files deleted on
the source also have to be deleted on the destination; that might now
require some fixing.
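One way to re-test the delete scenario independently of rsync is to compare path lists from both sides. The following is only a sketch under assumptions: `stale_paths` is a hypothetical helper, and both input files are assumed to hold one path per line, already made relative to the same root on source and destination.

```shell
#!/bin/sh
# Sketch: print paths that exist on the destination but no longer on the source.
# Both arguments are hypothetical list files with one relative path per line.
stale_paths() {
    src_list=$1
    dst_list=$2
    tmpdir=$(mktemp -d) || return 1
    # comm requires both inputs to be sorted in the same collation order
    LC_ALL=C sort -u "$src_list" > "$tmpdir/src"
    LC_ALL=C sort -u "$dst_list" > "$tmpdir/dst"
    # -1 drops lines only in the source list, -3 drops lines common to both,
    # leaving the lines that appear only in the destination list
    LC_ALL=C comm -13 "$tmpdir/src" "$tmpdir/dst"
    rm -rf "$tmpdir"
}
```

The output can be reviewed by hand before anything is removed on the destination, which gives a cross-check of whatever rsync --delete did.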
# Use 'GPFS patched' rsync on both ends to keep GPFS attributes
https://github.com/gpfsug/gpfsug-tools/tree/master/bin/rsync
# Policy - initial & differential. Add MODIFICATION_TIME > .. for incremental
# runs. Use MODIFICATION_TIME < .. to have a defined start for the next
# incremental rsync, and remove it for the 'final' rsync.
# http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_usngfileattrbts.htm
cat /tmp/policy.pol
RULE 'mmfind'
LIST 'mmfindList'
DIRECTORIES_PLUS
SHOW(
VARCHAR(MODE) || ' ' ||
VARCHAR(NLINK) || ' ' ||
VARCHAR(USER_ID) || ' ' ||
VARCHAR(GROUP_ID) || ' ' ||
VARCHAR(FILE_SIZE) || ' ' ||
VARCHAR(KB_ALLOCATED) || ' ' ||
VARCHAR(POOL_NAME) || ' ' ||
VARCHAR(MISC_ATTRIBUTES) || ' ' ||
VARCHAR(ACCESS_TIME) || ' ' ||
VARCHAR(CREATION_TIME) || ' ' ||
VARCHAR(MODIFICATION_TIME)
)
# Use exactly one of the following WHERE clauses per run.
# First run
WHERE MODIFICATION_TIME < TIMESTAMP('2016-08-10 00:00:00')
# Incremental runs
WHERE MODIFICATION_TIME > TIMESTAMP('2016-08-10 00:00:00') AND
MODIFICATION_TIME < TIMESTAMP('2016-08-20 00:00:00')
# Final run during maintenance. It should also handle deletes, so make sure
# to call rsync accordingly (--delete).
WHERE TRUE
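Since only the WHERE clause changes between the first, incremental and final runs, the policy file could be generated instead of edited by hand. A minimal sketch, assuming a hypothetical helper `write_policy` that takes an output file plus optional start and end timestamps (an empty string means "no bound"):

```shell
#!/bin/sh
# write_policy OUTFILE START END   (hypothetical helper, not part of GPFS)
# START and END are timestamps like '2016-08-10 00:00:00'; '' means unbounded.
write_policy() {
    out=$1 start=$2 end=$3
    if [ -n "$start" ] && [ -n "$end" ]; then
        # incremental run: bounded on both sides
        where="MODIFICATION_TIME > TIMESTAMP('$start') AND MODIFICATION_TIME < TIMESTAMP('$end')"
    elif [ -n "$end" ]; then
        # first run: everything older than END
        where="MODIFICATION_TIME < TIMESTAMP('$end')"
    elif [ -n "$start" ]; then
        where="MODIFICATION_TIME > TIMESTAMP('$start')"
    else
        # final run: no time filter at all
        where="TRUE"
    fi
    cat > "$out" <<EOF
RULE 'mmfind'
LIST 'mmfindList'
DIRECTORIES_PLUS
SHOW(
VARCHAR(MODE) || ' ' ||
VARCHAR(NLINK) || ' ' ||
VARCHAR(USER_ID) || ' ' ||
VARCHAR(GROUP_ID) || ' ' ||
VARCHAR(FILE_SIZE) || ' ' ||
VARCHAR(KB_ALLOCATED) || ' ' ||
VARCHAR(POOL_NAME) || ' ' ||
VARCHAR(MISC_ATTRIBUTES) || ' ' ||
VARCHAR(ACCESS_TIME) || ' ' ||
VARCHAR(CREATION_TIME) || ' ' ||
VARCHAR(MODIFICATION_TIME)
)
WHERE $where
EOF
}
```

For example, `write_policy /tmp/policy.pol '' '2016-08-10 00:00:00'` would produce the first-run policy, and `write_policy /tmp/policy.pol '' ''` the final one.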
# Apply policy; '-I defer' ensures the result file(s) are not deleted
mmapplypolicy group3fs -P /tmp/policy.pol -f /ibm/group3fs/pol.txt -I defer
# FYI only - look at results, ... not required
# cat /ibm/group3fs/pol.txt.list.mmfindList
3 1 0 drwxr-xr-x 4 0 0 262144 512 system D2u 2016-08-25 08:30:35.053057 -- /ibm/group3fs
41472 1077291531 0 drwxr-xr-x 5 0 0 4096 0 system D2u 2016-08-18 21:07:36.996777 -- /ibm/group3fs/ces
60416 842873924 0 drwxr-xr-x 4 0 0 4096 0 system D2u 2016-08-18 21:07:45.947920 -- /ibm/group3fs/ces/ha
60417 2062486126 0 -rw-r--r-- 1 0 0 0 0 system FAu 2016-08-19 15:17:57.428922 -- /ibm/group3fs/ces/ha/.dummy
60418 436745294 0 drwxr-xr-x 4 0 0 4096 0 system D2u 2016-08-18 21:05:54.482094 -- /ibm/group3fs/ces/ces
60419 647668346 0 -rw-r--r-- 1 0 0 0 0 system FAu 2016-08-19 15:17:57.484923 -- /ibm/group3fs/ces/ces/.dummy
60420 1474765985 0 -rw-r--r-- 1 0 0 0 0 system FAu 2016-08-18 21:06:43.133640 -- /ibm/group3fs/ces/ces/addrs/1471554403-node0-9.155.118.69
60421 1020724013 0 drwxr-xr-x 2 0 0 4096 0 system D2um 2016-08-18 21:07:37.000695 -- /ibm/group3fs/ces/ganesha
awk '{ print $19 }' /ibm/group3fs/pol.txt.list.mmfindList
/ibm/group3fs/ces/ha/.dummy
/ibm/group3fs/ces/ces/.dummy
/ibm/group3fs/ces/ha/nfs/ganesha/v4recov/node3
/ibm/group3fs/ces/ha/nfs/ganesha/v4old/node3
/ibm/group3fs/pol.txt.list.mmfindList
/ibm/group3fs/ces/ces/connections
/ibm/group3fs/ces/ha/nfs/ganesha/gpfs-epoch
/ibm/group3fs/ces/ha/nfs/ganesha/v4recov
/ibm/group3fs/ces/ha/nfs/ganesha/v4old
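Picking a fixed awk column only works while no path contains whitespace and the SHOW clause keeps the same number of fields. A slightly more robust sketch cuts each record at the first " -- " separator that is visible in the list file. `extract_paths` is a hypothetical helper name, and path names containing newlines are out of scope here.

```shell
#!/bin/sh
# Sketch: print only the path part of each list-file record.
# Reads the named files, or stdin when called without arguments.
extract_paths() {
    awk '{
        i = index($0, " -- ")       # position of the first separator, 0 if absent
        if (i > 0)
            print substr($0, i + 4) # everything after " -- " is the path
    }' "$@"
}
```

Usage would be along the lines of `extract_paths /ibm/group3fs/pol.txt.list.mmfindList > /tmp/paths.txt`, where the output file name is arbitrary.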
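# Start rsync - the single result file could be split into multiple ones for
# parallel / multi-node runs.
# --files-from expects a file that contains path names only, so extract the
# path column first (the name /tmp/rsync.files is arbitrary). The listed paths
# are absolute, hence '/' as the source; rsync recreates the full path below
# the destination directory.
awk '{ print $19 }' /ibm/group3fs/pol.txt.list.mmfindList > /tmp/rsync.files
rsync -av --gpfs-attrs --progress --files-from=/tmp/rsync.files / 10.10.10.10:/path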
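Splitting the result for parallel or multi-node runs could look roughly like this. It is a sketch only: `split_list` is a hypothetical helper, the input is assumed to be a file with one path per line, and the chunk naming (`PREFIX.0`, `PREFIX.1`, ...) is my own choice.

```shell
#!/bin/sh
# split_list LIST N PREFIX   (hypothetical helper)
# Distributes the lines of LIST round-robin over N files PREFIX.0 .. PREFIX.(N-1).
split_list() {
    list=$1 n=$2 prefix=$3
    awk -v n="$n" -v prefix="$prefix" '
        { print > (prefix "." ((NR - 1) % n)) }
    ' "$list"
}

# Possible use, one rsync per chunk (untested at scale, shown as comments only):
#   split_list /tmp/rsync.files 4 /tmp/rsync.chunk
#   for f in /tmp/rsync.chunk.*; do
#       rsync -av --gpfs-attrs --files-from="$f" / 10.10.10.10:/path &
#   done
#   wait
```

Round-robin keeps the chunks similar in entry count but not in data volume, and a directory may end up in a different chunk than the files below it, so directory attributes are worth checking after such a run.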
Be sure to verify that extended attributes are properly replicated. As far
as I remember, you need to ensure that the rsync on the remote side is not
the default one but the one with GPFS capabilities (rsync --rsync-path=...,
alongside -e "remoteshell").
Kind regards,
Alex Saupp
Alexander Saupp
IBM Systems, Storage Platform, EMEA Storage Competence Center
Phone: +49 7034-643-1512
Mobile: +49-172 7251072
Email: alexander.saupp at de.ibm.com
IBM Deutschland GmbH, Am Weiher 24, 65451 Kelsterbach, Germany