[gpfsug-discuss] naive question about rsync: run it on a client or on NSD server?
Jonathan Buzzard
jonathan.buzzard at strath.ac.uk
Fri Feb 14 21:09:14 GMT 2020
On 14/02/2020 16:24, Sanchez, Paul wrote:
> Some (perhaps obvious) points to consider:
>
> - There are some corner cases (e.g. preserving hard-linked files or
> sparseness) which require special options.
>
> - Depending on your level of churn, it may be helpful to pre-stage
> the sync before your cutover so that there is less data movement
> required, and you're primarily comparing metadata.
>
> - Files on the source filesystem might change (and become internally
> inconsistent) during your rsync, so you should generally sync from a
> snapshot on the source.
In my experience this causes rsync to exit with a non-zero exit
code. See later as to why this is useful. Also the file will likely have
a different mtime, which will cause it to be resynced on a subsequent
run. The final run will be with the file system in a "read only" state:
not necessarily mounted read only, but without anything running that
might change stuff.
[SNIP]
>
> - If you decide to do a final "offline" sync, you want it to be fast
> so users can get back to work sooner, so parallelism is usually a
> must. If you have lots of filesets, then that's a convenient way to
> split the work.
This final "offline" sync is an absolute must in my experience, unless
you are able to be rather woolly about preserving data.
>
> - If you have any filesets with many more inodes than the others,
> keep in mind that those will likely take the longest to complete.
>
Indeed. The last time we did an rsync, which was for an HPC system from
the pit of woe that is Lustre to GPFS, we found there was huge mileage
to be had from telling users that they would get on the new system once
their data was synced. It would be done on a "per user" basis, with
priority given to the users with a combination of the smallest amount of
data and the smallest number of files. It did unbelievable wonders for
getting users to clean up their files. One user went from over 17
million files to under 50 thousand! The amount of data needing syncing
nearly halved. It shrank to ~60% of the pre-announcement size.
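How the two measures were combined is not stated above, so the following
is only a sketch of one possible way: rank each user by data volume and
by file count, add the two ranks, and sync the lowest totals first. The
function name and the sample figures are made up for illustration.

```python
def sync_order(usage):
    """Order users for syncing, lightest first.

    usage maps a user name to a (bytes, file_count) tuple.
    The rank-sum rule is an assumption, not the method used in the post.
    """
    by_bytes = sorted(usage, key=lambda u: usage[u][0])
    by_files = sorted(usage, key=lambda u: usage[u][1])
    # Lower combined rank means less data and fewer files overall.
    rank = {u: by_bytes.index(u) + by_files.index(u) for u in usage}
    # Break ties by name so the order is deterministic.
    return sorted(usage, key=lambda u: (rank[u], u))


if __name__ == "__main__":
    sample = {
        "alice": (10, 1000),   # hypothetical: 10 bytes, 1000 files
        "bob": (5, 10),
        "carol": (500, 50),
    }
    print(sync_order(sample))
```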
> - Test, test, test. You usually won't get this right on the first go
> or know how long a full sync takes without practice. Remember that
> you'll need to employ options to delete extraneous files on the
> target when you're syncing over the top of a previous attempt, since
> files intentionally deleted on the source aren't usually welcome if
> they reappear after a migration.
>
rsync has a --delete option for that.
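As a rough sketch, the options discussed so far can be collected into one
rsync invocation: -a for archive mode, -H for hard links, -S for sparse
files and --delete for removing extraneous files on the target. The
helper name and the paths below are hypothetical, and a --dry-run pass is
a sensible way to see what --delete would remove before doing it for
real.

```python
import shlex


def build_rsync_command(source, destination, delete=True, dry_run=False):
    """Return an rsync argument list for syncing one directory tree."""
    cmd = [
        "rsync",
        "-a",  # archive mode: recurse, keep permissions, times, owners, symlinks
        "-H",  # preserve hard links
        "-S",  # handle sparse files efficiently
    ]
    if delete:
        cmd.append("--delete")   # remove files on the target that left the source
    if dry_run:
        cmd.append("--dry-run")  # report what would change without changing it
    # Trailing slash on the source: copy its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", destination.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(shlex.join(build_rsync_command("/lustre/home/alice",
                                         "/gpfs/home/alice",
                                         dry_run=True)))
```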
I am going to add that if you do any sort of ILM/HSM then an rsync is
going to destroy your ability to identify old files that have not been
accessed, as the rsync will update the atime of everything (don't ask
how I know).
If you have a backup (of course you do) I would strongly recommend
considering getting your first "pass" from a restore. Firstly it won't
impact the source file system while it is still in use and second it
allows you to check your backup actually works :-)
Finally, when rsyncing systems like this I use a Perl script with an
sqlite DB. Basically it holds a list of directories to sync (you can
have both source and destination to make wonderful things happen if
wanted) along with a flag field. The way I use that is -1 means not
synced, -2 means the folder in question is currently being synced, and
anything else is the exit code of rsync.
If you write the Perl script correctly you can start it on any number of
nodes; just dump the sqlite DB on a shared folder somewhere (either the
source or destination file systems work well here). If you are doing it
in parallel, record the node which did the rsync as well. It can be
useful in finding any issues in my experience.
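The Perl script itself is not shown, so here is a minimal sketch of the
same idea in Python using the standard sqlite3 module. The table and
column names (jobs, source, destination, flag, node) are assumptions.
Each worker claims a directory inside a BEGIN IMMEDIATE transaction, so
two nodes should not pick up the same row, and stores its host name
alongside.

```python
import socket
import sqlite3

NOT_SYNCED = -1    # flag value: directory not yet synced
IN_PROGRESS = -2   # flag value: a worker is syncing it now


def open_queue(path):
    """Open (and if needed create) the job database at path."""
    # isolation_level=None: autocommit, transactions are managed explicitly.
    conn = sqlite3.connect(path, timeout=60, isolation_level=None)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               id INTEGER PRIMARY KEY,
               source TEXT NOT NULL,
               destination TEXT NOT NULL,
               flag INTEGER NOT NULL DEFAULT -1,
               node TEXT)"""
    )
    return conn


def add_job(conn, source, destination):
    """Queue one directory pair for syncing."""
    conn.execute("INSERT INTO jobs (source, destination) VALUES (?, ?)",
                 (source, destination))


def claim_job(conn, node=None):
    """Mark the next unsynced job as in progress and return it, or None."""
    node = node or socket.gethostname()
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, source, destination FROM jobs "
            "WHERE flag = ? ORDER BY id LIMIT 1", (NOT_SYNCED,)).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET flag = ?, node = ? WHERE id = ?",
                         (IN_PROGRESS, node, row[0]))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row


def finish_job(conn, job_id, exit_code):
    """Store the exit code of rsync as the job's flag."""
    conn.execute("UPDATE jobs SET flag = ? WHERE id = ?", (exit_code, job_id))
```

A worker would loop on claim_job(), run rsync for the returned pair, and
call finish_job() with the exit code. SQLite relies on file locking, so
it is worth testing that locking behaves on whichever shared file system
holds the DB before trusting it with many nodes.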
Once everything is done you can quickly check the sqlite DB for non-zero
flag fields to find out what, if anything, has failed, which gives you
the confidence that your sync has completed accurately. Also any flag
fields less than zero show you it's not finished.
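That check might look something like the following sketch, assuming a
table called jobs with source, flag and node columns (hypothetical
names). Flags above zero are rsync exit codes from failed runs, and flags
below zero are directories not started or still in progress.

```python
import sqlite3


def summarize(conn):
    """Return (failed, unfinished) lists of (source, flag, node) rows."""
    # Positive flag: rsync ran and exited with an error code.
    failed = conn.execute(
        "SELECT source, flag, node FROM jobs WHERE flag > 0 ORDER BY id"
    ).fetchall()
    # Negative flag: -1 not synced yet, -2 currently being synced.
    unfinished = conn.execute(
        "SELECT source, flag, node FROM jobs WHERE flag < 0 ORDER BY id"
    ).fetchall()
    return failed, unfinished
```

The rsync man page lists what each exit value means; 23 and 24, for
example, are both partial transfers, with 24 indicating source files that
vanished during the run.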
Finally you might want to record the time each individual rsync took,
it's handy for working out that ordering I mentioned :-)
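A small sketch of how the timing could be captured, assuming each rsync
is launched as a subprocess; timed_run is a hypothetical helper name. The
elapsed time can then be stored next to the exit code for that directory.

```python
import subprocess
import time


def timed_run(cmd):
    """Run cmd (an argument list) and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()  # monotonic clock: unaffected by clock changes
    result = subprocess.run(cmd)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed
```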
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG