[gpfsug-discuss] Migrate/synchronize data from Isilon to Scale over NFS?

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Mon Nov 16 22:58:49 GMT 2020


On 16/11/2020 19:44, Andi Christiansen wrote:
> Hi all,
> 
> I have got a case where a customer wants 700TB migrated from Isilon to 
> Scale, and the only way for him is exporting the same directory over 
> NFS from two different nodes...
> 
> As of now we are using multiple rsync processes on different parts of 
> folders within the main directory. This is really slow and will take 
> forever... right now 14 rsync processes spread across 3 nodes fetching 
> from 2.
> 
> Does anyone know of a way to speed it up? Right now we see from 1Gbit 
> to 3Gbit if we are lucky (total bandwidth), and there is a total of 
> 30Gbit from the Scale nodes and 20Gbit from the Isilon, so we should 
> be able to reach just under 20Gbit...
> 
> 
> If anyone has any ideas, they are welcome!
> 
> 

My biggest recommendation when doing this is to use a sqlite database to 
keep track of what is going on.

The main issue is that you are almost certainly going to need to do more 
than one rsync pass unless your source Isilon system has no user 
activity, and with 700TB to move that seems unlikely. Typically you do 
an initial rsync to move the bulk of the data while the users are still 
live, then shut down user access to the source system and do the final 
rsync, which hopefully has a significantly smaller amount of data to 
actually move.

So this is what I have done on a number of occasions now. I create a 
very simple sqlite DB with a list of source and destination folders and 
a status code. Initially the status code is set to -1.
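
For illustration, a minimal sketch of what I mean (the DB file name, 
table layout and paths here are made up for the example, not my actual 
script):

  #!/usr/bin/perl
  # seed_jobs.pl -- build the list of directories to transfer
  use strict;
  use warnings;
  use DBI;

  # one shared DB file, e.g. somewhere on the destination file system
  my $dbh = DBI->connect("dbi:SQLite:dbname=/gpfs/migrate/jobs.db",
                         "", "", { RaiseError => 1, AutoCommit => 1 });

  $dbh->do(q{
      CREATE TABLE IF NOT EXISTS jobs (
          id     INTEGER PRIMARY KEY,
          src    TEXT NOT NULL,    -- source directory on the Isilon NFS mount
          dst    TEXT NOT NULL,    -- destination directory on Scale
          status INTEGER NOT NULL DEFAULT -1  -- -1 todo, -2 running, >=0 rsync exit code
      )
  });

  my $ins = $dbh->prepare("INSERT INTO jobs (src, dst, status) VALUES (?, ?, -1)");
  # one row per top-level folder; the paths here are invented
  for my $dir (qw(projects scratch home archive)) {
      $ins->execute("/mnt/isilon/$dir/", "/gpfs/fs0/$dir/");
  }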

Then I have a perl script which looks at the sqlite DB, picks a row with 
a status code of -1 and sets its status code to -2, i.e. that directory 
is in progress. It then runs the rsync, and when it finishes it updates 
the status code to the exit code of the rsync process.
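
A cut-down sketch of the worker (again illustrative rather than my 
exact code; adjust the rsync options to whatever metadata you need to 
carry across):

  #!/usr/bin/perl
  # worker.pl -- claim a directory, rsync it, record the exit code
  use strict;
  use warnings;
  use DBI;

  my $dbh = DBI->connect("dbi:SQLite:dbname=/gpfs/migrate/jobs.db",
                         "", "", { RaiseError => 1, AutoCommit => 1 });
  $dbh->sqlite_busy_timeout(30_000);  # wait rather than bail if the DB is locked

  while (1) {
      my ($id, $src, $dst) = $dbh->selectrow_array(
          "SELECT id, src, dst FROM jobs WHERE status = -1 LIMIT 1");
      last unless defined $id;        # nothing left to do

      # claim the row; if another worker beat us to it, zero rows change
      my $claimed = $dbh->do(
          "UPDATE jobs SET status = -2 WHERE id = ? AND status = -1",
          undef, $id);
      next unless $claimed == 1;

      system("rsync", "-aHAX", "--numeric-ids", $src, $dst);
      my $rc = $? >> 8;               # rsync exit code

      $dbh->do("UPDATE jobs SET status = ? WHERE id = ?", undef, $rc, $id);
  }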

As long as all the rsync processes have access to the same copy of the 
sqlite DB (simplest is to put it on either the source or the destination 
file system), all is good. You can fire off multiple rsyncs on multiple 
nodes and they will all keep churning away until there is no more work 
to be done.

The advantage is that you can easily interrogate the DB to find out the 
state of play: how many of your transfers have completed, how many are 
yet to be done, which ones are currently being transferred, etc., 
without logging onto multiple nodes.
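
For example, from the sqlite3 shell (using the same made-up DB path as 
in the sketches above):

  sqlite3 /gpfs/migrate/jobs.db \
      "SELECT status, COUNT(*) FROM jobs GROUP BY status ORDER BY status;"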

*MOST* importantly, you can see if any of the rsyncs had an error by 
simply looking for status codes greater than zero. I cannot stress 
enough how important this is. Note that if the source is still active 
you will see errors caused by files being deleted on the source file 
system before rsync has a chance to copy them. However, this has a 
specific exit code (24) so it is easy to spot and not worry about.
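
So the check for genuine failures is just something like:

  sqlite3 /gpfs/migrate/jobs.db \
      "SELECT id, src, status FROM jobs WHERE status > 0 AND status != 24;"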

Finally, it is also very simple to set the status codes back to -1 and 
kick the whole thing off again, which makes the final run easier to do.
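
The reset is a one-liner:

  sqlite3 /gpfs/migrate/jobs.db "UPDATE jobs SET status = -1;"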

If you want to mail me off list I can dig out a copy of the perl code I 
used, if you're interested. There are several versions as I have tended 
to tailor it to each transfer.


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


