[gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Mon Nov 16 23:12:47 GMT 2020


On 16/11/2020 21:58, Skylar Thompson wrote:
> When we did a similar (though larger, at ~2.5PB) migration, we used rsync
> as well, but ran one rsync process per Isilon node, and made sure the NFS
> clients were hitting separate Isilon nodes for their reads. We also didn't
> have more than one rsync process running per client, as the Linux NFS
> client (at least in CentOS 6) was terrible when it came to concurrent access.
> 

The million dollar question IMHO is the number of files and their sizes.

Basically, if you have a million 1KB files to move, it is going to take 
much longer than 100 1GB files. That is, the overhead of dealing with 
each file is a real bitch and kills your attainable transfer speed stone 
dead.
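
To put rough numbers on that, here is a back-of-envelope sketch in Python. 
The 5 ms per-file overhead and the 1 GB/s link speed are made-up 
assumptions for illustration, not measurements from any real Isilon or 
GPFS system, and the function name is hypothetical.

```python
def estimate_transfer_seconds(num_files, file_size_bytes,
                              per_file_overhead_s=0.005,
                              bandwidth_bytes_per_s=1e9):
    """Crude model: every file pays a fixed overhead plus its streaming time.

    Both defaults are assumed values, chosen only to illustrate the effect.
    """
    per_file_time = per_file_overhead_s + file_size_bytes / bandwidth_bytes_per_s
    return num_files * per_file_time


# A million 1KB files (about 1GB in total) versus 100 1GB files (about 100GB).
small = estimate_transfer_seconds(1_000_000, 1024)
large = estimate_transfer_seconds(100, 1024 ** 3)

print(f"million 1KB files: {small:.0f} s")
print(f"100 1GB files:     {large:.0f} s")
```

Under those assumptions the small-file case is dominated almost entirely 
by the per-file term, even though it moves about a hundredth of the data.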

One option I have used in the past is to take your last backup and 
restore it to the new system, then rsync in the changes. That way you 
don't impact the source file system, which is live, any more than you 
have to.
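
As a sketch of the "rsync in the changes" step, the Python below only 
builds the command line and does not run anything. The mount points are 
hypothetical, and the choice of -a, -H and --delete is my assumption 
about what a catch-up pass would want, so check it against your own 
requirements (and do a dry run first) before pointing it at real data.

```python
def build_rsync_command(src, dst, dry_run=True):
    """Build an rsync command for a catch-up pass after a restore from backup.

    -a        archive mode (recursive, preserves times, permissions, owners)
    -H        preserve hard links
    --delete  remove files on the destination that are gone from the source
    """
    cmd = ["rsync", "-a", "-H", "--delete"]
    if dry_run:
        # Report what would change without touching the destination.
        cmd.append("--dry-run")
    cmd += [src, dst]
    return cmd


# Hypothetical paths: the old system mounted over NFS, the new one local.
src = "/mnt/isilon/projects/"
dst = "/gpfs/projects/"
cmd = build_rsync_command(src, dst)
print(" ".join(cmd))
```

The trailing slash on the source matters to rsync: it copies the contents 
of the directory rather than the directory itself.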

Another option I have used is to inform users in advance that data will 
be transferred in an order based on a metric of how many files and how 
much data they have. So the less data and the fewer files you have, the 
quicker you will get access to the new system once access to the old 
system is turned off.
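
One way such a metric could be computed is sketched below in Python. The 
cost formula, the 5 ms per-file overhead, the 1 GB/s bandwidth and the 
user names and figures are all hypothetical; this is an illustration of 
the idea and not necessarily the metric that was actually used.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    user: str
    num_files: int
    total_bytes: int


def migration_order(usages, per_file_overhead_s=0.005,
                    bandwidth_bytes_per_s=1e9):
    """Sort users by estimated transfer cost, cheapest first.

    The cost is an assumed model: a fixed overhead per file plus the time
    to stream the bytes. Both constants are illustrative guesses.
    """
    def cost(u):
        return (u.num_files * per_file_overhead_s
                + u.total_bytes / bandwidth_bytes_per_s)

    return sorted(usages, key=cost)


# Made-up example users.
usages = [
    Usage("alice", 17_000_000, 2 * 10**12),
    Usage("bob", 11_000, 5 * 10**11),
    Usage("carol", 200_000, 10**10),
]
order = [u.user for u in migration_order(usages)]
print(order)
```

Note that with these numbers a user with little data but a lot of files 
still ends up behind a user with far more data in far fewer files.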

It is amazing how much junk users clear up under this scenario. Last 
time I did this a single user went from over 17 million files to 11 
thousand! In total many, many TB of data just vanished from the system 
(around half of the data went poof) as users actually got around to some 
housekeeping LOL. Moving less data and fewer files is always less painful.

> Whatever method you end up using, I can guarantee you will be much happier
> once you are on GPFS. :)
> 
Goes without saying :-)


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


