[gpfsug-discuss] Backing up GPFS with Rsync
Jonathan Buzzard
jonathan.buzzard at strath.ac.uk
Wed Mar 10 15:15:58 GMT 2021
On 10/03/2021 02:59, Alec wrote:
> You would definitely be able to search by inode creation date and find
> the files you want... our 1.25m file filesystem takes about 47 seconds
> to query... One thing I would worry about though is inode deletion and
> inter-fileset file moves. The SQL based engine wouldn't be able to
> identify those changes and so you'd not be able to replicate deletes and
> such.
>
This is the problem with rsync "backups": you need to run it with
--delete, otherwise any restore will "upset" your users as they find
large numbers of files they had deleted unhelpfully "restored".
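
As a rough illustration of the point about --delete, here is a minimal
Python sketch that assembles such an rsync invocation. The paths and the
function name are hypothetical, and it defaults to --dry-run so nothing
is removed until the output has been checked.

```python
import shlex


def build_rsync_mirror_cmd(src, dest, dry_run=True):
    """Build an rsync argument list that mirrors src into dest.

    --delete removes files from dest that no longer exist in src, so a
    later restore does not bring back files users had deleted.
    """
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        # Report what would be transferred or deleted without doing it.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    cmd += [src.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(shlex.join(build_rsync_mirror_cmd("/gpfs/fs0", "/backup/fs0")))
```

The command is only printed here; running it (for example via
subprocess.run) is left to the reader once the dry run looks right.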
> Alternatively....
> I have a script that runs in about 4 minutes and it pulls all the data
> out of the backup indexes, and compares the pre-built hourly file index
> on our system and identifies files that don't exist in the backup, so I
> have a daily backup validation... I filter the file list using
> ksh's printf date manipulation to filter out files that are less than 2
> days old, to reduce the noise. A modification to this could simply
> compare a daily file index with the previous day's index, and send rsync
> a list of files (existing or deleted) based on just a delta of the two
> indexes (sort|diff), then you could properly account for all the
> changes. If you don't care about file modifications just produce both
> lists based on creation time instead of modification time. The mmfind
> command or GPFS policy engine should be able to produce a full file
> list/index very rapidly.
>
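
The index comparison described above could look something like the
following Python sketch. The index format (one "timestamp<TAB>path" line
per file) and the function names are assumptions for illustration, not
the format mmfind or the policy engine actually emits.

```python
def load_index(lines):
    """Parse 'timestamp<TAB>path' lines into a {path: timestamp} dict.

    The line format is an assumption; adapt it to whatever the file
    index really contains.
    """
    index = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        stamp, path = line.split("\t", 1)
        index[path] = stamp
    return index


def diff_indexes(previous, current):
    """Return (to_copy, to_delete) between two {path: timestamp} indexes.

    to_copy:   paths that are new, or whose timestamp changed.
    to_delete: paths present yesterday but missing today.
    """
    to_copy = sorted(p for p, t in current.items() if previous.get(p) != t)
    to_delete = sorted(p for p in previous if p not in current)
    return to_copy, to_delete


if __name__ == "__main__":
    yesterday = load_index(["100\ta\n", "100\tb\n", "100\tc\n"])
    today = load_index(["100\ta\n", "200\tb\n", "300\td\n"])
    print(diff_indexes(yesterday, today))
```

The to_copy list could be handed to rsync with --files-from, while the
to_delete list has to be removed on the target separately. A file moved
between filesets shows up here as one delete plus one create.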
My view is that this is a lot of work. If you have the space to rsync
your GPFS file system to, presumably with a server attached to said
storage, then for under 500 PVU of Spectrum Protect licensing you can
have a fully supported client/server Spectrum Protect/TSM backup
solution and just use mmbackup.
You need to play the game and use older hardware ;-) I use an ancient
pimped out Dell PowerEdge R300 as my TSM client node. Why this old? Well,
it has a dual core Xeon E3113 for only 100 PVU in total. Anything newer
would be quad core at 70 PVU per core, which would cost an additional
~$1000 in licensing.
If it breaks down they are under $100 on eBay. It's never skipped a beat
and I have just finished a complete planned restore of our DSS-G using it.
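
For anyone wanting to check the PVU arithmetic, a small Python sketch
using only the figures quoted above (100 PVU for the dual core box, 70
PVU per core on a newer quad core). The function names are made up for
illustration, and no price per PVU is assumed.

```python
def total_pvu(cores, pvu_per_core):
    """Total PVU for a single-socket client node."""
    return cores * pvu_per_core


def extra_pvu(old_total, new_cores, new_pvu_per_core):
    """Additional PVU a newer node would need compared with the old one."""
    return total_pvu(new_cores, new_pvu_per_core) - old_total


if __name__ == "__main__":
    # Figures from the post: old node is 100 PVU in total,
    # a newer quad core is 70 PVU per core.
    print(total_pvu(4, 70))
    print(extra_pvu(100, 4, 70))
```

So the newer quad core would need 280 PVU, 180 more than the R300, and
that difference is what the ~$1000 figure refers to.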
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG