[gpfsug-discuss] ILM and Backup Question

Hughes, Doug Douglas.Hughes at DEShawResearch.com
Mon Oct 26 13:42:47 GMT 2015


We have all of our GPFS metadata on FlashCache devices (nee Ramsan) and that helps a lot. We also have our data going into monotonically increasing buckets of about 30TB that we call lockers (e.g. locker100, locker101, locker102), with 1 primary locker active at a time.

We have an hourly job that scans the 2 most recent lockers (it takes about 45 seconds each) using the ILM 'LIST' policy to generate a list of all files that have been created or modified in the last hour. That list of file names then trickles to a custom backup daemon that has up to 10 threads for rsyncing the files over to our HSM server (running GPFS/TSM space management). From there things automatically get backed up and archived. Not all hourlies are necessarily complete (we can't guarantee that nobody is still hanging on to $lockernum-2, for instance), so we also have a daily job that scans the entire 3PB to find anything created or updated in the last 24 hours and rsyncs that. Duplicating work the hourlies already did does no harm, because rsync takes care of files that already exist on the destination. The daily job takes about 45 minutes. Needless to say, it would be impossible without metadata on a fast flash device.
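
For anyone wanting to try something similar, here is a rough sketch of the fan-out step only: take the list of changed files and spread it over up to 10 rsync workers. It is not our actual daemon. The function names, the round-robin split, and the source and destination paths are all made up for illustration; the only things taken from the description above are the worker count and the use of rsync. It assumes the paths in the list are relative to the source root.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10  # the description above mentions up to 10 rsync threads


def chunk_paths(paths, n_chunks):
    """Split paths round-robin into at most n_chunks non-empty lists."""
    chunks = [paths[i::n_chunks] for i in range(n_chunks)]
    return [c for c in chunks if c]


def build_rsync_cmd(src_root, dest, list_file="-"):
    """Build an rsync command that reads paths (relative to src_root)
    from list_file; '-' tells rsync to read the list from stdin."""
    return ["rsync", "-a", f"--files-from={list_file}", src_root, dest]


def rsync_chunk(chunk, src_root, dest, runner=subprocess.run):
    """Run one rsync for one chunk, feeding the path list on stdin.
    runner is injectable so the logic can be exercised without rsync."""
    cmd = build_rsync_cmd(src_root, dest)
    return runner(cmd, input="\n".join(chunk) + "\n", text=True, check=False)


def backup(paths, src_root, dest, runner=subprocess.run):
    """Fan the changed-file list out over up to MAX_WORKERS rsyncs."""
    chunks = chunk_paths(paths, MAX_WORKERS)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(
            pool.map(lambda c: rsync_chunk(c, src_root, dest, runner), chunks)
        )


if __name__ == "__main__":
    # Hypothetical locker path and HSM destination, for illustration only.
    changed = ["projA/run1.dat", "projA/run2.dat", "projB/notes.txt"]
    results = backup(changed, "/gpfs/locker102/", "hsm:/archive/locker102/")
    for r in results:
        print(r.returncode)
```

Because rsync is safe to repeat, the same function could serve both an hourly and a daily list; a path that shows up in both is simply checked again.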




-----Original Message-----
From: "Kallback-Rose, Kristy A" <kallbac at iu.edu>
To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Sent: Sun, 25 Oct 2015 22:39
Subject: [gpfsug-discuss] ILM and Backup Question

Simon wrote recently in the GPFS UG Blog: "We also got into discussion on backup and ILM, and I think it's amazing how everyone does these things in their own slightly different way. I think this might be an interesting area for discussion over on the group mailing list. There's a lot of options and different ways to do things!"

Yes, please! I’m *very* interested in what others are doing.

We (IU) are currently doing a POC with GHI for DR backups (GHI = GPFS HPSS Integration; we have had HPSS for a very long time), but I’m interested in what others are doing with either ILM or other methods to brew their own backup solutions, how much they are backing up and how often, what resources it takes, etc.

If you have anything going on at your site that’s relevant, can you please share?

Thanks,
Kristy

Kristy Kallback-Rose
Manager, Research Storage
Indiana University