[gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection" - how about a Billion files in 140 seconds?

Marc A Kaplan makaplan at us.ibm.com
Wed Aug 31 19:10:07 BST 2016


When you write something like "mmbackup takes ages" - that lets us know
how you feel, kinda.

But we need some facts and data to determine whether there is a real
problem, and whether and how it might be improved.

Just to do a "back of the envelope" estimate of how long backup
operations "ought to" take, we'd need to know:

- how many disks and/or SSDs, with what performance characteristics,
- how many nodes, with what performance characteristics,
- the network "fabric(s)",
- the number of files to be scanned,
- the average number of files per directory,
- the GPFS blocksize(s) configured,
- the backup devices available, with their speeds and feeds, etc.
  (a toy version of the arithmetic is sketched below).
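As a toy illustration of that kind of estimate - every number here is
invented, none of it is measured data - suppose you had to back up 500
million files in a 4-hour window. The sustained rate you would need is
easy to bound:

# Hypothetical sizing check; all numbers are made up for illustration.
# 500 million files in a 4-hour (14400-second) backup window:
echo "500000000 14400" | awk '{printf "%d files/sec sustained\n", int($1/$2)}'
# -> 34722 files/sec sustained

Whether a number like that is comfortable or hopeless depends on
exactly the disk, node, network, and backup-device figures above.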

But anyway, just to throw some ballpark numbers "out there" and give
you an idea of what is possible:

I can tell you that about 20 months ago Sven and I benchmarked
mmapplypolicy scanning 983 million files in 136 seconds! That works
out to roughly 7.2 million files scanned per second.

The command looked like this:

mmapplypolicy /ibm/fs2-1m-p01/shared/Btt -g /ibm/fs2-1m-p01/tmp \
    -d 7 -A 256 -a 32 -n 8 -P /ghome/makaplan/sventests/milli.policy \
    -I test -L 1 -N fastclients
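The contents of milli.policy aren't shown here, but for a pure scan
test the policy can be minimal. Something of roughly this shape (a
hypothetical sketch, not the actual file) is enough to make the scan
visit every file; with -I test, candidate files are identified but no
actions are executed:

# Hypothetical stand-in for milli.policy - NOT the file used in the test.
cat > /tmp/milli.policy <<'EOF'
/* Match every file; the empty EXEC means no external program runs. */
RULE EXTERNAL LIST 'allfiles' EXEC ''
RULE 'scan-everything' LIST 'allfiles'
EOF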

fastclients was 10 x86_64 commodity nodes.

The fs2-1m-p01 file system was hosted on just two IBM GSS nodes, and
everything was connected by an InfiniBand switch.

We packed about 7000 files into each directory (which, admittedly, may
not be typical).
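At that packing, the 983 million files lived in only about 140
thousand directories, which presumably kept the directory-walk phase
of the scan cheap:

# 983M files at ~7000 files per directory:
echo "983000000 7000" | awk '{printf "%d directories\n", int($1/$2)}'
# -> 140428 directories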

This is NOT to say you could back up that many files that fast, but 
Spectrum Scale metadata scanning can be fast, even 
with relatively modest hardware resources.

YMMV ;-)

Marc of GPFS
