[gpfsug-discuss] *New* IBM Spectrum Protect Whitepaper "Petascale Data Protection"

Sven Oehme oehmes at gmail.com
Wed Aug 31 00:24:59 BST 2016


So let's start with some simple questions.

When you say mmbackup takes ages, what version of the GPFS code are you running?
How do you execute the mmbackup command? The exact parameters would be useful.
What hardware are you using for the metadata disks?
How much capacity (df -h) and how many inodes (df -i) do you have in the
file system you are trying to back up?
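
A minimal sketch of how the details asked for here could be gathered in one go. The function name collect_fs_info and the default mount point are assumptions for illustration only; mmdiag is called only if it is found in PATH, since it normally lives in the GPFS bin directory, which may not be on your PATH.

```shell
#!/bin/sh
# Sketch: print GPFS build info plus capacity and inode usage for one
# file system. collect_fs_info is a hypothetical helper name, not part
# of GPFS; pass the mount point of the file system as the first argument.
collect_fs_info() {
    fs_mount="${1:-/}"

    echo "== GPFS version =="
    # mmdiag exists only on nodes with GPFS installed; skip it otherwise.
    if command -v mmdiag >/dev/null 2>&1; then
        mmdiag --version
    else
        echo "mmdiag not found in PATH"
    fi

    echo "== capacity (df -h) =="
    df -h "$fs_mount"

    echo "== inodes (df -i) =="
    df -i "$fs_mount"
}
```

It could be called as `collect_fs_info /gpfs/fs1`, where /gpfs/fs1 is a placeholder mount point. The exact mmbackup command line still has to be added by hand.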

sven


On Tue, Aug 30, 2016 at 3:02 PM, Lukas Hejtmanek <xhejtman at ics.muni.cz>
wrote:

> Hello,
>
> On Mon, Aug 29, 2016 at 09:20:46AM +0200, Frank Kraemer wrote:
> > Find the paper here:
> >
> > https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection
>
> thank you for the paper, I appreciate it.
>
> However, I wonder whether it could be extended a little. Since it is titled
> Petascale Data Protection, I think that at petascale you have to deal with
> millions (or rather hundreds of millions) of files, and this is where TSM
> does not scale well.
>
> Could you give some hints:
>
> On the backup side:
> mmbackup takes ages for:
> a) scan (try to scan 500M files, even in parallel)
> b) backup - what if 10 % of the files get changed? The backup process can be
> blocked for several days, as mmbackup cannot run in several instances on the
> same file system, so you have to wait until one run of mmbackup finishes.
> How long could it take at petascale?
>
> On the restore side:
> How can I restore, e.g., 40 million files efficiently? dsmc restore '/path/*'
> runs into serious trouble after about 20M files (maybe unsuitable internal
> structures are used); beyond that point, scanning 1000 more files takes
> several minutes, so the dsmc restore never reaches the 40M files.
>
> Using filelists, the situation is even worse. I ran dsmc restore -filelist
> with a filelist of 2.4M files. It ran for *two* days without restoring
> even a single file, with dsmc consuming 100 % CPU.
>
> So any hints addressing these issues with really large numbers of files
> would be even more appreciated.
>
> --
> Lukáš Hejtmánek