[gpfsug-discuss] What is an independent fileset? was: mmbackup with fileset : scope errors

David D. Johnson david_johnson at brown.edu
Thu May 18 15:24:17 BST 2017


Here is one big reason independent filesets are problematic. From A5.13, Table 43, "Maximum number of filesets":

Version of GPFS          Maximum dependent filesets    Maximum independent filesets
IBM Spectrum Scale V4    10,000                        1,000
GPFS V3.5                10,000                        1,000
Another is that each independent fileset must be sized (and resized) for the number of inodes it is expected to contain.
If that runs out (due to growth or a runaway user job), new files cannot be created until the inode limit is bumped up.
This is true of the root namespace as well, but there’s only one number to watch per filesystem. 
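
As a rough illustration (untested sketch; the device name "fs0", fileset name "fset1" and the limit are placeholders):

# show each fileset, the inode space it belongs to, and -- for independent
# filesets -- the allocated and maximum number of inodes
mmlsfileset fs0 -L

# raise the inode limit of one independent fileset
mmchfileset fs0 fset1 --inode-limit 5000000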

 — ddj
Dave Johnson
Brown University

> On May 18, 2017, at 10:12 AM, Peter Childs <p.childs at qmul.ac.uk> wrote:
> 
> As I understand it,
> 
> mmbackup calls mmapplypolicy, so this applies to mmapplypolicy too.....
> 
> mmapplypolicy scans the metadata (inodes) as requested, depending on the query supplied.
> 
> You can ask mmapplypolicy to scan a fileset, inode space or filesystem.
> 
> If scanning a fileset, it scans the inode space that the fileset belongs to, looking for all files in that fileset. A smaller inode space means less to scan, so it's faster to use independent filesets: you get the list of what to process more quickly.
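> 
> For instance (a rough, untested sketch -- the device, paths and the trivial
> list policy are placeholders, and --scope on mmapplypolicy needs a reasonably
> recent release). With list.pol containing:
> 
>   RULE EXTERNAL LIST 'all' EXEC ''
>   RULE 'files' LIST 'all'
> 
> # scan only the inode space that this fileset belongs to
> mmapplypolicy /gpfs/fs0/myfileset -P list.pol --scope inodespace -I defer -f /tmp/scan
> 
> # scan the whole filesystem
> mmapplypolicy fs0 -P list.pol --scope filesystem -I defer -f /tmp/scan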
> 
> Another advantage is that once an inode is allocated you can't deallocate it; however, you can delete an independent fileset and thereby deallocate its inodes. So if you have a task which produces lots and lots of small files that are only needed for a short period of time, you can create a new independent fileset for them, work on them, and then blow them away afterwards.
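> 
> Roughly (untested sketch; names and the inode limit are placeholders):
> 
> mmcrfileset fs0 tmpwork --inode-space new --inode-limit 1000000
> mmlinkfileset fs0 tmpwork -J /gpfs/fs0/tmpwork
> # ... run the job that creates all the small files ...
> mmunlinkfileset fs0 tmpwork
> mmdelfileset fs0 tmpwork -f    # -f deletes the contents; removing the
>                                # independent fileset frees its inode space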
> 
> I like independent filesets. I'm guessing the only reason dependent filesets are used by default is history.....
> 
> 
> Peter
> 
> 
> On 18/05/17 14:58, Jaime Pinto wrote:
>> Thanks for the explanation Mark and Luis,
>> 
>> It begs the question: why are filesets created as dependent by default, if the adverse repercussions can be so great afterward? Even in my case, where I manage GPFS and TSM deployments (and I have been around for a while), I didn't realize at all that not adding an extra option at fileset creation time would cause me huge trouble with scaling later on when I try to use mmbackup.
>> 
>> When different groups manage the file systems and the backups, and they don't read each other's manuals ahead of time, you have a really bad recipe.
>> 
>> I'm looking forward to your explanation as to why mmbackup cares one way or another.
>> 
>> I'm also hoping for a hint as to how to configure backup exclusion rules on the TSM side to exclude fileset traversing on the GPFS side. Is mmbackup smart enough (actually smarter than the TSM client itself) to read the exclusion rules in the TSM configuration and apply them before traversing?
>> 
>> Thanks
>> Jaime
>> 
>> Quoting "Marc A Kaplan" <makaplan at us.ibm.com>:
>> 
>>> When I see "independent fileset" (in Spectrum/GPFS/Scale)  I always think
>>> and try to read that as "inode space".
>>> 
>>> An "independent fileset" has all the attributes of an (older-fashioned)
>>> dependent fileset PLUS all of its files are represented by inodes that are
>>> in a separable range of inode numbers - this allows GPFS to efficiently do
>>> snapshots of just that inode-space (uh... independent fileset)...
>>> 
>>> And... of course the files of dependent filesets must also be represented
>>> by inodes -- those inode numbers are within the inode-space of whatever
>>> the containing independent fileset is... as was chosen when you created
>>> the fileset....   If you didn't say otherwise, inodes come from the
>>> default "root" fileset....
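>>> 
>>> Illustration only -- the device and fileset names below are placeholders:
>>> 
>>>   mmcrfileset fs0 depfset --inode-space root   # dependent: inodes come from
>>>                                                # the root inode space
>>>   mmcrfileset fs0 indfset --inode-space new    # independent: its own inode space
>>>   mmlsfileset fs0 -L                           # shows which inode space each
>>>                                                # fileset's inodes come from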
>>> 
>>> Clear as your bath-water, no?
>>> 
>>> So why does mmbackup care one way or another ???   Stay tuned....
>>> 
>>> BTW - if you look at the bits of the inode numbers carefully --- you may
>>> not immediately discern what I mean by a "separable range of inode
>>> numbers" -- (very technical hint) you may need to permute the bit order
>>> before you discern a simple pattern...
>>> 
>>> 
>>> 
>>> From:   "Luis Bolinches" <luis.bolinches at fi.ibm.com>
>>> To:     gpfsug-discuss at spectrumscale.org
>>> Cc:     gpfsug-discuss at spectrumscale.org
>>> Date:   05/18/2017 02:10 AM
>>> Subject:        Re: [gpfsug-discuss] mmbackup with fileset : scope errors
>>> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
>>> 
>>> 
>>> 
>>> Hi
>>> 
>>> There is no direct way to convert a dependent fileset to an independent
>>> one, or vice versa.
>>> 
>>> I would suggest taking a look at chapter 5 of the 2014 Redbook; it has lots of
>>> definitions about GPFS ILM, including filesets:
>>> http://www.redbooks.ibm.com/abstracts/sg248254.html?Open It is not the only
>>> place where this is explained, but I honestly believe it is a good single
>>> starting point. It also needs an update, as it does not have anything on CES
>>> or ESS, so anyone on this list should feel free to give feedback on that page;
>>> people with funding decisions listen there.
>>> 
>>> So you are limited to either migrating the data from that fileset to a new
>>> independent fileset (there are multiple ways to do that) or using the TSM
>>> client config.
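>>> 
>>> As a very rough sketch of the migration path (untested; the new fileset
>>> name "sysadmin3i" and the inode limit are placeholders, and ACL/xattr/quota
>>> handling is left out):
>>> 
>>> mmcrfileset sgfs1 sysadmin3i --inode-space new --inode-limit 2000000
>>> mmlinkfileset sgfs1 sysadmin3i -J /gpfs/sgfs1/sysadmin3i
>>> rsync -a /gpfs/sgfs1/sysadmin3/ /gpfs/sgfs1/sysadmin3i/
>>> # once the copy is verified, retire the old dependent fileset and
>>> # relink the new one at the old junction path
>>> mmunlinkfileset sgfs1 sysadmin3
>>> mmdelfileset sgfs1 sysadmin3 -f
>>> mmunlinkfileset sgfs1 sysadmin3i
>>> mmlinkfileset sgfs1 sysadmin3i -J /gpfs/sgfs1/sysadmin3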
>>> 
>>> ----- Original message -----
>>> From: "Jaime Pinto" <pinto at scinet.utoronto.ca>
>>> Sent by: gpfsug-discuss-bounces at spectrumscale.org
>>> To: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>,
>>> "Jaime Pinto" <pinto at scinet.utoronto.ca>
>>> Cc:
>>> Subject: Re: [gpfsug-discuss] mmbackup with fileset : scope errors
>>> Date: Thu, May 18, 2017 4:43 AM
>>> 
>>> There is hope. See reference link below:
>>> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.1.1/com.ibm.spectrum.scale.v4r11.ins.doc/bl1ins_tsm_fsvsfset.htm 
>>> 
>>> 
>>> The issue has to do with dependent vs. independent filesets, something
>>> I didn't even realize existed until now. Our filesets are dependent
>>> (for no particular reason), so I have to find a way to turn them into
>>> independent ones.
>>> 
>>> The proper option syntax is "--scope inodespace", and the error
>>> message actually flagged that; however, I didn't know how to
>>> interpret what I saw:
>>> 
>>> 
>>> # mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
>>> --scope inodespace --tsm-errorlog $logfile -L 2
>>> --------------------------------------------------------
>>> mmbackup: Backup of /gpfs/sgfs1/sysadmin3 begins at Wed May 17
>>> 21:27:43 EDT 2017.
>>> --------------------------------------------------------
>>> Wed May 17 21:27:45 2017 mmbackup:mmbackup: Backing up *dependent*
>>> fileset sysadmin3 is not supported
>>> Wed May 17 21:27:45 2017 mmbackup:This fileset is not suitable for
>>> fileset level backup.  exit 1
>>> --------------------------------------------------------
>>> 
>>> Will post the outcome.
>>> Jaime
>>> 
>>> 
>>> 
>>> Quoting "Jaime Pinto" <pinto at scinet.utoronto.ca>:
>>> 
>>>> Quoting "Luis Bolinches" <luis.bolinches at fi.ibm.com>:
>>>> 
>>>>> Hi
>>>>> 
>>>>> have you tried adding exceptions in the TSM client config file?
>>>> 
>>>> Hey Luis,
>>>> 
>>>> That would work as well (mechanically), however it's not elegant or
>>>> efficient. When you have over 1PB and 200M files on scratch, it will
>>>> take many hours and several helper nodes to traverse that fileset, just
>>>> to be negated by TSM. In fact, exclusions on TSM are just as inefficient.
>>>> Considering that I want to keep project and sysadmin in different
>>>> domains, it's much worse, since we have to traverse and exclude
>>>> scratch & (project|sysadmin) twice, once to capture sysadmin and again
>>>> to capture project.
>>>> 
>>>> If I have to use exclusion rules, they have to rely solely on GPFS rules, and
>>>> somehow not traverse scratch at all.
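>>>> 
>>>> For reference, a GPFS policy exclude rule would look roughly like the
>>>> sketch below (the EXTERNAL LIST name is a placeholder; and note that an
>>>> EXCLUDE rule only filters the candidate list -- the metadata scan itself
>>>> still walks the whole inode space):
>>>> 
>>>> RULE EXTERNAL LIST 'tobackup' EXEC ''
>>>> RULE 'skip_scratch' EXCLUDE WHERE FILESET_NAME = 'scratch3'
>>>> RULE 'rest' LIST 'tobackup'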
>>>> 
>>>> I suspect there is a way to do this properly; however, the examples in
>>>> the GPFS guide and other references are not exhaustive. They only show
>>>> a couple of trivial cases.
>>>> 
>>>> However, my situation is not unique. I suspect there are many facilities
>>>> having to deal with backups of HUGE filesets.
>>>> 
>>>> So the search is on.
>>>> 
>>>> Thanks
>>>> Jaime
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> Assuming your GPFS dir is /IBM/GPFS and your fileset to exclude is
>>>>> linked at /IBM/GPFS/FSET1
>>>>> 
>>>>> dsm.sys
>>>>> ...
>>>>> 
>>>>> DOMAIN /IBM/GPFS
>>>>> EXCLUDE.DIR /IBM/GPFS/FSET1
>>>>> 
>>>>> 
>>>>> From:   "Jaime Pinto" <pinto at scinet.utoronto.ca>
>>>>> To:     "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
>>>>> Date:   17-05-17 23:44
>>>>> Subject:        [gpfsug-discuss] mmbackup with fileset : scope errors
>>>>> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
>>>>> 
>>>>> 
>>>>> 
>>>>> I have a g200 /gpfs/sgfs1 filesystem with 3 filesets:
>>>>> * project3
>>>>> * scratch3
>>>>> * sysadmin3
>>>>> 
>>>>> I have no problems mmbacking up /gpfs/sgfs1 (or sgfs1), however we
>>>>> have no need or space to include *scratch3* on TSM.
>>>>> 
>>>>> Question: how to craft the mmbackup command to backup
>>>>> /gpfs/sgfs1/project3 and/or /gpfs/sgfs1/sysadmin3 only?
>>>>> 
>>>>> Below are 3 types of errors:
>>>>> 
>>>>> 1) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
>>>>> --tsm-errorlog $logfile -L 2
>>>>> 
>>>>> ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem
>>>>> cannot be specified at the same time.
>>>>> 
>>>>> 2) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
>>>>> --scope inodespace --tsm-errorlog $logfile -L 2
>>>>> 
>>>>> ERROR: Wed May 17 16:27:11 2017 mmbackup:mmbackup: Backing up
>>>>> dependent fileset sysadmin3 is not supported
>>>>> Wed May 17 16:27:11 2017 mmbackup:This fileset is not suitable for
>>>>> fileset level backup.  exit 1
>>>>> 
>>>>> 3) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
>>>>> --scope filesystem --tsm-errorlog $logfile -L 2
>>>>> 
>>>>> ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem
>>>>> cannot be specified at the same time.
>>>>> 
>>>>> These examples don't really cover my case:
>>>>> 
>>>>> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adm_mmbackup.htm#mmbackup__mmbackup_examples
>>>>>
>>>>> 
>>>>> 
>>>>> Thanks
>>>>> Jaime
>>>>> 
>>>>> 
