[gpfsug-discuss] GPFS Independent Fileset Limit

Jake Carroll jake.carroll at uq.edu.au
Sat Aug 11 03:18:28 BST 2018


Just to chime in on this...

We have experienced a lot of problems as a result of the independent fileset limit of 1000. We have a very large campus-wide deployment that relies upon filesets for collection management of large (and small) scientific data outputs. Every human who uses our GPFS AFM fabric gets a "collection", which is an independent fileset. Some may say this was an unwise design choice - but it was deliberate and related to security, namespace and inode isolation. It was a considered decision. Just not considered _enough_ given the 1000 fileset limit ;).
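
A minimal sketch of what one such "collection" amounts to on the command line (the filesystem name, fileset name, junction path, AFM target and AFM mode below are purely illustrative, not our production values):

    # One independent fileset per user, with its own inode space, acting as
    # an AFM cache for that user's "collection" (target and mode invented here).
    mmcrfileset fs0 jbloggs_collection --inode-space new --inode-limit 10000000 \
        -p afmTarget=nfs://home-cluster/gpfs/homefs/jbloggs_collection \
        -p afmMode=independent-writer

    # Link it into the namespace so it appears as a normal directory.
    mmlinkfileset fs0 jbloggs_collection -J /gpfs/fs0/collections/jbloggs

Multiply that by every user on campus and you hit 1000 very quickly.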

We've even had to go as far as re-organising entire filesystems (splitting things apart), sacrificing performance (fewer spindles under the filesets on a given filesystem) to work around it - and sometimes spilling into entirely new arrays.

I've had it explained to me by internal IBM staff *why* it is hard to fix the fileset limits - and it isn't as straightforward as people think, especially in our case where each fileset is an AFM cache/home relationship - but we desperately need a solution. We logged an RFE. Hopefully others will, too.

The complexity was explained to me by a very good colleague inside IBM who has helped us a great deal (name withheld to protect the innocent) as a knock-on effect of the computational overhead and expense of things _associated_ with independent filesets, like recursing a snapshot tree. So it really isn't as simple as things appear on the surface - but that doesn't mean we shouldn't try to fix it, I suppose!

We'd love to see this improved, too - as it's currently making things difficult.

Happy to collaborate and work together on this, as always.

-jc

----------------------------------------------------------------------

Message: 1
Date: Fri, 10 Aug 2018 11:22:23 -0400
From: Doug Johnson <djohnson at osc.edu>

Hi all,

I want to chime in because this is precisely what we have done at OSC due to the same motivations Janell described.  Our design was based in part on the guidelines in the "Petascale Data Protection" white paper from IBM.  We only have ~200 filesets and 250M inodes today, but expect to grow.
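
For reference, a rough sketch of that per-project independent fileset pattern (the filesystem name, fileset name, quota values and snapshot name below are invented for illustration, not our actual configuration):

    # Enable quota enforcement, plus user/group quotas inside filesets.
    mmchfs fs0 -Q yes
    mmchfs fs0 --perfileset-quota

    # One independent fileset (own inode space) per project, linked under the root.
    mmcrfileset fs0 proj_geo01 --inode-space new --inode-limit 5000000
    mmlinkfileset fs0 proj_geo01 -J /gpfs/fs0/proj_geo01

    # Fileset quota: everything under the junction counts, regardless of owner.
    mmsetquota fs0:proj_geo01 --block 50T:55T --files 4000000:5000000

    # Fileset-level snapshot of just this independent fileset.
    mmcrsnapshot fs0 daily_20180810 -j proj_geo01

    # Back up only this inode space; separate runs per independent fileset
    # can be scheduled side by side, which is where the parallelism comes from.
    mmbackup /gpfs/fs0/proj_geo01 -t incremental --scope inodespace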

We are also very interested in details about performance issues and independent filesets.  Can IBM elaborate?

Best,
Doug


Martin Lischewski <m.lischewski at fz-juelich.de> writes:

> Hello Olaf, hello Marc,
>
> we in Jülich are in the middle of migrating/copying all our old
> filesystems, which were created with filesystem
> version 13.23 (3.5.0.7), to new filesystems created with GPFS 5.0.1.
>
> We move to new filesystems mainly for two reasons: 1. We want to use the new increased number of subblocks.
> 2. We have to change our quota from normal "group-quota per filesystem" to "fileset-quota".
>
> The idea is to create a separate fileset for each group/project. For
> the users the quota computation should be much more transparent: from
> now on, all data stored inside their directory (fileset) counts towards their quota, independent of ownership.
>
> Right now we have roughly 900 groups, which means we will create roughly 900 filesets per filesystem.
> In one filesystem we will have about 400 million inodes (and rising).
>
> We will back up this filesystem with "mmbackup", so we talked with
> Dominic Mueller-Wicke and he recommended that we use independent
> filesets, because then the policy runs can be parallelized and we can increase the backup performance. We believe we require these parallelized policy runs to meet our backup performance targets.
>
> But there are even more features we enable by using independent
> filesets, e.g. "fileset-level snapshots" and "user and group quotas inside of a fileset".
>
> I did not know about performance issues regarding independent 
> filesets... Can you give us some more information about this?
>
> All in all we are strongly supporting the idea of increasing this limit.
>
> Do I understand correctly that by opening a PMR, IBM can allow this limit to be raised
> at specific sites? I would rather see the limit increased and made officially, publicly available and supported.
>
> Regards,
>
> Martin
>
> On 10.08.2018 at 14:51, Olaf Weiser wrote:
>
>  Hello Stephan,
>  the limit is not a hard-coded limit - technically speaking, you can raise it easily.
>  But as always, it is a question of test 'n support ..
>
>  I've seen customer cases where the use of a much smaller number of independent filesets generated a lot of performance issues, hangs ... or at least noise and partial trouble ..
>  It might not be the case with your specific workload, given that you're already running close to 1000 ...
>
>  I suspect this number of 1000 filesets - at the time it was introduced - was simply a matter of having to pick a number...
>
>  ... it turns out that a general commitment to support > 1000 independent filesets is more or less hard .. because which use cases should we test / support?
>  I think there might be a good chance for you that, for your specific workload, more than 1000 would be allowed and supported.
>
>  Do you still have a PMR open for your site on this? If not - I know opening PMRs is additional effort - but could you please ..
>  then we can decide .. if raising the limit is an option for you ..
>
>  Mit freundlichen Grüßen / Kind regards
>
>  Olaf Weiser
>
>  EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform,
>  
> -------------------------------------------------------------------------------------------------------------------------------------------
>  IBM Deutschland
>  IBM Allee 1
>  71139 Ehningen
>  Phone: +49-170-579-44-66
>  E-Mail: olaf.weiser at de.ibm.com
>  
> -------------------------------------------------------------------------------------------------------------------------------------------
>  IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
>  Geschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
>  Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
>
>  From: "Peinkofer, Stephan" <Stephan.Peinkofer at lrz.de>
>  To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>  Cc: Doris Franke <doris.franke at de.ibm.com>, Uwe Tron 
> <utron at lenovo.com>, Dorian Krause  <d.krause at fz-juelich.de>
>  Date: 08/10/2018 01:29 PM
>  Subject: [gpfsug-discuss] GPFS Independent Fileset Limit
>  Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ---------------------------------------------------------------------------------------------------
>
>  Dear IBM and GPFS List,
>
>  we at the Leibniz Supercomputing Centre and our GCS Partners from the
>  Jülich Supercomputing Centre will soon be hitting the current Independent Fileset Limit of 1000 on a number of our GPFS Filesystems.
>
>  There are also a number of RFEs open from other users that target this limitation:
>  https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=56780
>  https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=120534
>  https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=106530
>  https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=85282
>
>  I know GPFS Development was very busy fulfilling the CORAL 
> requirements but maybe now there is again  some time to improve something else.
>
>  If there are any other users on the list that are approaching the 
> current limitation in independent filesets,  please take some time and vote for the RFEs above.
>
>  Many thanks in advance and have a nice weekend.
>  Best Regards,
>  Stephan Peinkofer
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


------------------------------

Message: 2
Date: Fri, 10 Aug 2018 16:01:17 +0000
From: Bryan Banister <bbanister at jumptrading.com>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit
Message-ID: <01780289b9e14e599f848f78b33998d8 at jumptrading.com>
Content-Type: text/plain; charset="iso-8859-1"

Just as a follow-up to my own note: Stephan already provided a list of existing RFEs to vote for through the IBM RFE site. Cheers, -Bryan

From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Bryan Banister
Sent: Friday, August 10, 2018 10:51 AM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit

Note: External Email
________________________________
This is definitely a great candidate for a RFE, if one does not already exist.

Not to try and contradict my friend Olaf here, but I have been talking a lot with those internal to IBM, and the PMR process is for finding and correcting operational problems with the code level you are running, and closing out the PMR as quickly as possible.  PMRs are not the vehicle for getting substantive changes and enhancements made to the product in general; the RFE process is really the main way to do that.

I just got off a call with Kristie and Carl about the RFE process, and those on the list may know that we are working to improve this overall process.  More will be sent out about this in the near future!!  So I thought I would chime in on this discussion to help us all understand how important the RFE process (admittedly not currently great) really is, and that it will be a great way to work together on these common goals and needs for the product we rely so heavily upon!

Cheers!!
-Bryan

From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Peinkofer, Stephan
Sent: Friday, August 10, 2018 10:40 AM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit

Note: External Email
________________________________

Dear Olaf,



I know that this is "just" a "support" limit. However, Sven told me one day at a UG meeting in Ehningen that there is more to this than just adjusting your QA qualification tests, since the way it is implemented today does not really scale ;).

That's probably the reason why you said you sometimes see problems even when you are not close to the limit.



So if you look at the 250PB Alpine file system of Summit today, that is what is going to be deployed at more than one site worldwide in 2-4 years, and imho independent filesets are a great way to make these large systems much more manageable while still maintaining a unified namespace.

So I really think it would be beneficial if the architectural limit that prevents scaling the number of independent filesets could be removed altogether.


Best Regards,
Stephan Peinkofer
________________________________
From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Olaf Weiser <olaf.weiser at de.ibm.com>
Sent: Friday, August 10, 2018 2:51 PM
To: gpfsug main discussion list
Cc: gpfsug-discuss-bounces at spectrumscale.org; Doris Franke; Uwe Tron; Dorian Krause
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit

Hello Stephan,
the limit is not a hard-coded limit - technically speaking, you can raise it easily.
But as always, it is a question of test 'n support ..

I've seen customer cases where the use of a much smaller number of independent filesets generated a lot of performance issues, hangs ... or at least noise and partial trouble ..
It might not be the case with your specific workload, given that you're already running close to 1000 ...

I suspect this number of 1000 filesets - at the time it was introduced - was simply a matter of having to pick a number...

... it turns out that a general commitment to support > 1000 independent filesets is more or less hard .. because which use cases should we test / support? I think there might be a good chance for you that, for your specific workload, more than 1000 would be allowed and supported.

Do you still have a PMR open for your site on this? If not - I know opening PMRs is additional effort - but could you please ..
then we can decide .. if raising the limit is an option for you ..





Mit freundlichen Grüßen / Kind regards


Olaf Weiser

EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform,
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
IBM Allee 1
71139 Ehningen
Phone: +49-170-579-44-66
E-Mail: olaf.weiser at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland GmbH / Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Martina Koederitz (Vorsitzende), Susanne Peter, Norbert Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940



From:        "Peinkofer, Stephan" <Stephan.Peinkofer at lrz.de>
To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Cc:        Doris Franke <doris.franke at de.ibm.com>, Uwe Tron <utron at lenovo.com>, Dorian Krause <d.krause at fz-juelich.de>
Date:        08/10/2018 01:29 PM
Subject:        [gpfsug-discuss] GPFS Independent Fileset Limit
Sent by:        gpfsug-discuss-bounces at spectrumscale.org
________________________________



Dear IBM and GPFS List,

we at the Leibniz Supercomputing Centre and our GCS Partners from the Jülich Supercomputing Centre will soon be hitting the current Independent Fileset Limit of 1000 on a number of our GPFS Filesystems.

There are also a number of RFEs open from other users that target this limitation:
https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=56780
https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=120534
https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=106530
https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=85282

I know GPFS Development was very busy fulfilling the CORAL requirements but maybe now there is again some time to improve something else.

If there are any other users on the list that are approaching the current limitation in independent filesets, please take some time and vote for the RFEs above.

Many thanks in advance and have a nice weekend.
Best Regards,
Stephan Peinkofer

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



------------------------------

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


End of gpfsug-discuss Digest, Vol 79, Issue 29
**********************************************



