[gpfsug-discuss] SS 5.0.x and quota issues

Ryan Novosielski novosirj at rutgers.edu
Mon May 18 18:34:06 BST 2020


Is there a simple way to tell if we’re currently being affected by this bug? 

We run 5.0.4-1 on the client side and 5.0.3-2.3 on the server side (DSS-G 2.4b I believe it is).

--
____
|| \\UTGERS,  	 |---------------------------*O*---------------------------
||_// the State	 |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ	 | Office of Advanced Research Computing - MSB C630, Newark
     `'

> On May 18, 2020, at 12:58 PM, Felipe Knop <knop at us.ibm.com> wrote:
> 
> All,
>  
> Likely not an answer to these situations, but a fix was introduced in 5.0.4.4 (APAR IJ22894) and 4.2.3.22 (APAR IJ24661) to address a long-standing problem where mmcheckquota would compute quotas incorrectly in file systems where the metadata pool and data pool subblock sizes differ.
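>  
> If you want a quick way to tell whether a given file system matches that condition, one heuristic (my reading of the APAR text, not an official diagnostic; substitute your own device name for "gpfs01") is to compare the per-pool sizes reported by mmlsfs:
>  
>     mmlsfs gpfs01 -B     # block size; recent 5.x releases list the
>                          # system pool and other pools separately
>     mmlsfs gpfs01 -f     # minimum fragment (subblock) size, likewise
>  
> If the system (metadata) pool and the data pools report different subblock sizes, the file system falls into the case the APARs address.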
>  
>   Felipe
>  
> ----
> Felipe Knop knop at us.ibm.com
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314 T/L 293-9314
>  
>  
>  
> ----- Original message -----
> From: Jaime Pinto <pinto at scinet.utoronto.ca>
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: gpfsug-discuss at spectrumscale.org
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] SS 5.0.x and quota issues
> Date: Mon, May 18, 2020 9:56 AM
>  
> So Bob,
> Yes, we too have observed an uncharacteristic lag in the correction of GPFS's internal quota accounting since we updated from version 3.3 to version 4.x some 7-8 years ago. That lag remains through version 5.0.x as well, and it has persisted across several appliances (DDN, G200, GSS, ESS and now DSS-G). In our university environment there is also a lot of data churn, in particular small files.
> 
> The workaround has always been to periodically run mmcheckquota on the top independent fileset to expedite that correction. (I have a crontab script that measures the relative size of the in-doubt columns in the mmrepquota report, both blocks and inodes, and if either exceeds 2% for any USR/GRP/FILESET entry it runs mmcheckquota.)
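> 
> For the curious, a minimal sketch of that cron check (the device name "gpfs01" is a placeholder, and the -Y field names are assumptions to verify against the HEADER line your release prints):
> 
>     #!/bin/sh
>     # Flag any USR/GRP/FILESET entry whose in-doubt value exceeds 2% of
>     # its usage (blocks or inodes) and run mmcheckquota if one is found.
>     FS=gpfs01
>     PCT=2
>     mmrepquota -u -g -j -Y "$FS" | awk -F: -v t="$PCT" '
>       /HEADER/ { for (i = 1; i <= NF; i++) h[$i] = i; next }  # name->column
>       {
>         bu = $h["blockUsage"]; bd = $h["blockInDoubt"]
>         fu = $h["filesUsage"]; fd = $h["filesInDoubt"]
>         if ((bu > 0 && 100*bd/bu > t) || (fu > 0 && 100*fd/fu > t))
>           exit 1                          # drift above threshold found
>       }' || mmcheckquota "$FS"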
> 
> We have opened support calls with IBM about this issue in the past, but we never got a solution, such as a GPFS configuration parameter we could adjust to have this correction happen automatically. We gave up.
> 
> And that raises the question: what do you mean by "... 5.0.4-4 ... that has a fix for mmcheckquota"? Doesn't mmcheckquota zero the in-doubt columns when you run it?
> 
> The fix should be in GPFS itself (something buried in the code across many versions). As far as I can tell there has never been anything wrong with mmcheckquota.
> 
> Thanks
> Jaime
> 
> 
> On 5/18/2020 08:59:09, Cregan, Bob wrote:
> > Hi,
> >        At Imperial we have been experiencing an issue with SS 5.0.x and quotas. The main symptom is a slow decay in the accuracy of reported quota usage when compared to the actual usage as reported by "du". The discrepancy can be as little as a few percent and as much as many hundreds of percent. We also sometimes see bizarre effects such as negative file counts being reported.
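> >
> > To quantify the discrepancy we compare the two views along these lines (example fileset and device names, and --block-size needs a reasonably recent release):
> >
> >     mmlsquota -j somefileset --block-size 1K gpfs01   # quota accounting
> >     du -sk /gpfs/gpfs01/somefileset                   # actual on-disk usage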
> >
> > We have been working with IBM and have installed the latest 5.0.4-4 (which has a fix for mmcheckquota), which we had been pinning our hopes on, but this has not worked.
> >
> > Is anyone else experiencing similar issues? We need to get an idea of whether this is an issue peculiar to our setup or a more general SS problem.
> >
> > We are using user and group quotas in a fileset context.
> >
> > Thanks
> >
> >
> > *Bob Cregan*
> > HPC Systems Analyst
> >
> > Information & Communication Technologies
> >
> > Imperial College London,
> > South Kensington Campus London, SW7 2AZ
> > T: 07712388129
> > E: b.cregan at imperial.ac.uk
> >
> > W: www.imperial.ac.uk/ict/rcs
> >
> > @imperialRCS @imperialRSE
> >
> 
>           ************************************
>            TELL US ABOUT YOUR SUCCESS STORIES
>           http://www.scinethpc.ca/testimonials 
>           ************************************
> ---
> Jaime Pinto - Storage Analyst
> SciNet HPC Consortium - Compute/Calcul Canada
> www.scinet.utoronto.ca - www.computecanada.ca
> University of Toronto
> 661 University Ave. (MaRS), Suite 1140
> Toronto, ON, M5G1M1
> P: 416-978-2755
> C: 416-505-1477
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


