[gpfsug-discuss] mmfsck runs very slowly on a small filesystem

Mengxing Cheng mxcheng at uchicago.edu
Fri Mar 10 16:59:43 GMT 2017


Achim, thank you very much!

Mengxing


---
Mengxing Cheng, Ph.D.
HPC System Administrator
Research Computing Center
The University of Chicago

5607 S. Drexel Ave.
Chicago, IL 60637
email:    mxcheng at uchicago.edu
phone:  (773) 702-4070


________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Achim Rehor [Achim.Rehor at de.ibm.com]
Sent: Friday, March 10, 2017 10:56 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem

Hi,

mmfsck is highly dependent on the pagepool, so my best guess is that the EMS node, with only 16GB of pagepool, is the bottleneck causing the slow progress.
You might want to temporarily raise the pagepool on the service node to, let's say, 64GB, or restrict the mmfsck to run only on the storage nodes (see the sketch below).

Note: the file system manager node will always be part of the mmfsck node set, so if it is currently located on the service node, run mmchmgr to move it to a storage node first.
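
For illustration, a rough sketch of that sequence (the file system name fs1 and node names ems1/gssio1/gssio2 are placeholders, not the actual names in this cluster; check the mmlsmgr, mmchmgr, mmchconfig and mmfsck man pages for your Scale level before running anything):

    # see where the file system manager currently runs
    mmlsmgr fs1

    # move the manager off the service node onto a storage node
    mmchmgr fs1 gssio1

    # temporarily raise the pagepool on the service node
    # (-I applies the change immediately without making it persistent)
    mmchconfig pagepool=64G -I -N ems1

    # and/or restrict the check to the storage nodes,
    # if your mmfsck level supports node selection via -N
    mmfsck fs1 -n -N gssio1,gssio2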


Mit freundlichen Grüßen / Kind regards

Achim Rehor

________________________________


Software Technical Support Specialist AIX / EMEA HPC Support

IBM Certified Advanced Technical Expert - Power Systems with AIX
TSCC Software Service, Dept. 7922
Global Technology Services
________________________________

Phone:  +49-7034-274-7862
E-Mail: Achim.Rehor at de.ibm.com
IBM Deutschland, Am Weiher 24, 65451 Kelsterbach, Germany

________________________________








From:        Mengxing Cheng <mxcheng at uchicago.edu>
To:        Eric Sperley <esperle at us.ibm.com>
Cc:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:        03/09/2017 10:24 PM
Subject:        Re: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Sent by:        gpfsug-discuss-bounces at spectrumscale.org

________________________________



Eric, thank you very much for replying. Here is the memory configuration and current usage. Note that mmfsck is not running now. The two GSS servers have the same 256GB of memory, and the service node has 128GB.


1. service node:
                  total       used       free     shared    buffers     cached
Mem:           125         58         66          0          0          4
-/+ buffers/cache:         53         71
Swap:            7           0          7

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12990 root       0 -20 71.0g  43g 885m S  7.6 34.4   9306:00 mmfsd


2. gss nodes:
====================================
                 total        used        free      shared  buff/cache   available
Mem:            251         210          37           0           4          36
Swap:             3           0           3

PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
36770 root       0 -20  0.216t 0.192t 2.667g S  48.9 78.1  75684:09 /usr/lpp/mmfs/bin/mmfsd


The GSS nodes' memory usage is so high because their pagepool is set to 192GB, while the service node has a 16GB pagepool.
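
(For reference, the per-node pagepool settings and the daemon's actual memory use can be checked with something like the following; these are standard Scale commands and no specific node names are assumed:

    # configured pagepool values: cluster default plus any per-node overrides
    mmlsconfig pagepool

    # memory currently used by mmfsd on the node where this is run, including pagepool
    mmdiag --memory
)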


Mengxing

---
Mengxing Cheng, Ph.D.
HPC System Administrator
Research Computing Center
The University of Chicago

5607 S. Drexel Ave.
Chicago, IL 60637
email:    mxcheng at uchicago.edu
phone:  (773) 702-4070


________________________________

From: Eric Sperley [esperle at us.ibm.com]
Sent: Thursday, March 09, 2017 3:13 PM
To: Mengxing Cheng
Cc: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem

Mengxing,

It is nice meeting you.

I have seen a situation where the amount of RAM on a node can affect mmfsck times. Do all the nodes have the same amount of RAM, or does the slow running node have less RAM?
Best Regards, Eric



Eric Sperley
SDI Architect, IBM Systems
"Carpe Diem"
esperle at us.ibm.com
+15033088721





From: Mengxing Cheng <mxcheng at uchicago.edu>
To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Date: 03/09/2017 11:24 AM
Subject: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Sent by: gpfsug-discuss-bounces at spectrumscale.org

________________________________



Dear all,

My name is Mengxing Cheng and I am an HPC system administrator at the University of Chicago. We have a GSS26 running gss2.5.10.3-3b and gpfs-4.2.0.3.

Recently, we ran mmfsck on a relatively small filesystem with 14TB of block space and 73863102 inodes, but it was unusually slow and could not finish within 48 hours. In contrast, an mmfsck run on a filesystem of the same size and inode count, but sitting on traditional IBM DS3512 storage, took only 2 hours to complete.

In particular, with the mmfsck running in parallel on 3 nodes within the GSS storage cluster, we noticed that one GSS storage server scans inodes much more slowly than the other GSS storage server and the quorum service node.

Has anyone experienced the same mmfsck performance issue?
Could anyone make recommendations on how to troubleshoot and improve mmfsck performance?

Thank you!


Mengxing


---
Mengxing Cheng, Ph.D.
HPC System Administrator
Research Computing Center
The University of Chicago

5607 S. Drexel Ave.
Chicago, IL 60637
email: mxcheng at uchicago.edu
phone: (773) 702-4070

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


