[gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Mengxing Cheng
mxcheng at uchicago.edu
Fri Mar 10 16:59:43 GMT 2017
Achim, thank you very much!
Mengxing
---
Mengxing Cheng, Ph.D.
HPC System Administrator
Research Computing Center
The University of Chicago
5607 S. Drexel Ave.
Chicago, IL 60637
email: mxcheng at uchicago.edu
phone: (773) 702-4070
________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Achim Rehor [Achim.Rehor at de.ibm.com]
Sent: Friday, March 10, 2017 10:56 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Hi,
mmfsck performance is highly dependent on the pagepool, so my best guess would be that the EMS node, with only 16GB of pagepool, is the bottleneck behind the slow progress.
You might want to temporarily raise the pagepool on the service node to, let's say, 64GB, or restrict mmfsck to run only on the storage nodes.
Note: the filesystem manager node will always be part of the mmfsck nodes, so if it is currently located on the service node, move it to a storage node with mmchmgr first.
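Roughly, the above could look something like this. The device and node names are placeholders for illustration; adjust them to your cluster, and check the mmfsck man page on your release before relying on the -N option:

```shell
# Placeholders: fs1 = filesystem device, ems1 = service/EMS node,
# gss01/gss02 = GSS storage nodes.

# Temporarily raise the pagepool on the service node
# (-I applies the change immediately without persisting it across restarts):
mmchconfig pagepool=64G -N ems1 -I

# Move the filesystem manager to a storage node before the check:
mmchmgr fs1 gss01

# Run mmfsck in read-only mode (-n), restricted to the storage nodes
# (-N is not available on all releases):
mmfsck fs1 -n -V -N gss01,gss02

# Restore the original pagepool afterwards:
mmchconfig pagepool=16G -N ems1 -I
```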
Mit freundlichen Grüßen / Kind regards
Achim Rehor
________________________________
Software Technical Support Specialist AIX / EMEA HPC Support
IBM Certified Advanced Technical Expert - Power Systems with AIX
TSCC Software Service, Dept. 7922
Global Technology Services
________________________________
Phone: +49-7034-274-7862
E-Mail: Achim.Rehor at de.ibm.com
IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Germany
________________________________
From: Mengxing Cheng <mxcheng at uchicago.edu>
To: Eric Sperley <esperle at us.ibm.com>
Cc: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 03/09/2017 10:24 PM
Subject: Re: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________
Eric, thank you very much for replying. Here is the memory configuration and current usage; note that mmfsck is not running now. The two GSS servers each have the same 256GB of memory, and the service node has 128GB.
1. service node:

             total       used       free     shared    buffers     cached
Mem:           125         58         66          0          0          4
-/+ buffers/cache:          53         71
Swap:            7          0          7

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12990 root       0 -20 71.0g  43g 885m S  7.6 34.4  9306:00  mmfsd
2. gss nodes:

             total       used       free     shared  buff/cache   available
Mem:           251        210         37          0           4          36
Swap:            3          0          3

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM    TIME+  COMMAND
36770 root       0 -20  0.216t 0.192t 2.667g S 48.9 78.1 75684:09  /usr/lpp/mmfs/bin/mmfsd
The GSS nodes' memory usage is so high because their pagepool is set to 192GB, while the service node has a 16GB pagepool.
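For anyone comparing setups, the per-node pagepool setting and mmfsd's actual memory usage can be checked with standard commands (a sketch; output formats vary by release):

```shell
# Show the configured pagepool value, including any per-node overrides:
mmlsconfig pagepool

# On a given node, show mmfsd's internal memory usage, including the pagepool:
mmdiag --memory
```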
Mengxing
---
Mengxing Cheng, Ph.D.
HPC System Administrator
Research Computing Center
The University of Chicago
5607 S. Drexel Ave.
Chicago, IL 60637
email: mxcheng at uchicago.edu
phone: (773) 702-4070
________________________________
From: Eric Sperley [esperle at us.ibm.com]
Sent: Thursday, March 09, 2017 3:13 PM
To: Mengxing Cheng
Cc: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Mengxing,
It is nice meeting you.
I have seen a situation where the amount of RAM on a node can affect mmfsck times. Do all the nodes have the same amount of RAM, or does the slow running node have less RAM?
Best Regards, Eric
Eric Sperley
SDI Architect
IBM Systems
esperle at us.ibm.com
+1 503 308 8721
"Carpe Diem"
From: Mengxing Cheng <mxcheng at uchicago.edu>
To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Date: 03/09/2017 11:24 AM
Subject: [gpfsug-discuss] mmfsck runs very slowly on a small filesystem
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________
Dear all,
My name is Mengxing Cheng and I am an HPC system administrator at the University of Chicago. We have a GSS26 running gss2.5.10.3-3b and gpfs-4.2.0.3.
Recently, we ran mmfsck on a relatively small filesystem with 14TB of blocks and 73863102 inodes, but it was unusually slow and could not finish within 48 hours. In contrast, an mmfsck run on a filesystem of the same size and inode count, but sitting on traditional IBM DS3512 storage, took only 2 hours to complete.
In particular, with mmfsck running in parallel across 3 nodes within the GSS storage cluster, we noticed that one GSS storage server scanned inodes much more slowly than the other GSS storage server and the quorum service node.
Has anyone experienced the same mmfsck performance issue?
Could anyone make recommendations on how to troubleshoot and improve mmfsck performance?
Thank you!
Mengxing
---
Mengxing Cheng, Ph.D.
HPC System Administrator
Research Computing Center
The University of Chicago
5607 S. Drexel Ave.
Chicago, IL 60637
email: mxcheng at uchicago.edu
phone: (773) 702-4070
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss