<div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="msg2465296025784317369"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_2465296025784317369WordSection1"><p class="MsoNormal"><span style="font-size:11pt">Are you seeing the issues across the whole file system or in certain areas? </span></p></div></div></div></blockquote><div><br></div><div>Only with accounts in GPFS, local accounts and root do not gt this.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="msg2465296025784317369"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_2465296025784317369WordSection1"><p class="MsoNormal"><span style="font-size:11pt">That sounds like inode exhaustion to me (and based on it not being block exhaustion as you’ve demonstrated).
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt">What does a “df -i /cluster” show you? </span></p></div></div></div></blockquote><div><br></div><div>We bumped it up a few weeks ago:</div><div><font face="monospace">df -i /cluster<br>Filesystem Inodes IUsed IFree IUse% Mounted on<br>cluster 276971520 154807697 122163823 56% /cluster</font><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="msg2465296025784317369"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_2465296025784317369WordSection1"><p class="MsoNormal"><span style="font-size:11pt">Or if this is only in a certain area you can “cd” into that directory and run a “df -i .”</span></p></div></div></div></blockquote><div><br></div><div>As root on a login node;</div><font face="monospace">df -i<br>Filesystem Inodes IUsed IFree IUse% Mounted on<br>/dev/sda2 20971520 169536 20801984 1% /<br>devtmpfs 12169978 528 12169450 1% /dev<br>tmpfs 12174353 1832 12172521 1% /run<br>tmpfs 12174353 77 12174276 1% /dev/shm<br>tmpfs 12174353 15 12174338 1% /sys/fs/cgroup<br>/dev/sda1 0 0 0 - /boot/efi<br>/dev/sda3 52428800 2887 52425913 1% /var<br>/dev/sda7 277368832 35913 277332919 1% /local<br>/dev/sda5 104857600 398 104857202 1% /tmp<br>tmpfs 12174353 1 12174352 1% /run/user/551336<br>tmpfs 12174353 1 12174352 1% /run/user/0<br>moto 276971520 154807697 122163823 56% /cluster<br>tmpfs 12174353 3 12174350 1% /run/user/441245<br>tmpfs 12174353 12 12174341 1% /run/user/553562<br>tmpfs 12174353 1 12174352 1% /run/user/525583<br>tmpfs 12174353 1 12174352 1% /run/user/476374<br>tmpfs 12174353 1 12174352 1% /run/user/468934<br>tmpfs 12174353 5 12174348 1% /run/user/551200<br>tmpfs 12174353 1 12174352 1% /run/user/539143<br>tmpfs 12174353 1 12174352 1% /run/user/488676<br>tmpfs 12174353 1 12174352 1% /run/user/493713<br>tmpfs 12174353 1 12174352 1% /run/user/507831<br>tmpfs 12174353 1 12174352 1% /run/user/549822<br>tmpfs 12174353 1 12174352 1% /run/user/500569<br>tmpfs 12174353 1 12174352 1% /run/user/443748<br>tmpfs 12174353 1 12174352 1% /run/user/543676<br>tmpfs 12174353 1 12174352 1% /run/user/451446<br>tmpfs 12174353 1 12174352 1% /run/user/497945<br>tmpfs 12174353 6 12174347 1% /run/user/554672<br>tmpfs 12174353 32 12174321 1% /run/user/554653<br>tmpfs 12174353 1 12174352 1% /run/user/30094<br>tmpfs 12174353 1 12174352 1% /run/user/470790<br>tmpfs 12174353 59 12174294 1% /run/user/553037<br>tmpfs 12174353 1 12174352 1% /run/user/554670<br>tmpfs 12174353 1 12174352 1% /run/user/548236<br>tmpfs 12174353 1 12174352 1% /run/user/547288<br></font><div><font face="monospace">tmpfs 12174353 1 12174352 1% /run/user/547289 </font></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="msg2465296025784317369"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_2465296025784317369WordSection1"><p class="MsoNormal"><span style="font-size:11pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt">You may need to allocate more inodes to an independent inode fileset somewhere. Especially with something as old as 4.2.3 you won’t have auto-inode expansion for the filesets.</span></p></div></div></div></blockquote><div><br></div><div>Do we have to restart any service after upping the inode count?</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="msg2465296025784317369"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_2465296025784317369WordSection1"><p class="MsoNormal"><span style="font-size:11pt"><u></u><u></u></span></p>
<p class="MsoNormal"><br></p>
<p class="MsoNormal"><span style="font-size:11pt">Best,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:Calibri,sans-serif;color:black">J.D. Maloney</span><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black">Lead</span><span style="font-size:10.5pt;font-family:Calibri,sans-serif;color:black"> HPC Storage Engineer | Storage Enabling Technologies Group</span><span style="font-size:11pt;font-family:Calibri,sans-serif;color:black"><u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:Calibri,sans-serif;color:black">National Center for Supercomputing Applications (NCSA)</span></p></div></div></div></blockquote><div><br></div><div>Ho JD I took an intermediate LCI workshop with you at Univ of Cincinnati!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="msg2465296025784317369"><div lang="EN-US" style="overflow-wrap: break-word;"><div class="m_2465296025784317369WordSection1"><p class="MsoNormal"><span style="font-size:11pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<div id="m_2465296025784317369mail-editor-reference-message-container">
<div>
<div style="border-width:1pt medium medium;border-style:solid none none;border-color:rgb(181,196,223) currentcolor currentcolor;padding:3pt 0in 0in">
<p class="MsoNormal" style="margin-bottom:12pt"><b><span style="color:black">From:
</span></b><span style="color:black">gpfsug-discuss <<a href="mailto:gpfsug-discuss-bounces@gpfsug.org" target="_blank">gpfsug-discuss-bounces@gpfsug.org</a>> on behalf of Rob Kudyba <<a href="mailto:rk3199@columbia.edu" target="_blank">rk3199@columbia.edu</a>><br>
<b>Date: </b>Thursday, June 6, 2024 at 3:50</span><span style="font-family:Arial,sans-serif;color:black"> </span><span style="color:black">PM<br>
<b>To: </b><a href="mailto:gpfsug-discuss@gpfsug.org" target="_blank">gpfsug-discuss@gpfsug.org</a> <<a href="mailto:gpfsug-discuss@gpfsug.org" target="_blank">gpfsug-discuss@gpfsug.org</a>><br>
<b>Subject: </b>[gpfsug-discuss] No space left on device, but plenty of quota space for inodes and blocks<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal">Running GPFS 4.2.3 on a DDN GridScaler and users are getting the <span style="font-family:"Courier New"">No space left on device</span> message when trying to write to a file. In <span style="font-family:"Courier New"">/var/adm/ras/mmfs.log
</span> the only recent errors are these:</p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">2024-06-06_15:51:22.311-0400: mmcommon getContactNodes cluster failed. Return code -1.<br>
2024-06-06_15:51:22.311-0400: The previous error was detected on node x.x.x.x (headnode).<br>
2024-06-06_15:53:25.088-0400: mmcommon getContactNodes cluster failed. Return code -1.<br>
2024-06-06_15:53:25.088-0400: The previous error was detected on node x.x.x.x (headnode).</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">according to <a href="https://urldefense.com/v3/__https:/www.ibm.com/docs/en/storage-scale/5.1.9?topic=messages-6027-615__;!!DZ3fjg!4ZyUNmTiGNp6C3Yls1wqW-RdRGa8n-ZmfZ0y0i-y6pce_ZIFSaefpOWvKIYIXspKjfREPtf3BRuO5VqAS6Y9UXQ$" target="_blank">
https://www.ibm.com/docs/en/storage-scale/5.1.9?topic=messages-6027-615</a> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<blockquote style="border-width:medium medium medium 1pt;border-style:none none none solid;border-color:currentcolor currentcolor currentcolor rgb(204,204,204);padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal">Check the preceding messages, and consult the earlier chapters of this document. A frequent cause for such errors is lack of space in
<span style="font-family:"Courier New"">/var</span>.<u></u><u></u></p>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">We have plenty of space left.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> /usr/lpp/mmfs/bin/mmlsdisk cluster<br>
disk driver sector failure holds holds storage<br>
name type size group metadata data status availability pool<br>
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------<br>
S01_MDT200_1 nsd 4096 200 Yes No ready up system
<br>
S01_MDT201_1 nsd 4096 201 Yes No ready up system
<br>
S01_DAT0001_1 nsd 4096 100 No Yes ready up data1 <br>
S01_DAT0002_1 nsd 4096 101 No Yes ready up data1 <br>
S01_DAT0003_1 nsd 4096 100 No Yes ready up data1 <br>
S01_DAT0004_1 nsd 4096 101 No Yes ready up data1 <br>
S01_DAT0005_1 nsd 4096 100 No Yes ready up data1 <br>
S01_DAT0006_1 nsd 4096 101 No Yes ready up data1 <br>
S01_DAT0007_1 nsd 4096 100 No Yes ready up data1 </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> /usr/lpp/mmfs/bin/mmdf headnode <br>
disk disk size failure holds holds free KB free KB<br>
name in KB group metadata data in full blocks in fragments<br>
--------------- ------------- -------- -------- ----- -------------------- -------------------<br>
Disks in storage pool: system (Maximum disk size allowed is 14 TB)<br>
S01_MDT200_1 1862270976 200 Yes No 969134848 ( 52%) 2948720 ( 0%)
<br>
S01_MDT201_1 1862270976 201 Yes No 969126144 ( 52%) 2957424 ( 0%)
<br>
------------- -------------------- -------------------<br>
(pool total) 3724541952 1938260992 ( 52%) 5906144 ( 0%)<br>
<br>
Disks in storage pool: data1 (Maximum disk size allowed is 578 TB)<br>
S01_DAT0007_1 77510737920 100 No Yes 21080752128 ( 27%) 897723392 ( 1%)
<br>
S01_DAT0005_1 77510737920 100 No Yes 14507212800 ( 19%) 949412160 ( 1%)
<br>
S01_DAT0001_1 77510737920 100 No Yes 14503620608 ( 19%) 951327680 ( 1%)
<br>
S01_DAT0003_1 77510737920 100 No Yes 14509205504 ( 19%) 949340544 ( 1%)
<br>
S01_DAT0002_1 77510737920 101 No Yes 14504585216 ( 19%) 948377536 ( 1%)
<br>
S01_DAT0004_1 77510737920 101 No Yes 14503647232 ( 19%) 952892480 ( 1%)
<br>
S01_DAT0006_1 77510737920 101 No Yes 14504486912 ( 19%) 949072512 ( 1%)
<br>
------------- -------------------- -------------------<br>
(pool total) 542575165440 108113510400 ( 20%) 6598146304 ( 1%)<br>
<br>
============= ==================== ===================<br>
(data) 542575165440 108113510400 ( 20%) 6598146304 ( 1%)<br>
(metadata) 3724541952 1938260992 ( 52%) 5906144 ( 0%)<br>
============= ==================== ===================<br>
(total) 546299707392 110051771392 ( 22%) 6604052448 ( 1%)<br>
<br>
Inode Information<br>
-----------------<br>
Total number of used inodes in all Inode spaces: 154807668<br>
Total number of free inodes in all Inode spaces: 12964492<br>
Total number of allocated inodes in all Inode spaces: 167772160<br>
Total of Maximum number of inodes in all Inode spaces: 276971520</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Arial,sans-serif">On the head node:</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">df -h<br>
Filesystem Size Used Avail Use% Mounted on<br>
/dev/sda4 430G 216G 215G 51% /<br>
devtmpfs 47G 0 47G 0% /dev<br>
tmpfs 47G 0 47G 0% /dev/shm<br>
tmpfs 47G 4.1G 43G 9% /run<br>
tmpfs 47G 0 47G 0% /sys/fs/cgroup<br>
/dev/sda1 504M 114M 365M 24% /boot<br>
/dev/sda2 100M 9.9M 90M 10% /boot/efi<br>
x.x.x.:/nfs-share 430G 326G 105G 76% /nfs-share<br>
cluster 506T 405T 101T 81% /cluster<br>
tmpfs 9.3G 0 9.3G 0% /run/user/443748<br>
tmpfs 9.3G 0 9.3G 0% /run/user/547288<br>
tmpfs 9.3G 0 9.3G 0% /run/user/551336<br>
tmpfs 9.3G 0 9.3G 0% /run/user/547289</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Arial,sans-serif">The login nodes have plenty of space in
</span><span style="font-family:"Courier New"">/var:</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">/dev/sda3 50G 8.7G 42G 18% /var</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:Arial,sans-serif">What else should we check? We are just at 81% on the GPFS mounted file system but that should be enough for more space without these errors. Any recommended service(s) that we can restart?</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at <a href="http://gpfsug.org" rel="noreferrer" target="_blank">gpfsug.org</a><br>
<a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org" rel="noreferrer" target="_blank">http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org</a><br>
</div></blockquote></div></div>