<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
</head>
<body>
<div dir="ltr">Hi Andreas,
<div dir="ltr"><br>
</div>
I often start with an lsof to see who has files open on the troubled filesystem and then start stracing the various processes to see which is responsible. It ought to be a process blocked in uninterruptible sleep and ideally would be obvious but on a shared
machine it might not be.
<div dir="ltr"><br>
</div>
<div dir="ltr">Something else you could do is a reverse lookup of the disk addresseses in iohist using mmfileid. This won't help if these are transient files but it could point you in the right direction. Careful though it'll give your metadata disks a tickle
:) the syntax would be "mmfileid $FsName -d :$DiskAddrrss" where $DiskAddress is the 4th field from the iohist". It's not a quick command to return-- it could easily take up to a half hour but it should tell you which file path contains that disk address. <br>
<div dir="ltr"><br>
</div>
<div dir="ltr">Sometimes this is all too tedious and in that case grabbing some trace data can help. When you're experiencing I/O trouble you can run "mmtrace trace=def start" on the node, wait about a minute or so and then run "mmtrace stop". The resulting
trcrpt file is bit of a monster to go through but I do believe you can see which PIDs are responsible for the I/O given some sleuthing. If it comes to that let me know and I'll see if I can point you at some phrases to grep for. It's been a while since I've
done it.
<div dir="ltr"><br>
</div>
<div dir="ltr">-Aaron</div>
</div>
</div>
</div>
<span id="draft-break"></span><br>
<br>
<span id="draft-break"></span><br>
<br>
<div>
<div class="null" dir="auto">On May 30, 2017 at 09:13:09 EDT, Andreas Petzold (SCC) <andreas.petzold@kit.edu> wrote:<br class="null">
</div>
<blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px;" class="null">
<div class="null" dir="auto">
<div class="null">
<div class="null">
<div class="null">Hi John, <br class="null">
<br class="null">
iotop wasn't helpful. It seems to be overwhelmed by what is going on on <br class="null">
the machine. <br class="null">
<br class="null">
Cheers, <br class="null">
<br class="null">
Andreas <br class="null">
<br class="null">
On 05/30/2017 02:28 PM, John Hearns wrote: <br class="null">
> Andreas, <br class="null">
> This is a stupid reply, but please bear with me. <br class="null">
> Not exactly GPFS related, but I once managed an SGI CXFS (Clustered XFS filesystem) setup.
<br class="null">
> We also had a new application which did post-processing One of the users reported that a post-processing job would take about 30 minutes.
<br class="null">
> However when two or more of the same application were running the job would take several hours.
<br class="null">
> <br class="null">
> We finally found that this slowdown was due to the IO size, the application was using the default size.
<br class="null">
> We only found this by stracing the application and spending hours staring at the trace...
<br class="null">
> <br class="null">
> I am sure there are better tools for this, and I do hope you don’t have to strace every application.... really.
<br class="null">
> A good tool to get a general feel for IO pattersn is 'iotop'. It might help? <br class="null">
> <br class="null">
> <br class="null">
> <br class="null">
> <br class="null">
> -----Original Message----- <br class="null">
> From: gpfsug-discuss-bounces@spectrumscale.org [mailto:gpfsug-discuss-bounces@spectrumscale.org] On Behalf Of Andreas Petzold (SCC)
<br class="null">
> Sent: Tuesday, May 30, 2017 2:17 PM <br class="null">
> To: gpfsug-discuss@spectrumscale.org <br class="null">
> Subject: [gpfsug-discuss] Associating I/O operations with files/processes <br class="null">
> <br class="null">
> Dear group, <br class="null">
> <br class="null">
> first a quick introduction: at KIT we are running a 20+PB storage system with several large (1-9PB) file systems. We have a 14 node NSD server cluster and 5 small (~10 nodes) protocol node clusters which each mount one of the file systems. The protocol nodes
run server software (dCache, xrootd) specific to our users which primarily are the LHC experiments at CERN. GPFS version is 4.2.2 everywhere. All servers are connected via IB, while the protocol nodes communicate via Ethernet to their clients.
<br class="null">
> <br class="null">
> Now let me describe the problem we are facing. Since a few days, one of the protocol nodes shows a very strange and as of yet unexplained I/O behaviour. Before we were usually seeing reads like this (iohist example from a well behaved node):
<br class="null">
> <br class="null">
> 14:03:37.637526 R data 32:138835918848 8192 46.626 cli 0A417D79:58E3B179 172.18.224.19
<br class="null">
> 14:03:37.660177 R data 18:12590325760 8192 25.498 cli 0A4179AD:58E3AE66 172.18.224.14
<br class="null">
> 14:03:37.640660 R data 15:106365067264 8192 45.682 cli 0A4179AD:58E3ADD7 172.18.224.14
<br class="null">
> 14:03:37.657006 R data 35:130482421760 8192 30.872 cli 0A417DAD:58E3B266 172.18.224.21
<br class="null">
> 14:03:37.643908 R data 33:107847139328 8192 45.571 cli 0A417DAD:58E3B206 172.18.224.21
<br class="null">
> <br class="null">
> Since a few days we see this on the problematic node: <br class="null">
> <br class="null">
> 14:06:27.253537 R data 46:126258287872 8 15.474 cli 0A4179AB:58E3AE54 172.18.224.13
<br class="null">
> 14:06:27.268626 R data 40:137280768624 8 0.395 cli 0A4179AD:58E3ADE3 172.18.224.14
<br class="null">
> 14:06:27.269056 R data 46:56452781528 8 0.427 cli 0A4179AB:58E3AE54 172.18.224.13
<br class="null">
> 14:06:27.269417 R data 47:97273159640 8 0.293 cli 0A4179AD:58E3AE5A 172.18.224.14
<br class="null">
> 14:06:27.269293 R data 49:59102786168 8 0.425 cli 0A4179AD:58E3AE72 172.18.224.14
<br class="null">
> 14:06:27.269531 R data 46:142387326944 8 0.340 cli 0A4179AB:58E3AE54 172.18.224.13
<br class="null">
> 14:06:27.269377 R data 28:102988517096 8 0.554 cli 0A417879:58E3AD08 172.18.224.10
<br class="null">
> <br class="null">
> The number of read ops has gone up by O(1000) which is what one would expect when going from 8192 sector reads to 8 sector reads.
<br class="null">
> <br class="null">
> We have already excluded problems of node itself so we are focusing on the applications running on the node. What we'd like to to is to associate the I/O requests either with files or specific processes running on the machine in order to be able to blame
the correct application. Can somebody tell us, if this is possible and if now, if there are other ways to understand what application is causing this?
<br class="null">
> <br class="null">
> Thanks, <br class="null">
> <br class="null">
> Andreas <br class="null">
> <br class="null">
> -- <br class="null">
<br class="null">
<br class="null">
_______________________________________________ <br class="null">
gpfsug-discuss mailing list <br class="null">
gpfsug-discuss at spectrumscale.org <br class="null">
http://gpfsug.org/mailman/listinfo/gpfsug-discuss <br class="null">
</div>
</div>
</div>
</div>
</blockquote>
</div>
</body>
</html>