[gpfsug-discuss] Long IO waiters and IBM Storwize V5030

Fri May 28 16:57:52 BST 2021

Hi Folks,

We are seeing some very long IO waiters (15 to 17 seconds) in our GPFS cluster:

# mmdiag --waiters

=== mmdiag: waiters ===

Waiting 17.3823 sec since 10:41:01, monitored, thread 21761 NSDThread: for I/O completion

Waiting 16.6140 sec since 10:41:02, monitored, thread 21730 NSDThread: for I/O completion

Waiting 15.3004 sec since 10:41:03, monitored, thread 21763 NSDThread: for I/O completion

Waiting 15.2013 sec since 10:41:03, monitored, thread 22175
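
For anyone who wants to tally waiters like these over time, here is a minimal Python sketch. It assumes the line layout shown above (other GPFS releases may format waiters differently), and the names parse_waiters and summarize and the 10-second threshold are only illustrative, not part of any GPFS tooling.

```python
import re

# Matches waiter lines in the layout shown above. The thread name and reason
# are optional because one of the sample lines ends after the thread id.
WAITER_RE = re.compile(
    r"Waiting (?P<secs>\d+\.\d+) sec since (?P<since>\d{2}:\d{2}:\d{2}), "
    r"(?:monitored, )?thread (?P<thread>\d+)"
    r"(?: (?P<name>\w+): (?P<reason>.+))?"
)


def parse_waiters(text):
    """Return one dict per waiter line found in captured mmdiag output."""
    waiters = []
    for line in text.splitlines():
        m = WAITER_RE.search(line)
        if m is None:
            continue  # skip headers and blank lines
        waiters.append({
            "seconds": float(m.group("secs")),
            "since": m.group("since"),
            "thread": int(m.group("thread")),
            "name": m.group("name"),      # None if the line was cut short
            "reason": m.group("reason"),  # None if the line was cut short
        })
    return waiters


def summarize(waiters, threshold=10.0):
    """Count waiters at or above `threshold` seconds (hypothetical cutoff)."""
    long_ones = [w for w in waiters if w["seconds"] >= threshold]
    return {
        "count": len(waiters),
        "over_threshold": len(long_ones),
        "max_seconds": max((w["seconds"] for w in waiters), default=0.0),
    }
```

Feeding it the saved output of repeated mmdiag --waiters runs would show whether the long waiters line up with particular times or threads.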

GPFS support is pointing to our IBM Storwize V5030 disk system as the source of the latency. Unfortunately, we don't have paid support for that system, so we are polling the list for anyone who might be able to assist.

Does anyone by chance have any experience with IBM Storwize V5030 or possess a problem determination guide for the V5030?

We've briefly reviewed the V5030 management portal, but we still haven't identified a cause for the increased latencies (read ~129 ms, write ~198 ms).

Granted, we have some heavy client workloads, yet we seem to experience this drastic drop in performance every couple of months, probably exacerbated by heavy IO demands.

Any assistance would be much appreciated.

Thanks,

Oluwasijibomi (Siji) Saula

HPC Systems Administrator  /  Information Technology

Research 2 Building 220B / Fargo ND 58108-6050

p: 701.231.7749 / www.ndsu.edu
