[gpfsug-discuss] Client Latency and High NSD Server Load Average

Wahl, Edward ewahl at osc.edu
Thu Jun 4 00:56:07 BST 2020


I saw something EXACTLY like this way back in the 3.x days, when I had a backend storage unit with flaky main memory and some enclosures were constantly flapping between controllers for ownership.  Some NSDs were affected, some were not.  I can imagine this could still happen in 4.x and 5.0.x with the right hardware problem.

Were things working before or is this a new installation?

What is the backend storage?

If you are using device-mapper-multipath, look for events in messages/syslog.  Incorrect path weighting? Using ALUA when it isn't supported? (That can be comically bad! I helped a friend diagnose that one at a customer once.)  Perhaps the wrong rr_weight or rr_min_io, so you get some wacky long IO queueing because your path_selector cannot keep up with the IO queue?
Most of this is easily fixed these days by using the vendor's suggested settings, IF the hardware is healthy...
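
For the "look for events in messages/syslog" step, here is a rough sketch of how one might tally multipathd lines in a log.  The function name and the keyword list are my own assumptions, not anything multipath-tools defines; the exact wording of multipathd messages varies by version and distro, so check what your own logs contain before trusting the counts.

```python
from collections import Counter

# Assumed substrings to look for; adjust them to match what your
# multipathd version actually writes to the log.
KEYWORDS = ("fail", "reinstat", "remaining active paths")


def summarize_multipath_events(lines, keywords=KEYWORDS):
    """Count log lines mentioning multipathd, bucketed by the first keyword that matches."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if "multipathd" not in lowered:
            continue  # skip lines from other daemons and the kernel
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
                break  # one bucket per line
    return counts
```

Pass it any iterable of lines, for example an open file handle on whichever log file your distribution writes to.  Many failed/reinstated lines on the heavily loaded NSD servers and few on the quiet ones would point at path flapping rather than GPFS configuration.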

Ed

________________________________
From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Saula, Oluwasijibomi <oluwasijibomi.saula at ndsu.edu>
Sent: Wednesday, June 3, 2020 5:45 PM
To: gpfsug-discuss at spectrumscale.org <gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] Client Latency and High NSD Server Load Average


Hello,

Anyone faced a situation where a majority of NSDs have a high load average and a minority don't?

Also, is it expected in any circumstance for NSD server latency to be 10x higher for write operations than for read operations?

We are seeing client latency between 6 and 9 seconds and are wondering if some GPFS configuration or NSD server condition may be triggering this poor performance.



Thanks,


Oluwasijibomi (Siji) Saula

HPC Systems Administrator  /  Information Technology



Research 2 Building 220B / Fargo ND 58108-6050

p: 701.231.7749 / www.ndsu.edu


