[gpfsug-discuss] Client Latency and High NSD Server Load Average

Kumaran Rajaram kums at us.ibm.com
Thu Jun 4 16:19:18 BST 2020


Hi,

 >> I do notice nsd03/nsd04 have long waiters, but nsd01 doesn't (nsd02-ib
 is offline for now):

Please issue "mmlsdisk <fs> -m" on an NSD client to determine which NSD
server is actively serving each NSD. Since nsd02-ib is offline, it is possible
that some servers are serving more NSDs than the rest.

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1pdg_PoorPerformanceDuetoDiskFailure.htm
https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1pdg_HealthStateOfNSDserver.htm
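
To illustrate how an offline server can skew the distribution, here is a
minimal Python sketch. It assumes that each NSD is served by the first
server in its list that is still available; that assumption, the helper
names (active_servers, load_per_server) and the four-NSD sample are
illustrative only. The server lists are copied from the listing quoted
later in this thread. The output of "mmlsdisk <fs> -m" remains the
authoritative answer.

```python
from collections import Counter

# Server lists copied from the NSD listing quoted later in this thread.
NSD_SERVERS = {
    "tier2_001": ["nsd02-ib", "nsd03-ib", "nsd04-ib", "tsm01-ib", "nsd01-ib"],
    "tier2_002": ["nsd03-ib", "nsd04-ib", "tsm01-ib", "nsd01-ib", "nsd02-ib"],
    "tier2_003": ["nsd04-ib", "tsm01-ib", "nsd01-ib", "nsd02-ib", "nsd03-ib"],
    "tier2_004": ["tsm01-ib", "nsd01-ib", "nsd02-ib", "nsd03-ib", "nsd04-ib"],
}


def active_servers(nsd_servers, offline=frozenset()):
    """Map each NSD to the first server in its list that is not offline.

    ASSUMPTION: the first available server in the list serves the NSD.
    """
    return {
        nsd: next((s for s in servers if s not in offline), None)
        for nsd, servers in nsd_servers.items()
    }


def load_per_server(nsd_servers, offline=frozenset()):
    """Count how many NSDs each server would actively serve."""
    active = active_servers(nsd_servers, offline)
    return Counter(server for server in active.values() if server is not None)


if __name__ == "__main__":
    print("all servers up:  ", dict(load_per_server(NSD_SERVERS)))
    print("nsd02-ib offline:", dict(load_per_server(NSD_SERVERS, {"nsd02-ib"})))
```

Under that assumption, the NSDs that list nsd02-ib first fall through to
the next server in their list, so that server ends up with a larger share
than its peers.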

>> From the waiters you provided I would guess there is something amiss
with some of your storage systems.

Please ensure that no disk rebuild is in progress (in the storage subsystem)
for any of the NSDs/storage volumes, as a rebuild can sometimes degrade
block-level performance and thus increase latency, especially for write
operations. Please also ensure that the hardware components constituting the
Spectrum Scale stack are healthy and performing optimally.

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1pdg_pspduetosyslevelcompissue.htm

Please refer to the Spectrum Scale documentation (link below) for potential
causes (e.g. a Scale maintenance operation such as mmapplypolicy/mmrestripefs
in progress, or slow disks) that can be contributing to this issue:

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1pdg_performanceissues.htm

Thanks and Regards,
-Kums

Kumaran Rajaram
Spectrum Scale Development, IBM Systems
kums at us.ibm.com




From:	"Frederick Stock" <stockf at us.ibm.com>
To:	gpfsug-discuss at spectrumscale.org
Cc:	gpfsug-discuss at spectrumscale.org
Date:	06/04/2020 07:08 AM
Subject:	[EXTERNAL] Re: [gpfsug-discuss] Client Latency and High NSD
            Server Load Average
Sent by:	gpfsug-discuss-bounces at spectrumscale.org



From the waiters you provided I would guess there is something amiss with
some of your storage systems.  Since those waiters are on NSD servers they
are waiting for IO requests to the kernel to complete.  Generally IOs are
expected to complete in milliseconds, not seconds.  You could look at the
output of "mmfsadm dump nsd" to see how the GPFS IO queues are working but
that would be secondary to checking your storage systems.
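
As a rough aid for spotting such waiters, here is a small Python sketch
that parses text in the "mmdiag --waiters" format shown later in this
thread and keeps the entries above a threshold. The 1-second default
threshold and the function names (parse_waiters, slow_waiters) are
assumptions for illustration, not GPFS-defined values. The whitespace
joining only exists to cope with mail-wrapped lines.

```python
import re

# Matches entries such as:
#   Waiting 6.5113 sec since 17:17:33, monitored, thread 4175 NSDThread: for I/O completion
WAITER_RE = re.compile(
    r"Waiting\s+(?P<secs>\d+(?:\.\d+)?)\s+sec since\s+(?P<since>[\d:]+),"
    r".*?thread\s+(?P<thread>\d+)\s+(?P<name>\w+):\s*(?P<reason>.*)"
)


def parse_waiters(text):
    """Return one dict per waiter found in mmdiag-style text."""
    # Collapse all whitespace so entries wrapped across lines are rejoined.
    normalized = " ".join(text.split())
    waiters = []
    # Split in front of every "Waiting " so each chunk holds one entry.
    for chunk in re.split(r"(?=Waiting )", normalized):
        match = WAITER_RE.match(chunk)
        if match is None:
            continue  # banner lines, host names, empty chunks
        waiters.append({
            "seconds": float(match.group("secs")),
            "since": match.group("since"),
            "thread": int(match.group("thread")),
            "name": match.group("name"),
            "reason": match.group("reason").strip(),
        })
    return waiters


def slow_waiters(text, threshold_sec=1.0):
    """Keep waiters at or above threshold_sec (assumed cut-off), longest first."""
    slow = [w for w in parse_waiters(text) if w["seconds"] >= threshold_sec]
    return sorted(slow, key=lambda w: w["seconds"], reverse=True)


if __name__ == "__main__":
    sample = """=== mmdiag: waiters ===
Waiting 6.5113 sec since 17:17:33, monitored, thread 4175 NSDThread: for
I/O completion
Waiting 0.2548 sec since 17:21:47, monitored, thread 30513 NSDThread: for
I/O completion
"""
    for w in slow_waiters(sample):
        print(f'{w["seconds"]:.4f}s thread {w["thread"]} {w["name"]}: {w["reason"]}')
```

Run against the output of each NSD server in turn, this gives a quick
per-server view of which ones have IOs taking seconds rather than
milliseconds.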

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
stockf at us.ibm.com


 ----- Original message -----
 From: "Saula, Oluwasijibomi" <oluwasijibomi.saula at ndsu.edu>
 Sent by: gpfsug-discuss-bounces at spectrumscale.org
 To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
 Cc:
 Subject: [EXTERNAL] Re: [gpfsug-discuss] Client Latency and High NSD
 Server Load Average
 Date: Wed, Jun 3, 2020 6:24 PM

 Frederick,

 Yes on both counts! mmdf shows fairly uniform usage (i.e. 5 NSDs out of 30
 report 65% free; all others are uniform at 58% free)...

 The NSD servers for each disk are listed in round-robin order as well, for
 example:

  gpfs1         tier2_001    nsd02-ib,nsd03-ib,nsd04-ib,tsm01-ib,nsd01-ib
  gpfs1         tier2_002    nsd03-ib,nsd04-ib,tsm01-ib,nsd01-ib,nsd02-ib
  gpfs1         tier2_003    nsd04-ib,tsm01-ib,nsd01-ib,nsd02-ib,nsd03-ib
  gpfs1         tier2_004    tsm01-ib,nsd01-ib,nsd02-ib,nsd03-ib,nsd04-ib


 Any other potential culprits to investigate?

 I do notice nsd03/nsd04 have long waiters, but nsd01 doesn't (nsd02-ib is
 offline for now):
 [nsd03-ib ~]# mmdiag --waiters
 === mmdiag: waiters ===
 Waiting 6.5113 sec since 17:17:33, monitored, thread 4175 NSDThread: for
 I/O completion
 Waiting 6.3810 sec since 17:17:33, monitored, thread 4127 NSDThread: for
 I/O completion
 Waiting 6.1959 sec since 17:17:34, monitored, thread 4144 NSDThread: for
 I/O completion

 nsd04-ib:

 Waiting 13.1386 sec since 17:19:09, monitored, thread 9971 NSDThread: for
 I/O completion
 Waiting 10.3562 sec since 17:19:12, monitored, thread 9958 NSDThread: for
 I/O completion
 Waiting 10.0338 sec since 17:19:12, monitored, thread 9951 NSDThread: for
 I/O completion



 tsm01-ib:

 Waiting 8.1211 sec since 17:20:24, monitored, thread 3644 NSDThread: for
 I/O completion
 Waiting 7.6690 sec since 17:20:24, monitored, thread 3641 NSDThread: for
 I/O completion
 Waiting 7.4969 sec since 17:20:24, monitored, thread 3658 NSDThread: for
 I/O completion
 Waiting 7.3573 sec since 17:20:24, monitored, thread 3642 NSDThread: for
 I/O completion



 nsd01-ib:

 Waiting 0.2548 sec since 17:21:47, monitored, thread 30513 NSDThread: for
 I/O completion
 Waiting 0.1502 sec since 17:21:47, monitored, thread 30529 NSDThread: for
 I/O completion








 Thanks,

 Oluwasijibomi (Siji) Saula
 HPC Systems Administrator  /  Information Technology
 Research 2 Building 220B / Fargo ND 58108-6050
 p: 701.231.7749 / www.ndsu.edu

 From: gpfsug-discuss-bounces at spectrumscale.org
 <gpfsug-discuss-bounces at spectrumscale.org> on behalf of
 gpfsug-discuss-request at spectrumscale.org
 <gpfsug-discuss-request at spectrumscale.org>
 Sent: Wednesday, June 3, 2020 4:56 PM
 To: gpfsug-discuss at spectrumscale.org <gpfsug-discuss at spectrumscale.org>
 Subject: gpfsug-discuss Digest, Vol 101, Issue 6


 Today's Topics:

    1. Introducing SSUG::Digital
       (Simon Thompson (Spectrum Scale User Group Chair))
    2. Client Latency and High NSD Server Load Average
       (Saula, Oluwasijibomi)
    3. Re: Client Latency and High NSD Server Load Average
       (Frederick Stock)


 ----------------------------------------------------------------------

 Message: 1
 Date: Wed, 03 Jun 2020 20:11:17 +0100
 From: "Simon Thompson (Spectrum Scale User Group Chair)"
         <chair at spectrumscale.org>
 To: "gpfsug-discuss at spectrumscale.org"
         <gpfsug-discuss at spectrumscale.org>
 Subject: [gpfsug-discuss] Introducing SSUG::Digital
 Message-ID: <AB923605-E4FE-45EC-A1EA-B61A4A147B06 at spectrumscale.org>
 Content-Type: text/plain; charset="utf-8"

 Hi All,

 I'm happy that we can finally announce SSUG:Digital, which will be a series
 of online sessions based on the types of topic we present at our in-person
 events.

 I know it's taken us a while to get this up and running, but we've been
 working on trying to get the format right. So save the date for the first
 SSUG:Digital event, which will take place on Thursday 18th June 2020 at 4pm
 BST. That's:

 San Francisco, USA at 08:00 PDT
 New York, USA at 11:00 EDT
 London, United Kingdom at 16:00 BST
 Frankfurt, Germany at 17:00 CEST
 Pune, India at 20:30 IST

 We estimate about 90 minutes for the first session, and please forgive any
 teething troubles as we get this going!

 (I know the times don't work for everyone in the global community!)

 Each of the sessions we run over the next few months will be a different
 Spectrum Scale Experts or Deep Dive session.

 More details at:

 https://www.spectrumscaleug.org/introducing-ssugdigital/

 (We'll announce the speakers and topic of the first session in the next
 few days.)

 Thanks to Ulf, Kristy, Bill, Bob and Ted for their help and guidance in
 getting this going.

 We're keen to include some user talks and site updates later in the
 series, so please let me know if you might be interested in presenting in
 this format.

 Simon Thompson
 SSUG Group Chair


 ------------------------------

 Message: 2
 Date: Wed, 3 Jun 2020 21:45:05 +0000
 From: "Saula, Oluwasijibomi" <oluwasijibomi.saula at ndsu.edu>
 To: "gpfsug-discuss at spectrumscale.org"
         <gpfsug-discuss at spectrumscale.org>
 Subject: [gpfsug-discuss] Client Latency and High NSD Server Load
         Average
 Message-ID:

 <DM6PR08MB5324B014BC4AA03CCF25557598880 at DM6PR08MB5324.namprd08.prod.outlook.com>


 Content-Type: text/plain; charset="iso-8859-1"


 Hello,

 Has anyone faced a situation where a majority of NSD servers have a high load
 average and a minority don't?

 Also, is NSD server latency that is 10x higher for write operations than for
 read operations expected under any circumstances?

 We are seeing client latency between 6 and 9 seconds and are wondering if
 some GPFS configuration or NSD server condition may be triggering this
 poor performance.



 Thanks,


 Oluwasijibomi (Siji) Saula

 HPC Systems Administrator  /  Information Technology



 Research 2 Building 220B / Fargo ND 58108-6050

 p: 701.231.7749 / www.ndsu.edu<http://www.ndsu.edu/>




 ------------------------------

 Message: 3
 Date: Wed, 3 Jun 2020 21:56:04 +0000
 From: "Frederick Stock" <stockf at us.ibm.com>
 To: gpfsug-discuss at spectrumscale.org
 Cc: gpfsug-discuss at spectrumscale.org
 Subject: Re: [gpfsug-discuss] Client Latency and High NSD Server Load
         Average
 Message-ID:

 <OF4256061C.B3CA8966-ON0025857C.00786C34-0025857C.00787D7D at notes.na.collabserv.com>


 Content-Type: text/plain; charset="us-ascii"

 An HTML attachment was scrubbed...
 URL: <
 http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20200603/c252f3b9/attachment.html
 >

 ------------------------------

 _______________________________________________
 gpfsug-discuss mailing list
 gpfsug-discuss at spectrumscale.org
 http://gpfsug.org/mailman/listinfo/gpfsug-discuss


 End of gpfsug-discuss Digest, Vol 101, Issue 6
 **********************************************


