[gpfsug-discuss] Performance problems + (MultiThreadWorkInstanceCond), reason 'waiting for helper threads'

Caubet Serrabou Marc (PSI) marc.caubet@psi.ch
Thu Apr 18 16:32:58 BST 2019


Hi all,

I would appreciate some hints on the following problem:

Waiting 26.6431 sec since 17:18:32, ignored, thread 38298 NSPDDiscoveryRunQueueThread: on ThCond 0x7FC98EB6A2B8 (MultiThreadWorkInstanceCond), reason 'waiting for helper threads'
Waiting 2.7969 sec since 17:18:55, monitored, thread 39736 NSDThread: for I/O completion
Waiting 2.8024 sec since 17:18:55, monitored, thread 39580 NSDThread: for I/O completion
Waiting 3.0435 sec since 17:18:55, monitored, thread 39448 NSDThread: for I/O completion

I am testing a new GPFS setup (a GPFS client cluster of computing nodes remotely mounting the storage GPFS cluster) and I am running 65 gpfsperf commands (one command per client, in parallel) as follows:

/usr/lpp/mmfs/samples/perf/gpfsperf create seq /gpfs/home/caubet_m/gpfsperf/$(hostname).txt -fsync -n 24g -r 16m -th 8
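
In case it helps to reproduce, a run like this can be fanned out with a simple ssh loop over the clients (the hostname pattern below is an assumption based on merlin-c-001; a parallel shell such as pdsh would do the same job):

# launch gpfsperf on 65 clients in parallel, then wait for all of them
for n in $(seq -w 1 65); do
    ssh merlin-c-0${n} '/usr/lpp/mmfs/samples/perf/gpfsperf create seq /gpfs/home/caubet_m/gpfsperf/$(hostname).txt -fsync -n 24g -r 16m -th 8' &
done
wait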

I am unable to reach more than 6.5 GB/s (Lenovo DSS G240, GPFS 5.0.2-1, testing on a 'home' filesystem with a 1MB blocksize and 8KB subblocks). After several seconds I see many waiters for I/O completion (up to 5 seconds),
and also the 'waiting for helper threads' message shown above. Can somebody explain the meaning of this message? How could I improve this?
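
(For reference, waiters like the ones above can be sampled live on the NSD servers with the supported mmdiag command; I can collect these on each DSS server while the benchmark is running:

mmdiag --waiters
)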

Current config in the storage cluster is:

[root@merlindssio02 ~]# mmlsconfig
Configuration data for cluster merlin.psi.ch:
---------------------------------------------
clusterName merlin.psi.ch
clusterId 1511090979434548295
autoload no
dmapiFileHandleSize 32
minReleaseLevel 5.0.2.0
ccrEnabled yes
nsdRAIDFirmwareDirectory /opt/lenovo/dss/firmware
cipherList AUTHONLY
maxblocksize 16m
[merlindssmgt01]
ignorePrefetchLUNCount yes
[common]
pagepool 4096M
[merlindssio01,merlindssio02]
pagepool 270089M
[merlindssmgt01,dssg]
pagepool 57684M
maxBufferDescs 2m
numaMemoryInterleave yes
[common]
prefetchPct 50
[merlindssmgt01,dssg]
prefetchPct 20
nsdRAIDTracks 128k
nsdMaxWorkerThreads 3k
nsdMinWorkerThreads 3k
nsdRAIDSmallThreadRatio 2
nsdRAIDThreadsPerQueue 16
nsdClientCksumTypeLocal ck64
nsdClientCksumTypeRemote ck64
nsdRAIDFlusherFWLogHighWatermarkMB 1000
nsdRAIDBlockDeviceMaxSectorsKB 0
nsdRAIDBlockDeviceNrRequests 0
nsdRAIDBlockDeviceQueueDepth 0
nsdRAIDBlockDeviceScheduler off
nsdRAIDMaxPdiskQueueDepth 128
nsdMultiQueue 512
verbsRdma enable
verbsPorts mlx5_0/1 mlx5_1/1
verbsRdmaSend yes
scatterBufferSize 256K
maxFilesToCache 128k
maxMBpS 40000
workerThreads 1024
nspdQueues 64
[common]
subnets 192.168.196.0/merlin-hpc.psi.ch;merlin.psi.ch
adminMode central

File systems in cluster merlin.psi.ch:
--------------------------------------
/dev/home
/dev/t16M128K
/dev/t16M16K
/dev/t1M8K
/dev/t4M16K
/dev/t4M32K
/dev/test
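
If it matters for suggestions: any of the tunables above can be changed per node or node class with mmchconfig and rolled back the same way (the node classes merlindssmgt01,dssg are from the output above; the values below are purely illustrative, not a recommendation, and thread settings typically only take effect after a daemon restart):

# illustrative only: raise the NSD worker thread counts on the DSS node class
mmchconfig nsdMinWorkerThreads=4k,nsdMaxWorkerThreads=4k -N merlindssmgt01,dssg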

And for the computing cluster:

[root@merlin-c-001 ~]# mmlsconfig
Configuration data for cluster merlin-hpc.psi.ch:
-------------------------------------------------
clusterName merlin-hpc.psi.ch
clusterId 14097036579263601931
autoload yes
dmapiFileHandleSize 32
minReleaseLevel 5.0.2.0
ccrEnabled yes
cipherList AUTHONLY
maxblocksize 16M
numaMemoryInterleave yes
maxFilesToCache 128k
maxMBpS 20000
workerThreads 1024
verbsRdma enable
verbsPorts mlx5_0/1
verbsRdmaSend yes
scatterBufferSize 256K
ignorePrefetchLUNCount yes
nsdClientCksumTypeLocal ck64
nsdClientCksumTypeRemote ck64
pagepool 32G
subnets 192.168.196.0/merlin-hpc.psi.ch;merlin.psi.ch
adminMode central

File systems in cluster merlin-hpc.psi.ch:
------------------------------------------
(none)

Thanks a lot and best regards,
Marc
_________________________________________
Paul Scherrer Institut
High Performance Computing
Marc Caubet Serrabou
Building/Room: WHGA/019A
Forschungsstrasse, 111
5232 Villigen PSI
Switzerland

Telephone: +41 56 310 46 67
E-Mail: marc.caubet@psi.ch