<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hello people,<br>
Its quite a bit of time that i'm triing to solve a problem to our
GPFS system, without much luck so i think its time to ask some help.<br>
<br>
<b>First of a bit of introduction:</b><b><br>
</b><br>
Our GPFS system is made by 3xgss-26, In other words its made with 6x
servers ( 4x10g links each) and several disk enclosures SAS
attacked. The todal amount of spare its roughly 2PB, and the disks
are SATA ( except few SSD dedicated to logtip ). My metadata and on
dedicated vdisks, but both data and metadata vdiosks are in the same
declustered arrays and recovery groups, so in the end they share the
same spindles.<br>
<br>
The clients its a LSF farm configured as another cluster ( standard
multiclustering configuration) of roughly 600 nodes .<br>
<br>
<br>
<b>The issue:</b><b><br>
</b><br>
Recently we became aware that when some massive io request has been
done we experience a lot of client expells. Heres an example of our
logs:<br>
<blockquote>
<blockquote><tt>Fri Aug 15 12:40:24.680 2014: Expel 10.7.28.34
(gss03a) request from 172.16.4.138 (ebi3-138 in
ebi-cluster.ebi.ac.uk). Expelling: 172.16.4.138 (ebi3-138 in
ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:40:41.652 2014: Expel 10.7.28.66 (gss02b)
request from 10.7.34.38 (ebi5-037 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.34.38 (ebi5-037 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:40:45.754 2014: Expel 10.7.28.3 (gss01b)
request from 172.16.4.58 (ebi3-058 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.58 (ebi3-058 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:40:52.305 2014: Expel 10.7.28.66 (gss02b)
request from 10.7.34.68 (ebi5-067 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.34.68 (ebi5-067 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:41:17.069 2014: Expel 10.7.28.35 (gss03b)
request from 172.16.4.161 (ebi3-161 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.161 (ebi3-161 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:41:23.555 2014: Expel 10.7.28.67 (gss02a)
request from 172.16.4.136 (ebi3-136 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.136 (ebi3-136 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:41:54.258 2014: Expel 10.7.28.34 (gss03a)
request from 10.7.34.22 (ebi5-021 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.34.22 (ebi5-021 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:41:54.540 2014: Expel 10.7.28.66 (gss02b)
request from 10.7.34.57 (ebi5-056 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.34.57 (ebi5-056 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:42:57.288 2014: Expel 10.7.35.5 (ebi5-132 in
ebi-cluster.ebi.ac.uk) request from 10.7.28.34 (gss03a).
Expelling: 10.7.35.5 (ebi5-132 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:43:24.327 2014: Expel 10.7.28.34 (gss03a)
request from 10.7.37.99 (ebi5-226 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.37.99 (ebi5-226 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 12:44:54.202 2014: Expel 10.7.28.67 (gss02a)
request from 172.16.4.165 (ebi3-165 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.165 (ebi3-165 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 13:15:54.450 2014: Expel 10.7.28.34 (gss03a)
request from 10.7.37.89 (ebi5-216 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.37.89 (ebi5-216 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 13:20:16.524 2014: Expel 10.7.28.3 (gss01b)
request from 172.16.4.55 (ebi3-055 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.55 (ebi3-055 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 13:26:54.177 2014: Expel 10.7.28.34 (gss03a)
request from 10.7.34.64 (ebi5-063 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.34.64 (ebi5-063 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 13:27:53.900 2014: Expel 10.7.28.3 (gss01b)
request from 10.7.35.15 (ebi5-142 in ebi-cluster.ebi.ac.uk).
Expelling: 10.7.35.15 (ebi5-142 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 13:28:24.297 2014: Expel 10.7.28.67 (gss02a)
request from 172.16.4.50 (ebi3-050 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.50 (ebi3-050 in ebi-cluster.ebi.ac.uk)</tt><br>
<tt>Fri Aug 15 13:29:23.913 2014: Expel 10.7.28.3 (gss01b)
request from 172.16.4.156 (ebi3-156 in ebi-cluster.ebi.ac.uk).
Expelling: 172.16.4.156 (ebi3-156 in ebi-cluster.ebi.ac.uk)</tt><br>
</blockquote>
</blockquote>
at the same time we experience also long waiters queue (1000+
lines). An example in case of massive writes ( dd ) :<br>
<tt><br>
</tt>
<blockquote>
<blockquote><tt>0x7F522E1EEF90 waiting 1.861233182 seconds,
NSDThread: on ThCond 0x7F5158019B08 (0x7F5158019B08)
(MsgRecordCondvar), reason 'RPC wait' for getData on node
10.7.37.101 <c0n362></tt><br>
<tt>0x7F522E1EC9B0 waiting 1.490567470 seconds, NSDThread: on
ThCond 0x7F50F4038BA8 (0x7F50F4038BA8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.45
<c0n271></tt><br>
<tt>0x7F522E1EB6C0 waiting 1.077098046 seconds, NSDThread: on
ThCond 0x7F50B40011F8 (0x7F50B40011F8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 172.16.4.156
<c0n647></tt><br>
<tt>0x7F522E1EA3D0 waiting 7.714968554 seconds, NSDThread: on
ThCond 0x7F50BC0078B8 (0x7F50BC0078B8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.107
<c0n455></tt><br>
<tt>0x7F522E1E90E0 waiting 4.774379417 seconds, NSDThread: on
ThCond 0x7F506801B1F8 (0x7F506801B1F8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.23
<c0n493></tt><br>
<tt>0x7F522E1E7DF0 waiting 0.746172444 seconds, NSDThread: on
ThCond 0x7F5094007D78 (0x7F5094007D78) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.84 <c0n18></tt><br>
<tt>0x7F522E1E6B00 waiting 1.553030487 seconds, NSDThread: on
ThCond 0x7F51C0004C78 (0x7F51C0004C78) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.63
<c0n153></tt><br>
<tt>0x7F522E1E5810 waiting 2.165307633 seconds, NSDThread: on
ThCond 0x7F5178016A08 (0x7F5178016A08) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.29
<c0n686></tt><br>
<tt>0x7F522E1E4520 waiting 1.128089273 seconds, NSDThread: on
ThCond 0x7F5074004D98 (0x7F5074004D98) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.61
<c0n300></tt><br>
<tt>0x7F522E1E3230 waiting 2.515214328 seconds, NSDThread: on
ThCond 0x7F51F400EF08 (0x7F51F400EF08) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.37.90
<c0n494></tt><br>
<tt>0x7F522E1E1F40 waiting<b> 162.966840834</b> seconds,
NSDThread: on ThCond 0x7F51840207A8 (0x7F51840207A8)
(MsgRecordCondvar), reason 'RPC wait' for getData on node
10.7.34.97 <c0n6></tt><br>
<tt>0x7F522E1E0C50 waiting 1.140787288 seconds, NSDThread: on
ThCond 0x7F51AC005C08 (0x7F51AC005C08) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.37.94
<c0n664></tt><br>
<tt>0x7F522E1DF960 waiting 41.907415248 seconds, NSDThread: on
ThCond 0x7F5160019038 (0x7F5160019038) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 172.16.4.143
<c0n503></tt><br>
<tt>0x7F522E1DE670 waiting 0.466560418 seconds, NSDThread: on
ThCond 0x7F513802B258 (0x7F513802B258) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 172.16.4.168
<c0n598></tt><br>
<tt>0x7F522E1DD380 waiting 3.102803621 seconds, NSDThread: on
ThCond 0x7F516C0106C8 (0x7F516C0106C8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.37.91
<c0n143></tt><br>
<tt>0x7F522E1DC090 waiting 2.751614295 seconds, NSDThread: on
ThCond 0x7F504C0011F8 (0x7F504C0011F8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.35.25
<c0n651></tt><br>
<tt>0x7F522E1DADA0 waiting 5.083691891 seconds, NSDThread: on
ThCond 0x7F507401BE88 (0x7F507401BE88) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.61
<c0n300></tt><br>
<tt>0x7F522E1D9AB0 waiting 2.263374184 seconds, NSDThread: on
ThCond 0x7F5080003B98 (0x7F5080003B98) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.35.36
<c0n225></tt><br>
<tt>0x7F522E1D87C0 waiting 0.206989639 seconds, NSDThread: on
ThCond 0x7F505801F0D8 (0x7F505801F0D8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.55
<c0n498></tt><br>
<tt>0x7F522E1D74D0 waiting <b>41.841279897</b> seconds,
NSDThread: on ThCond 0x7F5194008B88 (0x7F5194008B88)
(MsgRecordCondvar), reason 'RPC wait' for getData on node
172.16.4.143 <c0n503></tt><br>
<tt>0x7F522E1D61E0 waiting 5.618652361 seconds, NSDThread: on
ThCond 0x1BAB868 (0x1BAB868) (MsgRecordCondvar), reason 'RPC
wait' for getData on node 10.7.35.59 <c0n532></tt><br>
<tt>0x7F522E1D4EF0 waiting 6.185658427 seconds, NSDThread: on
ThCond 0x7F513802AAE8 (0x7F513802AAE8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.35.6 <c0n330></tt><br>
<tt>0x7F522E1D3C00 waiting 2.652370892 seconds, NSDThread: on
ThCond 0x7F5130004C78 (0x7F5130004C78) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.34.45
<c0n271></tt><br>
<tt>0x7F522E1D2910 waiting 11.396142225 seconds, NSDThread: on
ThCond 0x7F51A401C0C8 (0x7F51A401C0C8) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 172.16.4.169
<c0n549></tt><br>
<tt>0x7F522E1D1620 waiting 63.710723043 seconds, NSDThread: on
ThCond 0x7F5038004D08 (0x7F5038004D08) (MsgRecordCondvar),
reason 'RPC wait' for getData on node 10.7.37.120 <c0n8><br>
<br>
<br>
</tt></blockquote>
</blockquote>
or for massive reads:<tt><br>
</tt>
<blockquote>
<blockquote><tt>0x7FBCE69A8C20 waiting 29.262629530 seconds,
NSDThread: on ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE699CEC0 waiting 29.260869141 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE698C5A0 waiting 29.124824888 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6984110 waiting 22.729479654 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE69512C0 waiting 29.272805926 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE69409A0 waiting 28.833650198 seconds, NSDThread: on
ThCond 0x18033B74D48 (0xFFFFC90033B74D48) (LeaseWaitCondvar),
reason 'Waiting to acquire disklease'</tt><br>
<tt>0x7FBCE6924320 waiting 29.237067128 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6921D40 waiting 29.237953228 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6915FE0 waiting 29.046721161 seconds, NSDThread: on
ThCond 0x18033B74D48 (0xFFFFC90033B74D48) (LeaseWaitCondvar),
reason 'Waiting to acquire disklease'</tt><br>
<tt>0x7FBCE6913A00 waiting 29.264534710 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6900B00 waiting 29.267691105 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE68F7380 waiting 29.266402464 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE68D2870 waiting 29.276298231 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE68BADB0 waiting 28.665700576 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE68B61F0 waiting 29.236878611 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6885980 waiting <b>144</b>.530487248 seconds,
NSDThread: on ThMutex 0x1803396A670 (0xFFFFC9003396A670)
(DiskSchedulingMutex)</tt><br>
<tt>0x7FBCE68833A0 waiting 29.231066610 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE68820B0 waiting 29.269954514 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE686A5F0 waiting <b>140</b>.662994256 seconds,
NSDThread: on ThMutex 0x180339A3140 (0xFFFFC900339A3140)
(DiskSchedulingMutex)</tt><br>
<tt>0x7FBCE6864740 waiting 29.254180742 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE683FC30 waiting 29.271840565 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE682E020 waiting 29.200969209 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6825B90 waiting 19.136732919 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6805C40 waiting 29.236055550 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67FEAA0 waiting 29.283264161 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67FC4C0 waiting 29.268992663 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67DFE40 waiting 29.150900786 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67D2DF0 waiting 29.199058463 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67D1B00 waiting 29.203199738 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67768D0 waiting 29.208231742 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6768590 waiting 5.228192589 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE67672A0 waiting 29.252839376 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6757C70 waiting 28.869359044 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6748640 waiting 29.289284179 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6734450 waiting 29.253591817 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6730B80 waiting 29.289987273 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6720260 waiting 26.597589551 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE66F32C0 waiting 29.177692849 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE66E3C90 waiting 29.160268518 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE66CC1D0 waiting 5.334330188 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE66B3420 waiting 34.274433161 seconds, NSDThread: on
ThMutex 0x180339A3140 (0xFFFFC900339A3140)
(DiskSchedulingMutex)</tt><br>
<tt>0x7FBCE668E910 waiting 27.699999488 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6689D50 waiting 34.279090465 seconds, NSDThread: on
ThMutex 0x180339A3140 (0xFFFFC900339A3140)
(DiskSchedulingMutex)</tt><br>
<tt>0x7FBCE66805D0 waiting 24.688626241 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE6675B60 waiting 35.367745840 seconds, NSDThread: on
ThCond 0x18033B74D48 (0xFFFFC90033B74D48) (LeaseWaitCondvar),
reason 'Waiting to acquire disklease'</tt><br>
<tt>0x7FBCE665E0A0 waiting 29.235994598 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
<tt>0x7FBCE663CE60 waiting 29.162911979 seconds, NSDThread: on
ThCond 0x7FBBF0045D40 (0x7FBBF0045D40)
(VdiskLogAppendCondvar), reason 'wait for permission to append
to log'</tt><br>
</blockquote>
</blockquote>
<br>
Another example with mmfsadm in case of massive reads:<br>
<blockquote>
<blockquote><tt>[root@gss02b ~]# mmfsadm dump waiters</tt><br>
<tt>0x7F519000AEA0 waiting 28.915010347 seconds,
replyCleanupThread: on ThCond 0x7F51101B27B8 (0x7F51101B27B8)
(MsgRecordCondvar), reason 'RPC wait'</tt><br>
<tt>0x7F511C012A10 waiting 279.522206863 seconds, Msg handler
commMsgCheckMessages: on ThCond 0x7F52000095F8
(0x7F52000095F8) (InuseCondvar), reason 'waiting for exclusive
use of connection for sending msg'</tt><br>
<tt>0x7F5120000B80 waiting 279.524782437 seconds, Msg handler
commMsgCheckMessages: on ThCond 0x7F5214000EE8
(0x7F5214000EE8) (InuseCondvar), reason 'waiting for exclusive
use of connection for sending msg'</tt><br>
<tt>0x7F5154006310 waiting 138.164386224 seconds, Msg handler
commMsgCheckMessages: on ThCond 0x7F5174003F08
(0x7F5174003F08) (InuseCondvar), reason 'waiting for exclusive
use of connection for sending msg'</tt><br>
<tt>0x7F522E1EB6C0 waiting 23.060703000 seconds, NSDThread: for
poll on sock 85</tt><br>
<tt>0x7F522E1E6B00 waiting 0.068456104 seconds, NSDThread: on
ThCond 0x7F50CC00E478 (0x7F50CC00E478) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E1D0330 waiting 17.207907857 seconds, NSDThread: on
ThCond 0x7F5078001688 (0x7F5078001688) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E1BFA10 waiting 0.181011711 seconds, NSDThread: on
ThCond 0x7F504000E558 (0x7F504000E558) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E1B4FA0 waiting 0.021780338 seconds, NSDThread: on
ThCond 0x7F522000E488 (0x7F522000E488) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E1B3CB0 waiting 0.794718000 seconds, NSDThread: for
poll on sock 799</tt><br>
<tt>0x7F522E186D10 waiting 0.191606803 seconds, NSDThread: on
ThCond 0x7F5184015D58 (0x7F5184015D58) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E184730 waiting 0.025562000 seconds, NSDThread: for
poll on sock 867</tt><br>
<tt>0x7F522E12CDD0 waiting 0.008921000 seconds, NSDThread: for
poll on sock 543</tt><br>
<tt>0x7F522E126F20 waiting 1.459531000 seconds, NSDThread: for
poll on sock 983</tt><br>
<tt>0x7F522E10F460 waiting 17.177936972 seconds, NSDThread: on
ThCond 0x7F51EC002CE8 (0x7F51EC002CE8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E101120 waiting 17.232580316 seconds, NSDThread: on
ThCond 0x7F51BC005BB8 (0x7F51BC005BB8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E0F1AF0 waiting 438.556030000 seconds, NSDThread: for
poll on sock 496</tt><br>
<tt>0x7F522E0E7080 waiting 393.702839774 seconds, NSDThread: on
ThCond 0x7F5164013668 (0x7F5164013668) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E09DA60 waiting 52.746984660 seconds, NSDThread: on
ThCond 0x7F506C008858 (0x7F506C008858) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E084CB0 waiting 23.096688206 seconds, NSDThread: on
ThCond 0x7F521C008E18 (0x7F521C008E18) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E0839C0 waiting 0.093456000 seconds, NSDThread: for
poll on sock 962</tt><br>
<tt>0x7F522E076970 waiting 2.236659731 seconds, NSDThread: on
ThCond 0x7F51E0027538 (0x7F51E0027538) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E044E10 waiting 52.752497765 seconds, NSDThread: on
ThCond 0x7F513802BDD8 (0x7F513802BDD8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E033200 waiting 16.157355796 seconds, NSDThread: on
ThCond 0x7F5104240D58 (0x7F5104240D58) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E02AD70 waiting 436.025203220 seconds, NSDThread: on
ThCond 0x7F50E0016C28 (0x7F50E0016C28) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522E01A450 waiting 393.673252777 seconds, NSDThread: on
ThCond 0x7F50A8009C18 (0x7F50A8009C18) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DFE0460 waiting 1.781358358 seconds, NSDThread: on
ThCond 0x7F51E0027638 (0x7F51E0027638) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF99420 waiting 0.038405427 seconds, NSDThread: on
ThCond 0x7F50F0172B18 (0x7F50F0172B18) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF7CDA0 waiting 438.204625355 seconds, NSDThread: on
ThCond 0x7F50900023D8 (0x7F50900023D8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF76EF0 waiting 435.903645734 seconds, NSDThread: on
ThCond 0x7F5084004BC8 (0x7F5084004BC8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF74910 waiting 21.749325022 seconds, NSDThread: on
ThCond 0x7F507C011F48 (0x7F507C011F48) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF71040 waiting 1.027274000 seconds, NSDThread: for
poll on sock 866</tt><br>
<tt>0x7F522DF536D0 waiting 52.953847324 seconds, NSDThread: on
ThCond 0x7F5200006FF8 (0x7F5200006FF8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF510F0 waiting 0.039278000 seconds, NSDThread: for
poll on sock 837</tt><br>
<tt>0x7F522DF4EB10 waiting 0.085745937 seconds, NSDThread: on
ThCond 0x7F51F0006828 (0x7F51F0006828) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF4C530 waiting 21.850733000 seconds, NSDThread: for
poll on sock 986</tt><br>
<tt>0x7F522DF4B240 waiting 0.054739884 seconds, NSDThread: on
ThCond 0x7F51EC0168D8 (0x7F51EC0168D8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF48C60 waiting 0.186409714 seconds, NSDThread: on
ThCond 0x7F51E4000908 (0x7F51E4000908) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF41AC0 waiting 438.942861290 seconds, NSDThread: on
ThCond 0x7F51CC010168 (0x7F51CC010168) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF3F4E0 waiting 0.060235106 seconds, NSDThread: on
ThCond 0x7F51C400A438 (0x7F51C400A438) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF22E60 waiting 0.361288000 seconds, NSDThread: for
poll on sock 518</tt><br>
<tt>0x7F522DF21B70 waiting 0.060722464 seconds, NSDThread: on
ThCond 0x7F51580162D8 (0x7F51580162D8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DF12540 waiting 23.077564448 seconds, NSDThread: on
ThCond 0x7F512C13E1E8 (0x7F512C13E1E8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DEFD060 waiting 0.723370000 seconds, NSDThread: for
poll on sock 503</tt><br>
<tt>0x7F522DEE09E0 waiting 1.565799175 seconds, NSDThread: on
ThCond 0x7F5084004D58 (0x7F5084004D58) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DEDF6F0 waiting 22.063017342 seconds, NSDThread: on
ThCond 0x7F5078003E08 (0x7F5078003E08) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DEDD110 waiting 0.049108780 seconds, NSDThread: on
ThCond 0x7F5070001D78 (0x7F5070001D78) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DEDAB30 waiting 229.603224376 seconds, NSDThread: on
ThCond 0x7F50680221B8 (0x7F50680221B8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DED7260 waiting 0.071855457 seconds, NSDThread: on
ThCond 0x7F506400A5A8 (0x7F506400A5A8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DED5F70 waiting 0.648324000 seconds, NSDThread: for
poll on sock 766</tt><br>
<tt>0x7F522DEC3070 waiting 1.809205756 seconds, NSDThread: on
ThCond 0x7F522000E518 (0x7F522000E518) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DEB1460 waiting 436.017396645 seconds, NSDThread: on
ThCond 0x7F51E4000978 (0x7F51E4000978) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DEAC8A0 waiting 393.734102000 seconds, NSDThread: for
poll on sock 609</tt><br>
<tt>0x7F522DEA3120 waiting 17.960778837 seconds, NSDThread: on
ThCond 0x7F51B4001708 (0x7F51B4001708) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DE86AA0 waiting 23.112060045 seconds, NSDThread: on
ThCond 0x7F5154096118 (0x7F5154096118) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DE64570 waiting 0.076167410 seconds, NSDThread: on
ThCond 0x7F50D8005EF8 (0x7F50D8005EF8) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DE1AF50 waiting 17.460836000 seconds, NSDThread: for
poll on sock 737</tt><br>
<tt>0x7F522DE104E0 waiting 0.205037000 seconds, NSDThread: for
poll on sock 865</tt><br>
<tt>0x7F522DDB8B80 waiting 0.106192000 seconds, NSDThread: for
poll on sock 78</tt><br>
<tt>0x7F522DDA36A0 waiting 0.738921180 seconds, NSDThread: on
ThCond 0x7F505400E048 (0x7F505400E048) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DD9C500 waiting 0.731118367 seconds, NSDThread: on
ThCond 0x7F503C00B518 (0x7F503C00B518) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
<tt>0x7F522DD89600 waiting 229.609363000 seconds, NSDThread: for
poll on sock 515</tt><br>
<tt>0x7F522DD567B0 waiting 1.508489195 seconds, NSDThread: on
ThCond 0x7F514C021F88 (0x7F514C021F88) (InuseCondvar), reason
'waiting for exclusive use of connection for sending msg'</tt><br>
</blockquote>
</blockquote>
<br>
<br>
Another thing worth to mention is that the filesystem its totaly
unresponsive. Even a simple "cd" to a directory or an ls to a
directory just hangs for several minutes ( litterally). This happens
also if i try from the NSD servers.<br>
<br>
<br>
<b>Few things i have looked into:</b><br>
* Our network seems fine, there might be some bottleneck on part of
them, and this could explain the waiters, but doesnt explain why ad
some poit those client ask to expel the NSD servers. THis also
doesn't justify why the FS is slow even on NSD itself.<br>
<br>
* Disk bottleneck? i dont think so. NSD servers have cpu usage (and
io wait ) very low. Also mmdiag --iohist seems condirming that the
operation on the disks are reasonable fast:<br>
<br>
<br>
<blockquote>
<blockquote>
<tt>=== mmdiag: iohist ===</tt><br>
<br>
<tt>I/O history:</tt><br>
<br>
<tt> I/O start time RW Buf type disk:sectorNum nSec time
ms Type Device/NSD ID NSD server</tt><br>
<tt>--------------- -- ----------- ----------------- -----
------- ---- ------------------ ---------------</tt><br>
<tt>13:54:29.209276 W data 34:5066338808 2056
88.307 lcl sdtu</tt><br>
<tt>13:54:29.209277 W data 55:5095698936 2056
27.592 lcl sdaab</tt><br>
<tt>13:54:29.209278 W data 171:5104087544 2056
22.801 lcl sdtg</tt><br>
<tt>13:54:29.209279 W data 116:5011812856 2056
65.983 lcl sdqr</tt><br>
<tt>13:54:29.209280 W data 98:4860817912 2056
17.892 lcl sddl</tt><br>
<tt>13:54:29.209281 W data 159:4999229944 2056
21.324 lcl sdjg</tt><br>
<tt>13:54:29.209282 W data 84:5049561592 2056
31.932 lcl sdqz</tt><br>
<tt>13:54:29.209283 W data 8:5003424248 2056
30.912 lcl sdcw</tt><br>
<tt>13:54:29.209284 W data 23:4965675512 2056
27.366 lcl sdpt</tt><br>
<tt>13:54:29.297715 W vdiskMDLog 2:144008496 1
0.236 lcl sdkr</tt><br>
<tt>13:54:29.297717 W vdiskMDLog 0:331703600 1
0.230 lcl sdcm</tt><br>
<tt>13:54:29.297718 W vdiskMDLog 1:273769776 1
0.241 lcl sdbp</tt><br>
<tt>13:54:29.244902 W data 51:3857589752 2056
35.566 lcl sdyi</tt><br>
<tt>13:54:29.244904 W data 10:3773703672 2056
28.512 lcl sdma</tt><br>
<tt>13:54:29.244905 W data 48:3639485944 2056
24.124 lcl sdel</tt><br>
<tt>13:54:29.244906 W data 25:3777897976 2056
18.691 lcl sdgt</tt><br>
<tt>13:54:29.244908 W data 91:3832423928 2056
20.699 lcl sdlc</tt><br>
<tt>13:54:29.244909 W data 115:3723372024 2056
30.783 lcl sdho</tt><br>
<tt>13:54:29.244910 W data 173:3882755576 2056
53.241 lcl sdti</tt><br>
<tt>13:54:29.244911 W data 42:3782092280 2056
22.785 lcl sddz</tt><br>
<tt>13:54:29.244912 W data 45:3647874552 2056
24.289 lcl sdei</tt><br>
<tt>13:54:29.244913 W data 32:3652068856 2056
17.220 lcl sdbn</tt><br>
<tt>13:54:29.244914 W data 39:3677234680 2056
26.017 lcl sddw</tt><br>
<tt>13:54:29.298273 W vdiskMDLog 2:144008497 1
2.522 lcl sduf</tt><br>
<tt>13:54:29.298274 W vdiskMDLog 0:331703601 1
1.025 lcl sdlo</tt><br>
<tt>13:54:29.298275 W vdiskMDLog 1:273769777 1
2.586 lcl sdtt</tt><br>
<tt>13:54:29.288275 W data 27:2249588200 2056
20.071 lcl sdhb</tt><br>
<tt>13:54:29.288279 W data 33:2224422376 2056
19.682 lcl sdts</tt><br>
<tt>13:54:29.288281 W data 47:2115370472 2056
21.667 lcl sdwo</tt><br>
<tt>13:54:29.288282 W data 82:2316697064 2056
21.524 lcl sdxy</tt><br>
<tt>13:54:29.288283 W data 85:2232810984 2056
17.467 lcl sdra</tt><br>
<tt>13:54:29.288285 W data 30:2127953384 2056
18.475 lcl sdqg</tt><br>
<tt>13:54:29.288286 W data 67:1876295144 2056
16.383 lcl sdmx</tt><br>
<tt>13:54:29.288287 W data 64:2127953384 2056
21.908 lcl sduh</tt><br>
<tt>13:54:29.288288 W data 38:2253782504 2056
19.775 lcl sddv</tt><br>
<tt>13:54:29.288290 W data 15:2207645160 2056
20.599 lcl sdet</tt><br>
<tt>13:54:29.288291 W data 157:2283142632 2056
21.198 lcl sdiy</tt><br>
</blockquote>
</blockquote>
<br>
Bonding problem on the interfaces? Mellanox ( interface card
prodicer) drivers and firmware updated, and we even tested the
system with a single link ( without bonding).<br>
<br>
<br>
Could someone help me with this? in particular:<br>
<br>
* What exactly are client are looking to decide that another node is
unresponsive? Ping? i dont think so because both NSD servers and
clients can be pinged, so what they look? if comeone can also
specify what port are they using i can try to tcpdump what exactly
is cauding this expell.<br>
<br>
* How can i monitor metadata operations to understand where EXACTLY
is the bottleneck that causes this:<br>
<br>
<blockquote>
<blockquote><tt>[sdinardo@ebi5-001 ~]$ time ls
/gpfs/nobackup/sdinardo</tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt>1 ebi3-054.ebi.ac.uk
ebi3-154 ebi5-019.ebi.ac.uk ebi5-052
ebi5-101 ebi5-156 ebi5-197
ebi5-228 ebi5-262.ebi.ac.uk</tt><tt><br>
</tt><tt>10 ebi3-055
ebi3-155 ebi5-021.ebi.ac.uk ebi5-053
ebi5-104.ebi.ac.uk ebi5-160.ebi.ac.uk ebi5-198
ebi5-229 ebi5-263</tt><tt><br>
</tt><tt>2 ebi3-056.ebi.ac.uk
ebi3-156 ebi5-022 ebi5-054.ebi.ac.uk
ebi5-106 ebi5-161 ebi5-200
ebi5-230.ebi.ac.uk ebi5-264</tt><tt><br>
</tt><tt>3 ebi3-057
ebi3-157 ebi5-023 ebi5-056
ebi5-109 ebi5-162.ebi.ac.uk ebi5-201
ebi5-231.ebi.ac.uk ebi5-265</tt><tt><br>
</tt><tt>4 ebi3-058
ebi3-158.ebi.ac.uk ebi5-024.ebi.ac.uk ebi5-057
ebi5-110.ebi.ac.uk ebi5-163.ebi.ac.uk ebi5-202.ebi.ac.uk
ebi5-232 ebi5-266.ebi.ac.uk</tt><tt><br>
</tt><tt>5 ebi3-059.ebi.ac.uk
ebi3-160 ebi5-025 ebi5-060
ebi5-111.ebi.ac.uk ebi5-164 ebi5-204
ebi5-233 ebi5-267</tt><tt><br>
</tt><tt>6 ebi3-132
ebi3-161.ebi.ac.uk ebi5-026 ebi5-061.ebi.ac.uk
ebi5-112.ebi.ac.uk ebi5-165 ebi5-205
ebi5-234 ebi5-269.ebi.ac.uk</tt><tt><br>
</tt><tt>7 ebi3-133
ebi3-163.ebi.ac.uk ebi5-028 ebi5-062.ebi.ac.uk
ebi5-129.ebi.ac.uk ebi5-166 ebi5-206.ebi.ac.uk
ebi5-236 ebi5-270</tt><tt><br>
</tt><tt>8 ebi3-134
ebi3-165 ebi5-030 ebi5-064
ebi5-131.ebi.ac.uk ebi5-169.ebi.ac.uk ebi5-207
ebi5-237 ebi5-271</tt><tt><br>
</tt><tt>9 ebi3-135
ebi3-166.ebi.ac.uk ebi5-031 ebi5-065
ebi5-132 ebi5-170.ebi.ac.uk ebi5-209
ebi5-239.ebi.ac.uk launcher.sh</tt><tt><br>
</tt><tt><br>
</tt><font color="#cc0000"><u><b><tt>real 21m14.948s</tt></b></u></font><tt>
( WTH ?!?!?!)<br>
</tt><tt>user 0m0.004s</tt><tt><br>
</tt><tt>sys 0m0.014s</tt><tt><br>
</tt></blockquote>
</blockquote>
<br>
<br>
I know that the question are not easy to answer, and i need to dig
more, but could be very helpful if someone give me some hints about
where to look at. My gpfs skills are limited since this is our first
system and is in production for just few months, and the things
stated to worsen just recenlty. In past we could get over 200Gb/s (
both read and write) without any issue. Now some clients get
expelled even when data thoughuput is ad 4-5Gb/s.<br>
<br>
Thanks in advance for any help.<br>
<br>
Regards,<br>
Salvatore<br>
</body>
</html>