<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Hi,</p>
<p>Using ib_read_bw and ib_write_bw, we found that some links
between the server and clients were degraded, with a bandwidth of
only 350MB/s.<br>
</p>
<p>Strangely, nsdperf did not report the same. It reported 12GB/s
write and 9GB/s read, which was much more than we could actually
achieve.</p>
<p>So the problem was some bad IB routing. After we changed some IB
links, we also got 12GB/s read with nsdperf.</p>
<p>On our clients we are now able to achieve the 7.2GB/s in total
that we also saw using the NSD servers!</p>
<p>Many thanks for the help !!</p>
<p>We are now running some tests with different block sizes and
parameters, because our backend storage is able to do more than
the 7.2GB/s we get with GPFS (more like 14GB/s in total). I guess
prefetchThreads and the NSD worker thread settings are the ones to
look at?<br>
</p>
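<p>As a starting sketch of what we plan to try (parameter names taken
from the GPFS docs; the values are placeholders for experimentation,
not recommendations):</p>
<pre># show the current settings
mmlsconfig prefetchThreads nsdMaxWorkerThreads maxMBpS

# try different values on the NSD servers (most of these need a GPFS
# restart on the affected nodes before they take effect)
mmchconfig prefetchThreads=128,nsdMaxWorkerThreads=512 -N nsd00,nsd02
mmchconfig maxMBpS=16000 -N nsd00,nsd02</pre>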
<p>Cheers!</p>
<p>Kenneth<br>
</p>
<div class="moz-cite-prefix">On 21/04/17 22:27, Kumaran Rajaram
wrote:<br>
</div>
<blockquote
cite="mid:OFFBA4CE9D.BAA6A93D-ON00258109.006C6C87-85258109.00706A3A@notes.na.collabserv.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<font face="sans-serif" size="2">Hi Kenneth,</font><br>
<br>
<font face="sans-serif" size="2">As it was mentioned earlier, it
will
be good to first verify the raw network performance between the
NSD client
and NSD server using the nsdperf tool that is built with RDMA
support.</font><br>
<font face="Courier New" size="2">g++ -O2 -DRDMA -o nsdperf
-lpthread
-lrt -libverbs -lrdmacm nsdperf.C</font><br>
<br>
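<font face="sans-serif" size="2">Once built, a basic run looks roughly
like the following (a sketch only; "nsd00" and "client01" stand in for
your actual NSD server and client):</font><br>
<pre># start the nsdperf daemon on every node taking part in the test
./nsdperf -s

# from a control node, drive the test interactively
./nsdperf
  server nsd00
  client client01
  rdma on
  test write read
  quit</pre>
<br>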
<font face="sans-serif" size="2">In addition, since you have 2 x
NSD
servers it will be good to perform NSD client file-system
performance
test with just single NSD server (mmshutdown the other server,
assuming
all the NSDs have primary, server NSD server configured + Quorum
will be
intact when a NSD server is brought down) to see if it helps to
improve
the read performance + if there are variations in the
file-system read
bandwidth results between NSD_server#1 'active' vs. NSD_server
#2 'active'
(with other NSD server in GPFS "down" state). If there is
significant
variation, it can help to isolate the issue to particular NSD
server (HW
or IB issue?).</font><br>
<br>
<font face="sans-serif" size="2">You can issue "mmdiag --waiters"
on NSD client as well as NSD servers during your dd test, to
verify if
there are unsual long GPFS waiters. In addition, you may issue
Linux "perf
top -z" command on the GPFS node to see if there is high
CPU usage by any particular call/event (for e.g., If GPFS config
parameter
verbsRdmaMaxSendBytes has been set to low value from the
default
16M, then it can cause RDMA completion threads to go CPU bound
). Please
verify some performance scenarios detailed in Chapter 22 in
Spectrum Scale
Problem Determination Guide (link below).</font><br>
<br>
<a moz-do-not-send="true"
href="https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/pdf/scale_pdg.pdf?view=kc"><font
face="sans-serif" color="blue" size="2">https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/pdf/scale_pdg.pdf?view=kc</font></a><br>
<br>
<font face="sans-serif" size="2">Thanks,</font><br>
<font face="sans-serif" size="2">-Kums </font><br>
<br>
<br>
<br>
<br>
<br>
<font face="sans-serif" color="#5f5f5f" size="1">From:
</font><font face="sans-serif" size="1">Kenneth Waegeman
<a class="moz-txt-link-rfc2396E" href="mailto:kenneth.waegeman@ugent.be"><kenneth.waegeman@ugent.be></a></font><br>
<font face="sans-serif" color="#5f5f5f" size="1">To:
</font><font face="sans-serif" size="1">gpfsug main discussion
list <a class="moz-txt-link-rfc2396E" href="mailto:gpfsug-discuss@spectrumscale.org"><gpfsug-discuss@spectrumscale.org></a></font><br>
<font face="sans-serif" color="#5f5f5f" size="1">Date:
</font><font face="sans-serif" size="1">04/21/2017 11:43 AM</font><br>
<font face="sans-serif" color="#5f5f5f" size="1">Subject:
</font><font face="sans-serif" size="1">Re: [gpfsug-discuss]
bizarre performance behavior</font><br>
<font face="sans-serif" color="#5f5f5f" size="1">Sent by:
</font><font face="sans-serif" size="1"><a class="moz-txt-link-abbreviated" href="mailto:gpfsug-discuss-bounces@spectrumscale.org">gpfsug-discuss-bounces@spectrumscale.org</a></font><br>
<hr noshade="noshade"><br>
<br>
<br>
<font size="3">Hi,</font>
<p><font size="3">We already verified this on our nsds:</font></p>
<p><font size="3">[root@nsd00 ~]# /opt/dell/toolkit/bin/syscfg
--QpiSpeed<br>
QpiSpeed=maxdatarate<br>
[root@nsd00 ~]# /opt/dell/toolkit/bin/syscfg --turbomode<br>
turbomode=enable<br>
[root@nsd00 ~]# /opt/dell/toolkit/bin/syscfg --SysProfile<br>
SysProfile=perfoptimized</font></p>
<p><font size="3">so sadly this is not the issue.</font></p>
<p><font size="3">Also the output of the verbs commands look ok,
there are
connections from the client to the nsds are there is data
being read and
writen.</font></p>
<p><font size="3">Thanks again! </font></p>
<p><font size="3">Kenneth</font></p>
<p><br>
<font size="3">On 21/04/17 16:01, Kumaran Rajaram wrote:</font><br>
<font face="sans-serif" size="2">Hi,</font><font size="3"><br>
</font><font face="sans-serif" size="2"><br>
Try enabling the following in the BIOS of the NSD servers
(screen shots
below) </font></p>
<ul>
<li><font face="sans-serif" size="2">Turbo Mode - Enable</font></li>
<li><font face="sans-serif" size="2">QPI Link Frequency - Max
Performance</font></li>
<li><font face="sans-serif" size="2">Operating Mode - Maximum
Performance</font></li>
<li><font face="Arial" size="2">>>>>While we have
even better
performance with sequential reads on raw storage LUNS, using
GPFS we can
only reach 1GB/s in total (each nsd server seems limited by
0,5GB/s) independent
of the number of clients </font>
<p><font face="Arial" size="2">>>We are testing from 2
testing machines
connected to the nsds with infiniband, verbs enabled.</font></p>
</li>
</ul>
<font face="sans-serif" size="2"><br>
Also, It will be good to verify that all the GPFS nodes have
Verbs RDMA
started using "mmfsadm test verbs status" and that the NSD
client-server
communication from client to server during "dd" is actually
using
Verbs RDMA using "mmfsadm test verbs conn" command (on
NSD client doing dd). If not, then GPFS might be using TCP/IP
network over
which the cluster is configured impacting performance (If this
is the case,
GPFS mmfs.log.latest for any Verbs RDMA related errors and
resolve). </font>
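<br>
<font face="sans-serif" size="2">For example (a sketch; run these on
the NSD client while the dd is in progress):</font><br>
<pre># confirm Verbs RDMA is started on this node
mmfsadm test verbs status

# confirm the connections to the NSD servers use RDMA rather than TCP/IP
mmfsadm test verbs conn

# if not, look for RDMA-related errors in the GPFS log
grep -i verbs /var/adm/ras/mmfs.log.latest</pre>
<br>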
<img src="cid:part2.5CE865A6.BFE97FE3@ugent.be" style="border:0px
solid;" height="531" width="643"><font size="3"><br>
</font><img src="cid:part3.12F2B6D0.C7A4F1F1@ugent.be"
style="border:0px solid;" height="676" width="811"><font
size="3"><br>
</font><img src="cid:part4.9D411BB5.D6E0AF8B@ugent.be"
style="border:0px solid;" height="676" width="813"><font
size="3"><br>
</font><font face="sans-serif" size="2"><br>
Regards,<br>
-Kums</font><font size="3"><br>
<br>
<br>
<br>
<br>
<br>
</font><font face="sans-serif" color="#5f5f5f" size="1"><br>
From: </font><font face="sans-serif" size="1">"Knister,
Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" </font><a
moz-do-not-send="true" href="mailto:aaron.s.knister@nasa.gov"><font
face="sans-serif" color="blue" size="1"><u><a class="moz-txt-link-rfc2396E" href="mailto:aaron.s.knister@nasa.gov"><aaron.s.knister@nasa.gov></a></u></font></a><font
face="sans-serif" color="#5f5f5f" size="1"><br>
To: </font><font face="sans-serif" size="1">gpfsug
main discussion list </font><a moz-do-not-send="true"
href="mailto:gpfsug-discuss@spectrumscale.org"><font
face="sans-serif" color="blue" size="1"><u><gpfsug-discuss@spectrumscale.org></u></font></a><font
face="sans-serif" color="#5f5f5f" size="1"><br>
Date: </font><font face="sans-serif" size="1">04/21/2017
09:11 AM</font><font face="sans-serif" color="#5f5f5f" size="1"><br>
Subject: </font><font face="sans-serif" size="1">Re:
[gpfsug-discuss] bizarre performance behavior</font><font
face="sans-serif" color="#5f5f5f" size="1"><br>
Sent by: </font><a moz-do-not-send="true"
href="mailto:gpfsug-discuss-bounces@spectrumscale.org"><font
face="sans-serif" color="blue" size="1"><u>gpfsug-discuss-bounces@spectrumscale.org</u></font></a><font
size="3"><br>
</font>
<hr noshade="noshade"><font size="3"><br>
<br>
<br>
Fantastic news! It might also be worth running "cpupower
monitor"
or "turbostat" on your NSD servers while you're running dd tests
from the clients to see what CPU frequency your cores are
actually running
at. <br>
<br>
A typical NSD server workload (especially with IB verbs and for
reads) can be pretty light on CPU, which might not prompt your CPU
frequency governor to up the frequency (which can affect
throughput). If your frequency scaling governor isn't kicking up the
frequency of your CPUs, I've seen that cause this behavior in my
testing. <br>
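</font>
<p><font size="3">For example, something along these lines on each NSD
server while the client dd runs (a rough sketch; exact options vary by
distro and tool version):</font></p>
<pre>cpupower frequency-info   # current governor and frequency limits
cpupower monitor          # per-core C-state residency and average frequency
grep MHz /proc/cpuinfo    # quick per-core frequency snapshot</pre>
<font size="3">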
<br>
-Aaron<br>
<br>
<br>
<br>
<br>
On April 21, 2017 at 05:43:40 EDT, Kenneth Waegeman </font><a
moz-do-not-send="true" href="mailto:kenneth.waegeman@ugent.be"><font
color="blue" size="3"><u><a class="moz-txt-link-rfc2396E" href="mailto:kenneth.waegeman@ugent.be"><kenneth.waegeman@ugent.be></a></u></font></a><font
size="3">wrote: </font>
<p><font size="3">Hi, </font></p>
<p><font size="3">We are running a test setup with 2 NSD Servers
backed by
4 Dell Powervaults MD3460s. nsd00 is primary serving LUNS of
controller
A of the 4 powervaults, nsd02 is primary serving LUNS of
controller B.
</font></p>
<p><font size="3">We are testing from 2 testing machines connected
to the
nsds with infiniband, verbs enabled.</font></p>
<p><font size="3">When we do dd from the NSD servers, we see
indeed performance
going to 5.8GB/s for one nsd, 7.2GB/s for the two! So it looks
like GPFS
is able to get the data at a decent speed. Since we can write
from the
clients at a good speed, I didn't suspect the communication
between clients
and nsds being the issue, especially since total performance
stays the
same using 1 or multiple clients. <br>
<br>
I'll use the nsdperf tool to see if we can find anything, <br>
<br>
thanks!<br>
<br>
K<br>
<br>
On 20/04/17 17:04, Knister, Aaron S. (GSFC-606.2)[COMPUTER
SCIENCE CORP]
wrote:<br>
Interesting. Could you share a little more about your
architecture? Is
it possible to mount the fs on an NSD server and do some dd's
from the
fs on the NSD server? If that gives you decent performance
perhaps try
NSDPERF next </font><a moz-do-not-send="true"
href="https://www.ibm.com/developerworks/community/wikis/home?lang=en#%21/wiki/General+Parallel+File+System+%28GPFS%29/page/Testing+network+performance+with+nsdperf"><font
color="blue" size="3"><u>https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/Testing+network+performance+with+nsdperf</u></font></a><font
size="3"><br>
<br>
-Aaron<br>
<br>
<br>
<br>
<br>
On April 20, 2017 at 10:53:47 EDT, Kenneth Waegeman </font><a
moz-do-not-send="true" href="mailto:kenneth.waegeman@ugent.be"><font
color="blue" size="3"><u><a class="moz-txt-link-rfc2396E" href="mailto:kenneth.waegeman@ugent.be"><kenneth.waegeman@ugent.be></a></u></font></a><font
size="3">wrote:</font></p>
<p><font size="3">Hi,</font></p>
<p><font size="3">Having an issue that looks the same as this one:
</font></p>
<p><font size="3">We can do sequential writes to the filesystem at
7,8 GB/s
total , which is the expected speed for our current storage
<br>
backend. While we have even better performance with
sequential reads
on raw storage LUNS, using GPFS we can only reach 1GB/s in
total (each
nsd server seems limited by 0,5GB/s) independent of the number
of clients
<br>
(1,2,4,..) or ways we tested (fio,dd). We played with blockdev
params,
MaxMBps, PrefetchThreads, hyperthreading, c1e/cstates, .. as
discussed
in this thread, but nothing seems to impact this read
performance. </font></p>
<p><font size="3">Any ideas?</font></p>
<p><font size="3">Thanks!<br>
<br>
Kenneth<br>
<br>
On 17/02/17 19:29, Jan-Frode Myklebust wrote:<br>
I just had a similar experience with a SanDisk InfiniFlash system
SAS-attached to a single host. gpfsperf reported 3,2 GByte/s for
writes and 250-300 MByte/s on sequential reads!! Random reads were
on the order of 2 GByte/s.<br>
<br>
After a bit of head scratching and fumbling around I found out that
reducing maxMBpS from 10000 to 100 fixed the problem! Digging
further, I found that reducing prefetchThreads from the default of
72 to 32 also fixed it, while leaving maxMBpS at 10000. It can now
also read at 3,2 GByte/s.<br>
<br>
Could something like this be the problem on your box as well?<br>
<br>
<br>
<br>
-jf<br>
On Fri, 17 Feb 2017 at 18:13, Aaron Knister <</font><a
moz-do-not-send="true" href="mailto:aaron.s.knister@nasa.gov"><font
color="blue" size="3"><u><a class="moz-txt-link-abbreviated" href="mailto:aaron.s.knister@nasa.gov">aaron.s.knister@nasa.gov</a></u></font></a><font
size="3">>:<br>
Well, I'm somewhat scrounging for hardware. This is in our
test<br>
environment :) And yep, it's got the 2U gpu-tray in it
although even<br>
without the riser it has 2 PCIe slots onboard (excluding the
on-board<br>
dual-port mezz card) so I think it would make a fine NSD
server even<br>
without the riser.<br>
<br>
-Aaron<br>
<br>
On 2/17/17 11:43 AM, Simon Thompson (Research Computing - IT
Services)<br>
wrote:<br>
> Maybe it's related to interrupt handlers somehow? You drive the
load up on one socket, and push all the interrupt handling to the
other socket where the fabric card is attached?<br>
><br>
> Dunno ... (Though I am intrigued you use iDataPlex nodes as NSD
servers, I assume it's some 2U gpu-tray riser one or something!)<br>
><br>
> Simon<br>
> ________________________________________<br>
> From: </font><a moz-do-not-send="true"
href="mailto:gpfsug-discuss-bounces@spectrumscale.org"
target="_blank"><font color="blue" size="3"><u>gpfsug-discuss-bounces@spectrumscale.org</u></font></a><font
size="3">[</font><a moz-do-not-send="true"
href="mailto:gpfsug-discuss-bounces@spectrumscale.org"><font
color="blue" size="3"><u>gpfsug-discuss-bounces@spectrumscale.org</u></font></a><font
size="3">]
on behalf of Aaron Knister [</font><a moz-do-not-send="true"
href="mailto:aaron.s.knister@nasa.gov"><font color="blue"
size="3"><u>aaron.s.knister@nasa.gov</u></font></a><font
size="3">]<br>
> Sent: 17 February 2017 15:52<br>
> To: gpfsug main discussion list<br>
> Subject: [gpfsug-discuss] bizarre performance behavior<br>
><br>
> This is a good one. I've got an NSD server with 4x 16Gb fibre<br>
> connections coming in and 1x FDR10 and 1x QDR connection going out
to<br>
> the clients. I was having a really hard time getting
anything resembling<br>
> sensible performance out of it (4-5Gb/s writes but maybe
1.2Gb/s for<br>
> reads). The back-end is a DDN SFA12K and I *know* it can
do better
than<br>
> that.<br>
><br>
> I don't remember quite how I figured this out but simply
by running<br>
> "openssl speed -multi 16" on the nsd server to drive up
the load I saw<br>
> an almost 4x performance jump, which pretty much goes against
every<br>
> sysadmin fiber in me (i.e. "drive up the cpu load with
unrelated
crap to<br>
> quadruple your i/o performance").<br>
><br>
> This feels like some type of C-states frequency scaling
shenanigans
that<br>
> I haven't quite ironed down yet. I booted the box with
the following<br>
> kernel parameters "intel_idle.max_cstate=0
processor.max_cstate=0"
which<br>
> didn't seem to make much of a difference. I also tried
setting the<br>
> frequency governor to userspace and setting the minimum
frequency
to<br>
> 2.6ghz (it's a 2.6ghz cpu). None of that really matters--
I still
have<br>
> to run something to drive up the CPU load and then
performance improves.<br>
><br>
> I'm wondering if this could be an issue with the C1E
state? I'm curious<br>
> if anyone has seen anything like this. The node is a
dx360 M4<br>
> (Sandybridge) with 16 2.6GHz cores and 32GB of RAM.<br>
><br>
> -Aaron<br>
><br>
> --<br>
> Aaron Knister<br>
> NASA Center for Climate Simulation (Code 606.2)<br>
> Goddard Space Flight Center<br>
> (301) 286-2776<br>
> _______________________________________________<br>
> gpfsug-discuss mailing list<br>
> gpfsug-discuss at </font><a moz-do-not-send="true"
href="http://spectrumscale.org/" target="_blank"><font
color="blue" size="3"><u>spectrumscale.org</u></font></a><font
size="3"><br>
> </font><a moz-do-not-send="true"
href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"
target="_blank"><font color="blue" size="3"><u>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</u></font></a><font
size="3"><br>
><br>
<br>
--<br>
Aaron Knister<br>
NASA Center for Climate Simulation (Code 606.2)<br>
Goddard Space Flight Center<br>
(301) 286-2776<br>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at </font><a moz-do-not-send="true"
href="http://spectrumscale.org/" target="_blank"><font
color="blue" size="3"><u>spectrumscale.org</u></font></a><font
color="blue" size="3"><u><br>
</u></font><a moz-do-not-send="true"
href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"
target="_blank"><font color="blue" size="3"><u>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</u></font></a><font
size="3"><br>
<br>
</font><tt><font size="3"><br>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at spectrumscale.org</font></tt><font
color="blue" size="3"><u><br>
</u></font><a moz-do-not-send="true"
href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"><tt><font
color="blue" size="3"><u>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</u></font></tt></a><font
size="3"><br>
<br>
<br>
<br>
</font><tt><font size="3"><br>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at spectrumscale.org</font></tt><font
color="blue" size="3"><u><br>
</u></font><a moz-do-not-send="true"
href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"><tt><font
color="blue" size="3"><u>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</u></font></tt></a><font
size="3"><br>
</font><tt><font size="2"><br>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at spectrumscale.org</font></tt><font
color="blue" size="3"><u><br>
</u></font><a moz-do-not-send="true"
href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"><tt><font
color="blue" size="2"><u>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</u></font></tt></a><font
size="3"><br>
<br>
<br>
</font></p>
<br>
<pre wrap="">_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
<a class="moz-txt-link-freetext" href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a>
</pre>
</blockquote>
<br>
</body>
</html>