[gpfsug-discuss] mmnetverify

Bill Owen billowen at us.ibm.com
Fri Mar 31 19:23:35 BST 2017


mmnetverify is a new tool that aims to make it easier to identify network
problems.

Regarding bandwidth commands, in 4.2.2 there are two options (the
difference between them is sketched below):

   mmnetverify bandwidth-node - [1 to 1] communicates from the local node
   (or one or more nodes specified with the -N option) to one or more
   target nodes.  The bandwidth tests are executed serially from the nodes
   in the node list to the targets, iterating through each target node one
   by one.  The serially measured bandwidth to each target node is
   reported.

   mmnetverify bandwidth-cluster - [1 to many] measures parallel
   communication from the local node (or one or more nodes specified with
   the -N option) to all of the other nodes in the cluster.  The
   concurrent bandwidth to each target node in the cluster is reported.
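
To make the difference between the two shapes concrete, here is a rough
Python sketch (an illustration only, not the actual mmnetverify code; the
node names and the simulated per-node measurement are placeholders):

    # Rough sketch of the two test shapes.  Illustration only, not the actual
    # mmnetverify code: measure_bandwidth() just simulates a per-node result
    # (the real per-connection test is sketched further down), and the node
    # names are placeholders.
    import random
    import time
    from concurrent.futures import ThreadPoolExecutor

    def measure_bandwidth(target):
        """Stand-in for one bandwidth test against 'target'."""
        time.sleep(0.1)                      # pretend to transfer data
        return random.uniform(500, 1000)     # simulated MB/s

    def bandwidth_node_style(targets):
        """bandwidth-node shape: test each target serially, one by one."""
        return {t: measure_bandwidth(t) for t in targets}

    def bandwidth_cluster_style(targets):
        """bandwidth-cluster shape: test all targets concurrently."""
        with ThreadPoolExecutor(max_workers=len(targets)) as pool:
            results = list(pool.map(measure_bandwidth, targets))
        return dict(zip(targets, results))

    if __name__ == "__main__":
        nodes = ["node1", "node2", "node3"]      # placeholder node names
        print("serial (bandwidth-node):       ", bandwidth_node_style(nodes))
        print("concurrent (bandwidth-cluster):", bandwidth_cluster_style(nodes))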

In both of these tests, we establish a socket connection, send a fixed
number of bytes over it, and calculate bandwidth based on how long that
transmission took.
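
A minimal Python sketch of that idea, assuming a throw-away sink server on
localhost (again an illustration, not the actual mmnetverify code; the
port, byte count and chunk size are arbitrary placeholders):

    # Minimal sketch of the measurement idea: send a fixed number of bytes
    # over a TCP socket and divide by the elapsed time.  Illustration only,
    # not the actual mmnetverify code.
    import socket
    import threading
    import time

    PORT = 50321                      # arbitrary placeholder port
    TOTAL_BYTES = 64 * 1024 * 1024    # fixed amount of data to send
    CHUNK = 1 << 20                   # 1 MiB per send call

    def sink_server(port):
        """Accept one connection and discard everything it sends."""
        srv = socket.socket()
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        conn, _ = srv.accept()
        while conn.recv(CHUNK):
            pass
        conn.close()
        srv.close()

    def measure_bandwidth(host, port, total_bytes=TOTAL_BYTES):
        """Send total_bytes and report bytes per second of elapsed time."""
        buf = b"x" * CHUNK
        sent = 0
        s = socket.create_connection((host, port))
        start = time.time()
        while sent < total_bytes:
            s.sendall(buf)
            sent += len(buf)
        elapsed = time.time() - start
        # A real test would also wait for the receiver to confirm receipt
        # rather than timing only the local sends.
        s.shutdown(socket.SHUT_WR)
        s.close()
        return sent / elapsed

    if __name__ == "__main__":
        threading.Thread(target=sink_server, args=(PORT,), daemon=True).start()
        time.sleep(0.2)       # give the sink a moment to start listening
        bps = measure_bandwidth("127.0.0.1", PORT)
        print("measured: %.1f MB/s" % (bps / 1e6))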

For 4.2.3, there is a new bandwidth test called gnr-bandwidth.  It is
similar to bandwidth-cluster [1 to many] except that it uses the
following steps:
1.  Establish a connection from the node to all other target nodes in the
    cluster.
2.  Start sending data to the targets for a ramp-up period.
3.  After the ramp-up period, continue sending data for the test period.
4.  Calculate bandwidth based on the bytes transmitted during the test
    period.

The bandwidth to each node is summed to return a total bandwidth from the
command node to the other nodes in the cluster.
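
In Python terms, the ramp-up/test-period logic for a single connection
might look roughly like this (illustration only, not the actual
mmnetverify code; the durations, port and localhost sink are placeholders,
and the real test runs one such stream per target node and sums the
per-node results):

    # Sketch of the gnr-bandwidth style test: stream during a ramp-up
    # window, then count only the bytes sent during the test window.
    # Illustration only, not the actual mmnetverify code.
    import socket
    import threading
    import time

    PORT = 50322             # arbitrary placeholder port
    CHUNK = 1 << 20          # 1 MiB per send call
    RAMP_UP_SECS = 1.0       # step 2: ramp-up period (not measured)
    TEST_SECS = 3.0          # step 3: measured test period

    def sink_server(port):
        """Accept one connection and discard everything it sends."""
        srv = socket.socket()
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        conn, _ = srv.accept()
        while conn.recv(CHUNK):
            pass
        conn.close()
        srv.close()

    def ramped_bandwidth(host, port):
        buf = b"x" * CHUNK
        s = socket.create_connection((host, port))   # step 1: connect
        ramp_deadline = time.time() + RAMP_UP_SECS
        while time.time() < ramp_deadline:            # step 2: ramp up
            s.sendall(buf)
        counted = 0
        start = time.time()
        test_deadline = start + TEST_SECS
        while time.time() < test_deadline:            # step 3: measured sends
            s.sendall(buf)
            counted += len(buf)
        elapsed = time.time() - start
        s.close()
        return counted / elapsed                      # step 4: bytes/second

    if __name__ == "__main__":
        threading.Thread(target=sink_server, args=(PORT,), daemon=True).start()
        time.sleep(0.2)
        bps = ramped_bandwidth("127.0.0.1", PORT)
        print("measured: %.1f MB/s" % (bps / 1e6))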

In future releases, we may modify the bandwidth-node and bandwidth-cluster
tests to use the gnr-bandwidth methodology (and deprecate gnr-bandwidth).
Your feedback on how to improve mmnetverify is appreciated.

Regarding:
> We found some weird looking numbers that I don't quite understand, and
> not in the places we might expect.
> For example between hosts on the same switch, traffic flowing to another
> switch, and traffic flowing to nodes in another data centre where it's
> several switch hops. Some nodes over there were significantly faster
> than switch-local nodes.
Note that system load can impact the test results.  Is it possible that
the slow nodes on the local switch were heavily loaded?  Or is it possible
they are using an interface with lower bandwidth?  (Sorry, I had to ask
that one to be sure...)

Regards,
Bill Owen
billowen at us.ibm.com
Spectrum Scale Development
520-799-4829




From:	"Simon Thompson (Research Computing - IT Services)"
            <S.J.Thompson at bham.ac.uk>
To:	gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:	03/17/2017 01:13 PM
Subject:	Re: [gpfsug-discuss] mmnetverify
Sent by:	gpfsug-discuss-bounces at spectrumscale.org



It looks to run sequential tests to each node one at a time, and it isn't
using the NSD protocol but an echo server.

We found some weird looking numbers that I don't quite understand, and not
in the places we might expect. For example between hosts on the same
switch, traffic flowing to another switch, and traffic flowing to nodes in
another data centre where it's several switch hops. Some nodes over there
were significantly faster than switch-local nodes.

I think it was only added in 4.2.2 and is listed as "not yet a replacement
for nsdperf". I get that nsdperf is different as it uses the NSD protocol,
but I was struggling a bit with what mmnetverify might be doing.

Simon

From: gpfsug-discuss-bounces at spectrumscale.org
[gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sanchez, Paul
[Paul.Sanchez at deshaw.com]
Sent: 17 March 2017 19:43
To: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] mmnetverify

Sven will tell you: "RPC isn't streaming", and that may account for the
discrepancy.  If the tests are doing any "fan-in" where multiple nodes are
sending to a single node, then it's also possible that you are exhausting
switch buffer memory in a way that a 1:1 iperf wouldn't.

For our internal benchmarking we've used /usr/lpp/mmfs/samples/net/nsdperf
to more closely estimate the real performance.  I haven't played with
mmnetverify yet though.

-Paul

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org
[gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Simon Thompson
(Research Computing - IT Services)
Sent: Friday, March 17, 2017 2:50 PM
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] mmnetverify

Hi all,

Just wondering if anyone has used the mmnetverify tool at all?

Having made some changes to our internal L3 routing this week, I was
interested to see what it claimed.

As a side note, it picked up some DNS resolution issues, though for some
of them I'm not clear why it was claiming this: doing a "dig" on the node,
the name resolved fine (but adding the NSD servers to the hosts file
cleared the error).

It's actually the bandwidth tests that I'm interested in hearing other
people's experience with, as the numbers that come out of it are very
different (lower) than if we use iperf to test performance between two
nodes.

Anyone any thoughts at all on this?

Thanks
Simon

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


