<div dir="ltr">I spent a good deal of time exploring this topic when I was at IBM. I think there are two key aspects here: congestion of the actual interfaces on the [cluster, FS, token] management nodes, and competition for other resources, such as CPU cycles, on those nodes. When a single Ethernet interface carries everything (or, for that matter, IB RDMA + IPoIB over the same interface), at some point the two kinds of traffic, data and management, begin to conflict. The management traffic, being much more time-sensitive, suffers as a result. One solution is to separate the traffic. For larger clusters, though (1000s of nodes), a better solution, one that may avoid the need for a second interface on every client node, is to add dedicated nodes as managers rather than relying on the NSD servers for this. It does cost you some modest servers and GPFS server licenses. My previous client generally used retired, previous-generation compute nodes for this job. <br><div><div class="gmail_extra"><br><div class="gmail_quote">Scott <br><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Date: Mon, 13 Jul 2015 15:25:32 +0100<br>

From: Vic Cornell &lt;<a href="mailto:viccornell@gmail.com">viccornell@gmail.com</a>&gt;<br>

Subject: Re: [gpfsug-discuss] data interface and management infercace.<br>

<br>

Hi Salvatore,<br>

<br>

I agree that this is what the manual, and some of the wiki entries, say.<br>

<br>

However, when we have had problems (typically congestion) with Ethernet networks in the past (20GbE or 40GbE), we have resolved them by setting up a separate "Admin" network.<br>

<br>

The difference in cluster health before and after, measured in the number of expels and waiters, has been very marked.<br>

<br>

Maybe someone "in the know" could comment on this split.<br>

<br>

Regards,<br>

<br>

Vic<br>

<br>

<br></blockquote></div><br></div></div></div>