Hi,

Yes, having separate data and management networks has been critical for us in keeping health monitoring and communication unimpeded by data movement.

Not as important, but you can also tune the networks differently (packet sizes, buffer sizes, SAK, etc.), which can help (a rough example is sketched below).

Jason

On Jul 13, 2015, at 7:25 AM, Vic Cornell <viccornell@gmail.com> wrote:

Hi Salvatore,

I agree that that is what the manual - and some of the wiki entries - say.

However, when we have had problems (typically congestion) with Ethernet networks in the past (20GbE or 40GbE), we have resolved them by setting up a separate "Admin" network.

The difference in cluster health we have seen before and after, measured in the number of expels and waiters, has been very marked.

Maybe someone "in the know" could comment on this split.

Regards,

Vic
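
As a rough illustration of the per-network tuning mentioned above, assuming Linux hosts with a hypothetical bonded 10GbE data interface (bond0), a 1GbE admin interface (em1), and a switch path that supports jumbo frames; the names and values are illustrative, not from this thread:

    # Jumbo frames on the data interface only; the admin interface (em1)
    # stays at the default MTU 1500 so monitoring traffic is unaffected
    ip link set dev bond0 mtu 9000

    # Raise socket buffer ceilings for the bulk-data path (values are examples)
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216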
<br class=""><div><blockquote type="cite" class=""><div class="">On 13 Jul 2015, at 14:29, Salvatore Di Nardo <<a href="mailto:sdinardo@ebi.ac.uk" class="">sdinardo@ebi.ac.uk</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
<meta content="text/html; charset=windows-1252" http-equiv="Content-Type" class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<font size="-1" class="">Hello Vic.<br class="">
We are currently draining our gpfs to do all the recabling to add
a management network, but looking what the admin interface does (
man mmchnode ) it says something different:<br class="">
<br class="">
</font>
<blockquote class="">
<blockquote class=""><big class=""><font size="-1" class=""><big class=""><tt class="">--admin-interface={hostname
| ip_address}</tt></big></font></big><br class="">
<big class=""><font size="-1" class=""><big class=""><tt class=""> Specifies
the name of the node to be used by GPFS administration
commands when communicating between nodes. The admin
node name must be specified as an IP</tt></big></font></big><br class="">
<big class=""><font size="-1" class=""><big class=""><tt class=""> address
or a hostname that is resolved by the host command to
the desired IP address. If the keyword DEFAULT is
specified, the admin interface for the</tt></big></font></big><br class="">
<big class=""><font size="-1" class=""><big class=""><tt class=""> node is
set to be equal to the daemon interface for the node.</tt></big></font></big><br class="">
</blockquote>
</blockquote>
<big class=""><font size="-1" class=""><big class=""><tt class=""></tt></big></font></big><font size="-1" class=""><br class="">
So, seems used only for commands propagation, hence have nothing
to do with the node-to-node traffic. Infact the other interface
description is:<br class="">
</font><big class=""><font size="-1" class=""><big class=""><tt class=""><br class="">
</tt></big></font></big>
<blockquote class="">
<blockquote class=""><big class=""><font size="-1" class=""><big class=""><tt class=""> --daemon-interface={hostname
| ip_address}</tt></big></font></big><br class="">
<big class=""><font size="-1" class=""><big class=""><tt class=""> Specifies
the host name or IP address </tt><tt class=""><u class=""><b class="">to be used
by the GPFS daemons for node-to-node communication</b></u></tt><tt class="">.
The host name or IP address must refer to the commu-</tt></big></font></big><br class="">
<big class=""><font size="-1" class=""><big class=""><tt class=""> nication
adapter over which the GPFS daemons communicate. Alias
interfaces are not allowed. Use the original address or
a name that is resolved by the</tt></big></font></big><br class="">
<big class=""><font size="-1" class=""><big class=""><tt class=""> host
command to that original address.</tt></big></font></big></blockquote>
</blockquote>
<big class=""><font size="-1" class=""></font></big><font size="-1" class=""><br class="">
The "expired lease" issue and file locking mechanism a( most of
our expells happens when 2 clients try to write in the same file)
are exactly node-to node-comunication, so im wondering what's the
point to separate the "admin network". I want to be sure to plan
the right changes before we do a so massive task. We are talking
about adding a new interface on 700 clients, so the recabling work
its not small. <br class="">
<br class="">
<br class="">
Regards,<br class="">
Salvatore<br class="">
<br class="">
<br class="">
</font><br class="">
<div class="moz-cite-prefix">On 13/07/15 14:00, Vic Cornell wrote:<br class="">
</div>
<blockquote cite="mid:5269A4E9-416B-4D70-AAE0-B86042FC96B9@ddn.com" type="cite" class="">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252" class="">
Hi Salavatore,
<div class=""><br class="">
</div>
<div class=""><span class="Apple-tab-span" style="white-space:
pre;"></span>Does your GSS have the facility for a 1GbE
“management” network? If so I think that changing the “admin”
node names of the cluster members to a set of IPs on the
management network would give you the split that you need.</div>
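
A minimal sketch of that kind of change, using a hypothetical management-network host name (gss01a-mgmt.ebi.ac.uk, not from this thread) that resolves to the node's 1GbE address:

    # Check the mmchnode man page for whether GPFS must be stopped on the
    # affected node first, then point its admin node name at the management
    # network (repeat per node):
    mmchnode --admin-interface=gss01a-mgmt.ebi.ac.uk -N gss01a.ebi.ac.uk

    # Afterwards the Daemon node name and Admin node name columns should differ:
    mmlscluster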
<div class=""><br class="">
</div>
<div class="">What about the clients? Can they also connect to a
separate admin network?</div>
<div class=""><br class="">
</div>
<div class="">Remember that if you are using multi-cluster all of
the nodes in both networks must share the same admin network.</div>
<div class="">
<div apple-content-edited="true" class="">
<div class=""><br class="">
</div>
</div>
</div>
<div apple-content-edited="true" class="">
<div style="font-family: Helvetica; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span class="">Kind Regards,</span></div>
<div style="font-family: Helvetica; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span class=""><br class="">
</span></div>
<div style="font-family: Helvetica; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<span class="">Vic</span></div>
<span style="font-family: Helvetica; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class="Apple-interchange-newline">
</span><span class=""></span> </div>
<br class="">
<div class="">
<blockquote type="cite" class="">
<div class="">On 13 Jul 2015, at 13:31, Salvatore Di Nardo
<<a moz-do-not-send="true" href="mailto:sdinardo@ebi.ac.uk" class="">sdinardo@ebi.ac.uk</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class=""><font class="" size="-1">Anyone? </font>
<br class="">
<br class="">
<div class="moz-cite-prefix">On 10/07/15 11:07, Salvatore
Di Nardo wrote:<br class="">
</div>
<blockquote cite="mid:559F9960.7010509@ebi.ac.uk" type="cite" class=""><font class="" size="-1">Hello
guys.<br class="">
Quite a while ago i mentioned that we have a big
expel issue on our gss ( first gen) and white a lot
people suggested that the root cause could be that we
use the same interface for all the traffic, and that
we should split the data network from the admin
network. Finally we could plan a downtime and we are
migrating the data out so, i can soon safelly play
with the change, but looking what exactly i should to
do i'm a bit puzzled. Our mmlscluster looks like this:<br class="">
<br class="">
</font>
<blockquote class="">
<blockquote class="">
<blockquote class=""><tt class=""><font class="" size="-1">GPFS cluster information</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">========================</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> GPFS
cluster name: <a moz-do-not-send="true" href="http://gss.ebi.ac.uk/" class="">
GSS.ebi.ac.uk</a></font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> GPFS
cluster id: 17987981184946329605</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> GPFS
UID domain: <a moz-do-not-send="true" href="http://gss.ebi.ac.uk/" class="">
GSS.ebi.ac.uk</a></font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">
Remote shell command: /usr/bin/ssh</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">
Remote file copy command: /usr/bin/scp</font></tt><tt class=""><br class="">
</tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">GPFS
cluster configuration servers:</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">-----------------------------------</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">
Primary server: <a moz-do-not-send="true" href="http://gss01a.ebi.ac.uk/" class="">
gss01a.ebi.ac.uk</a></font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">
Secondary server: <a moz-do-not-send="true" href="http://gss02b.ebi.ac.uk/" class="">
gss02b.ebi.ac.uk</a></font></tt><tt class=""><br class="">
</tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> Node
Daemon node name IP address Admin node
name Designation</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1">-----------------------------------------------------------------------</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> 1
<a moz-do-not-send="true" href="http://gss01a.ebi.ac.uk/" class="">
gss01a.ebi.ac.uk</a> 10.7.28.2 <a moz-do-not-send="true" href="http://gss01a.ebi.ac.uk/" class="">gss01a.ebi.ac.uk</a>
quorum-manager</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> 2
<a moz-do-not-send="true" href="http://gss01b.ebi.ac.uk/" class="">
gss01b.ebi.ac.uk</a> 10.7.28.3 <a moz-do-not-send="true" href="http://gss01b.ebi.ac.uk/" class="">gss01b.ebi.ac.uk</a>
quorum-manager</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> 3
<a moz-do-not-send="true" href="http://gss02a.ebi.ac.uk/" class="">
gss02a.ebi.ac.uk</a> 10.7.28.67 <a moz-do-not-send="true" href="http://gss02a.ebi.ac.uk/" class="">gss02a.ebi.ac.uk</a>
quorum-manager</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> 4
<a moz-do-not-send="true" href="http://gss02b.ebi.ac.uk/" class="">
gss02b.ebi.ac.uk</a> 10.7.28.66 <a moz-do-not-send="true" href="http://gss02b.ebi.ac.uk/" class="">gss02b.ebi.ac.uk</a>
quorum-manager</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> 5
<a moz-do-not-send="true" href="http://gss03a.ebi.ac.uk/" class="">
gss03a.ebi.ac.uk</a> 10.7.28.34 <a moz-do-not-send="true" href="http://gss03a.ebi.ac.uk/" class="">gss03a.ebi.ac.uk</a>
quorum-manager</font></tt><tt class=""><br class="">
</tt><tt class=""><font class="" size="-1"> 6
<a moz-do-not-send="true" href="http://gss03b.ebi.ac.uk/" class="">
gss03b.ebi.ac.uk</a> 10.7.28.35 <a moz-do-not-send="true" href="http://gss03b.ebi.ac.uk/" class="">gss03b.ebi.ac.uk</a>
quorum-manager</font></tt><tt class=""><br class="">
</tt></blockquote>
</blockquote>
</blockquote>
<font class="" size="-1"><br class="">
It was my understanding that the "admin node" should
use a different interface ( a 1g link copper should be
fine), while the daemon node is where the data was
passing , so should point to the bonded 10g
interfaces. but when i read the mmchnode man page i
start to be quite confused. It says:<br class="">
<br class="">
</font><font class="" size="-1"><tt class="">
--daemon-interface={hostname |
ip_address}</tt><tt class=""><br class="">
</tt><tt class=""> Specifies
the host name or IP address
<u class=""><b class="">to be used by the GPFS
daemons for node-to-node communication</b></u>.
The host name or IP address must refer to the
communication adapter over which the GPFS daemons
communicate.
<br class="">
Alias interfaces are not
allowed. Use the original address or a name that is
resolved by the host command to that original
address.</tt><tt class="">
</tt><tt class=""><br class="">
</tt><tt class=""> </tt><tt class=""><br class="">
</tt><tt class="">
--admin-interface={hostname | ip_address}</tt><tt class=""><br class="">
</tt><tt class=""> Specifies
the name of the node to be used by GPFS
administration commands when communicating between
nodes. The admin node name must be specified as an
IP address or a hostname that is resolved by the
host command
<br class="">
to</tt><tt class=""> </tt><tt class="">the desired IP address. If the keyword
DEFAULT is specified, the admin interface for the
node is set to be equal to the daemon interface for
the node.</tt><tt class=""><br class="">
</tt></font><font class="" size="-1"><br class="">
What exactly means "node-to node-communications" ? <br class="">
Means DATA or also the "lease renew", and the token
communication between the clients to get/steal the
locks to be able to manage concurrent write to thr
same file?
<br class="">
Since we are getting expells ( especially when several
clients contends the same file ) i assumed i have to
split this type of packages from the data stream, but
reading the documentation it looks to me that those
internal comunication between nodes use the
daemon-interface wich i suppose are used also for the
data. so HOW exactly i can split them?<br class="">
</font><font class="" size="-1"><br class="">
</font><font class="" size="-1"><br class="">
Thanks in advance,<br class="">
Salvatore<br class="">
</font><font class="" size="-1"><br class="">
</font><br class="">
<fieldset class="mimeAttachmentHeader"></fieldset>
<br class="">
<pre class="" wrap="">_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at <a moz-do-not-send="true" href="http://gpfsug.org/" class="">gpfsug.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a>
</pre>
</blockquote>
<br class="">
</div>
_______________________________________________<br class="">
gpfsug-discuss mailing list<br class="">
gpfsug-discuss at <a moz-do-not-send="true" href="http://gpfsug.org/" class="">gpfsug.org</a><br class="">
<a moz-do-not-send="true" href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss" class="">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a><br class="">
</div>
</blockquote>
</div>
<br class="">
</blockquote>
<br class="">
</div>
_______________________________________________<br class="">gpfsug-discuss mailing list<br class="">gpfsug-discuss at <a href="http://gpfsug.org" class="">gpfsug.org</a><br class=""><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss" class="">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a><br class=""></div></blockquote></div><br class=""></div></div></blockquote><blockquote type="cite"><div><span>_______________________________________________</span><br><span>gpfsug-discuss mailing list</span><br><span>gpfsug-discuss at <a href="http://gpfsug.org">gpfsug.org</a></span><br><span><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a></span><br></div></blockquote></body></html>