<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">I have a feeling that this is how mmchconfig is supposed to work. You’ve asked it to change the<div class="">configuration of one node, but the database of configuration settings needs to be propagated to</div><div class="">the entire cluster whenever a change is made.  You’ll find a section in the mmlsconfig output specific</div><div class="">to the node(s) that have been changed [node155] …. At this point your configuration may be out of</div><div class="">sync on any number of nodes.</div><div class=""><br class=""></div><div class=""> — ddj</div><div class="">Dave Johnson</div><div class="">Brown University CCV/CIS</div><div class=""><br class=""></div><div class=""><div><blockquote type="cite" class=""><div class="">On Feb 22, 2017, at 10:57 AM, Douglas Duckworth &lt;<a href="mailto:dod2014@med.cornell.edu" class="">dod2014@med.cornell.edu</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hello!<div class=""><br class=""></div><div class="">I am an HPC admin at Weill Cornell Medicine on the Upper East Side of Manhattan.  It's a great place with researchers working in many computationally demanding fields.  I am asked to do many new things all of the time so it's never boring.  Yesterday we deployed a server that's intended to create an atomic-level image of a ribosome.  Pretty serious science!<br class=""><br class=""></div><div class="">We have two DDN GridScaler GPFS clusters with around 3 PB of storage.  FDR InfiniBand provides the interconnect.  Our compute nodes are Dell PowerEdge 12/13G servers running CentOS 6 and 7, and we're using SGE for scheduling.  Hopefully soon Slurm.  We also have some GPU servers from Penguin Computing, with GTX 1080s, as well as a new Ryft FPGA accelerator.  
I am hoping our next round of computing power will come from AMD...</div><div class=""><br class=""></div><div class="">Anyway, I've been using Ansible to deploy our new GPFS nodes as well as build everything else we need at WCM.  I thought that this was complete.  However, apparently, the GPFS client has been trying RDMA over port mlx4_0/2, though we need to use mlx4_0/1!  Rather than running mmchconfig against the entire cluster, I have been trying it locally on the node that needs to be addressed.  For example:</div><div class=""><br class=""></div><div class="">sudo mmchconfig verbsPorts=mlx4_0/1 -i -N node155<br class=""></div><div class=""><br class=""></div><div class="">When run locally, the desired change becomes permanent and we see RDMA active after restarting the GPFS service on the node.  However, mmchconfig still tries to run against all nodes in the cluster!  I kill it, of course, at the known_hosts step.</div><div class=""><br class=""></div><div class="">In addition, I tried:</div><div class=""><br class=""></div><div class="">sudo mmchconfig verbsPorts=mlx4_0/1 -i -N node155 NodeClass=localhost<br class=""></div><div class=""><br class=""></div><div class="">However, the result was the same.</div><div class=""><br class=""></div><div class="">With a capital "I" (-I), mmchconfig does attempt ssh to all nodes.  
Yet the change does not persist after restarting GPFS.</div><div class=""><br class=""></div><div class="">So far I have consulted the following documentation:</div><div class=""><br class=""></div><div class=""><div class=""><a href="http://ibm.co/2mcjK3P" class="">http://ibm.co/2mcjK3P</a><br class=""></div></div><div class=""><a href="http://ibm.co/2lFSInH" class="">http://ibm.co/2lFSInH</a><br class=""></div><div class=""><br class=""></div><div class="">Could anyone please help?</div><div class=""><br class=""></div><div class="">We're using GPFS client version 4.1.1-3 on CentOS 6 nodes as well as 4.2.1-2 on those which are running CentOS 7.</div><div class=""><br class=""></div><div class="">Thanks so much!</div><div class=""><br class=""></div><div class="">Best</div><div class="">Doug</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><div class=""><div class="gmail_signature"><div dir="ltr" class=""><div class=""><div dir="ltr" class=""><div class=""><div dir="ltr" class=""><div class=""><div dir="ltr" class=""><div style="font-size:small" class=""><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class="">Thanks,</div><div dir="ltr" class=""><br class="">Douglas Duckworth, MSc, LFCS<br class="">HPC System Administrator<br class=""><span style="font-size:12.8px" class="">Scientific Computing Unit</span><br class=""></div><div dir="ltr" class="">Physiology and Biophysics</div><div dir="ltr" class="">Weill Cornell Medicine<div class="">E: <a href="mailto:doug@med.cornell.edu" target="_blank" class="">doug@med.cornell.edu</a><br class="">O: 212-746-6305<br class="">F: 212-746-8690</div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>

</div></div>

_______________________________________________<br class="">gpfsug-discuss mailing list<br class="">gpfsug-discuss at <a href="http://spectrumscale.org" class="">spectrumscale.org</a><br class=""><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss" class="">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a><br class=""></div></blockquote></div><br class=""></div></body></html>