<font size=2 face="sans-serif">Hi ...same thing here.. everything after
10 nodes will be truncated.. </font><br><font size=2 face="sans-serif">though I don't have an issue with it
... I'll open a PMR, and I recommend you do the same thing. ;-)
</font><br><br><font size=2 face="sans-serif">the reason seems simple: it is the
<i>"| tail"</i> at the end of the command, which truncates
the output to the last 10 items... </font><br><br><font size=2 face="sans-serif">should be easy to fix.. </font><br><font size=2 face="sans-serif">cheers</font><br><font size=2 face="sans-serif">olaf</font><br><br><br><br><br><br><font size=1 color=#5f5f5f face="sans-serif">From:
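Olaf's diagnosis is easy to check independently: with no options, `tail` prints only the last 10 lines of its input, so any node list longer than 10 entries is cut off in the displayed output.

```shell
# tail(1) with no arguments keeps only the last 10 lines of stdin,
# so a 20-line node list loses its first 10 entries in the display.
seq 1 20 | tail                  # prints lines 11 through 20
seq 1 20 | tail | wc -l          # prints 10
seq 1 20 | tail -n +1 | wc -l    # -n +1 passes everything through: prints 20
```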
</font><font size=1 face="sans-serif">Jonathon A Anderson
<jonathon.anderson@colorado.edu></font><br><font size=1 color=#5f5f5f face="sans-serif">To:
</font><font size=1 face="sans-serif">"gpfsug-discuss@spectrumscale.org"
<gpfsug-discuss@spectrumscale.org></font><br><font size=1 color=#5f5f5f face="sans-serif">Date:
</font><font size=1 face="sans-serif">01/30/2017 11:11 PM</font><br><font size=1 color=#5f5f5f face="sans-serif">Subject:
</font><font size=1 face="sans-serif">Re: [gpfsug-discuss]
CES doesn't assign addresses to nodes</font><br><font size=1 color=#5f5f5f face="sans-serif">Sent by:
</font><font size=1 face="sans-serif">gpfsug-discuss-bounces@spectrumscale.org</font><br><hr noshade><br><br><br><tt><font size=2>In trying to figure this out on my own, I’m relatively
certain I’ve found a bug in GPFS related to the truncation of output from
`tsctl shownodes up`. Any chance someone in development can confirm?<br><br><br>Here are the details of my investigation:<br><br><br>## GPFS is up on sgate2<br><br>[root@sgate2 ~]# mmgetstate<br><br> Node number Node name GPFS state <br>------------------------------------------<br> 414 sgate2-opa
active<br><br><br>## but if I tell ces to explicitly put one of our ces addresses on that
node, it says that GPFS is down<br><br>[root@sgate2 ~]# mmces address move --ces-ip 10.225.71.102 --ces-node sgate2-opa<br>mmces address move: GPFS is down on this node.<br>mmces address move: Command failed. Examine previous error messages to
determine cause.<br><br><br>## the “GPFS is down on this node” message is defined as code 109 in
mmglobfuncs<br><br>[root@sgate2 ~]# grep --before-context=1 "GPFS is down on this node."
/usr/lpp/mmfs/bin/mmglobfuncs<br> 109 ) msgTxt=\<br>"%s: GPFS is down on this node."<br><br><br>## and is generated by printErrorMsg in mmcesnetmvaddress when it detects
that the current node is identified as “down” by getDownCesNodeList<br><br>[root@sgate2 ~]# grep --before-context=5 'printErrorMsg 109' /usr/lpp/mmfs/bin/mmcesnetmvaddress<br> downNodeList=$(getDownCesNodeList)<br> for downNode in $downNodeList<br> do<br> if [[ $toNodeName == $downNode ]]<br> then<br> printErrorMsg 109 "$mmcmd"<br><br><br>## getDownCesNodeList is the set of all ces nodes that do not appear among the GPFS cluster
nodes listed by `tsctl shownodes up`<br><br>[root@sgate2 ~]# grep --after-context=16 '^function getDownCesNodeList'
/usr/lpp/mmfs/bin/mmcesfuncs<br>function getDownCesNodeList<br>{<br> typeset sourceFile="mmcesfuncs.sh"<br> [[ -n $DEBUG || -n $DEBUGgetDownCesNodeList ]] && set -x<br> $mmTRACE_ENTER "$*"<br><br> typeset upnodefile=${cmdTmpDir}upnodefile<br> typeset downNodeList<br><br> # get all CES nodes<br> $sort -o $nodefile $mmfsCesNodes.dae<br><br> $tsctl shownodes up | $tr ',' '\n' | $sort -o $upnodefile<br><br> downNodeList=$($comm -23 $nodefile $upnodefile)<br> print -- $downNodeList<br>} #----- end of function getDownCesNodeList --------------------<br><br><br>## but not only are the sgate nodes not listed by `tsctl shownodes up`;
its output is obviously and erroneously truncated<br><br>[root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | tail<br>shas0251-opa.rc.int.colorado.edu<br>shas0252-opa.rc.int.colorado.edu<br>shas0253-opa.rc.int.colorado.edu<br>shas0254-opa.rc.int.colorado.edu<br>shas0255-opa.rc.int.colorado.edu<br>shas0256-opa.rc.int.colorado.edu<br>shas0257-opa.rc.int.colorado.edu<br>shas0258-opa.rc.int.colorado.edu<br>shas0259-opa.rc.int.colorado.edu<br>shas0260-opa.rc.int.col[root@sgate2 ~]#<br><br><br>## I expect that this is a bug in GPFS, likely related to a maximum output
buffer for `tsctl shownodes up`.<br><br><br><br>On 1/24/17, 12:48 PM, "Jonathon A Anderson" <jonathon.anderson@colorado.edu>
wrote:<br><br> I think I'm having the same issue described here:<br> <br> </font></tt><a href="http://www.spectrumscale.org/pipermail/gpfsug-discuss/2016-October/002288.html"><tt><font size=2>http://www.spectrumscale.org/pipermail/gpfsug-discuss/2016-October/002288.html</font></tt></a><tt><font size=2><br> <br> Any advice or further troubleshooting steps would be much
appreciated. Full disclosure: I also have a DDN case open. (78804)<br> <br> We've got a four-node (snsd{1..4}) DDN gridscaler system.
I'm trying to add two CES protocol nodes (sgate{1,2}) to serve NFS. <br> <br> Here's the steps I took: <br> <br> --- <br> mmcrnodeclass protocol -N sgate1-opa,sgate2-opa <br> mmcrnodeclass nfs -N sgate1-opa,sgate2-opa <br> mmchconfig cesSharedRoot=/gpfs/summit/ces <br> mmchcluster --ccr-enable <br> mmchnode --ces-enable -N protocol <br> mmces service enable NFS <br> mmces service start NFS -N nfs <br> mmces address add --ces-ip 10.225.71.104,10.225.71.105 <br> mmces address policy even-coverage <br> mmces address move --rebalance <br> --- <br> <br> This worked the very first time I ran it, but the CES addresses
weren't re-distributed after restarting GPFS or a node reboot. <br> <br> Things I've tried: <br> <br> * disabling ces on the sgate nodes and re-running the above
procedure <br> * moving the cluster and filesystem managers to different
snsd nodes <br> * deleting and re-creating the cesSharedRoot directory <br> <br> Meanwhile, the following log entry appears in mmfs.log.latest
every ~30s: <br> <br> --- <br> Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Found
unassigned address 10.225.71.104 <br> Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Found
unassigned address 10.225.71.105 <br> Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: handleNetworkProblem
with lock held: assignIP 10.225.71.104_0-_+,10.225.71.105_0-_+ 1 <br> Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Assigning
addresses: 10.225.71.104_0-_+,10.225.71.105_0-_+ <br> Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: moveCesIPs:
10.225.71.104_0-_+,10.225.71.105_0-_+ <br> --- <br> <br> Also notable, whenever I add or remove addresses now, I see
this in mmsysmonitor.log (among a lot of other entries): <br> <br> --- <br> 2017-01-23T20:40:56.363 sgate1 D ET_cesnetwork Entity state
without requireUnique: ces_network_ips_down WARNING No CES relevant NICs
detected - Service.calculateAndUpdateState:275 <br> 2017-01-23T20:40:11.364 sgate1 D ET_cesnetwork Update multiple
entities at once {'p2p2': 1, 'bond0': 1, 'p2p1': 1} - Service.setLocalState:333
<br> --- <br> <br> For the record, here's the interface I expect to get the
address on sgate1: <br> <br> --- <br> 11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP>
mtu 9000 qdisc noqueue state UP <br> link/ether 3c:fd:fe:08:a7:c0 brd ff:ff:ff:ff:ff:ff <br> inet 10.225.71.107/20 brd 10.225.79.255 scope global bond0
<br> valid_lft forever preferred_lft forever <br> inet6 fe80::3efd:feff:fe08:a7c0/64 scope link <br> valid_lft forever preferred_lft forever <br> --- <br> <br> which is a bond of p2p1 and p2p2. <br> <br> --- <br> 6: p2p1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu
9000 qdisc mq master bond0 state UP qlen 1000 <br> link/ether 3c:fd:fe:08:a7:c0 brd ff:ff:ff:ff:ff:ff <br> 7: p2p2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu
9000 qdisc mq master bond0 state UP qlen 1000 <br> link/ether 3c:fd:fe:08:a7:c0 brd ff:ff:ff:ff:ff:ff <br> --- <br> <br> A similar bond0 exists on sgate2. <br> <br> I crawled around in /usr/lpp/mmfs/lib/mmsysmon/CESNetworkService.py
for a while trying to figure it out, but have been unsuccessful so far.<br> <br> <br><br>_______________________________________________<br>gpfsug-discuss mailing list<br>gpfsug-discuss at spectrumscale.org<br></font></tt><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"><tt><font size=2>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</font></tt></a><tt><font size=2><br></font></tt><br><br><BR>
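The failure mode traced in this thread — getDownCesNodeList comparing a sorted CES node list against a truncated `tsctl shownodes up` list with comm(1) — can be sketched generically. This is a standalone reproduction using node names borrowed from the thread, not the actual mmcesfuncs code path:

```shell
# comm -23 prints lines that appear only in the first file, i.e. the
# CES nodes absent from the "up" list. Both inputs must be sorted.
sort > cesnodes.txt <<'EOF'
sgate1-opa
sgate2-opa
EOF

# An "up" list that was cut off before reaching the sgate nodes:
sort > upnodes.txt <<'EOF'
shas0259-opa
shas0260-opa
EOF

# Because the truncated up list never mentions the sgate nodes, the
# set difference reports both live nodes as "down":
comm -23 cesnodes.txt upnodes.txt
```

This matches the observed symptom: any CES node whose name falls past the truncation point of the `tsctl shownodes up` output is treated as down, and address assignment to it fails with error 109.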