<font size=2 face="sans-serif">many ways lead to Rome .. and I agree ..

mmexpelnode is a nice command .. </font><br><font size=2 face="sans-serif">another approach: </font><br><font size=2 face="sans-serif">power the node off (so it no longer answers ping),

then mmdelnode .. power on/boot .. mmaddnode </font><br><br><br><br><font size=1 color=#5f5f5f face="sans-serif">From:      

 </font><font size=1 face="sans-serif">Aaron Knister <aaron.s.knister@nasa.gov></font><br><font size=1 color=#5f5f5f face="sans-serif">To:      

 </font><font size=1 face="sans-serif"><gpfsug-discuss@spectrumscale.org></font><br><font size=1 color=#5f5f5f face="sans-serif">Date:      

 </font><font size=1 face="sans-serif">02/02/2017 08:37 PM</font><br><font size=1 color=#5f5f5f face="sans-serif">Subject:    

   </font><font size=1 face="sans-serif">Re: [gpfsug-discuss]

proper gpfs shutdown when node disappears</font><br><font size=1 color=#5f5f5f face="sans-serif">Sent by:    

   </font><font size=1 face="sans-serif">gpfsug-discuss-bounces@spectrumscale.org</font><br><hr noshade><br><br><br><tt><font size=2>You could forcibly expel the node (one of my favorite

GPFS commands):<br><br>mmexpelnode -N $nodename<br><br>and then power it off after the expulsion is complete and then do<br><br>mmexpelnode -r -N $nodename<br><br>which will allow it to join the cluster the next time you try to start up

<br>GPFS on it. You'll still likely have to go through recovery but you'll

skip the part where GPFS wonders where the node went prior to it expelling it.<br><br>-Aaron<br><br>On 2/2/17 2:28 PM, valdis.kletnieks@vt.edu wrote:<br>> On Thu, 02 Feb 2017 18:28:22 +0100, "Olaf Weiser" said:<br>><br>>> but the /var/mmfs DIR is obviously damaged/empty .. what ever..

that's why you<br>>> see a message like this..<br>>> have you reinstalled that node / any backup/restore thing ?<br>><br>> The internal RAID controller died a horrid death and basically took<br>> all the OS partitions with it.  So the node was just sort of

limping along,<br>> where the mmfsd process was still coping because it wasn't doing any<br>> I/O to the OS partitions - but 'ssh bad-node mmshutdown' wouldn't

work<br>> because that requires accessing stuff in /var.<br>><br>> At that point, it starts getting tempting to just use ipmitool from<br>> another node to power the comatose one down - but that often causes<br>> a cascade of other issues while things are stuck waiting for timeouts.<br>><br>><br>> _______________________________________________<br>> gpfsug-discuss mailing list<br>> gpfsug-discuss at spectrumscale.org<br>> </font></tt><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"><tt><font size=2>http://gpfsug.org/mailman/listinfo/gpfsug-discuss</font></tt></a><tt><font size=2><br>><br><br>-- <br>Aaron Knister<br>NASA Center for Climate Simulation (Code 606.2)<br>Goddard Space Flight Center<br>(301) 286-2776<br></font></tt><br><br><BR>
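<font size=2 face="sans-serif">The expel / power off / reset sequence described in this thread could be wrapped roughly as below. This is only a sketch, not a tested procedure: the helper names (run, expel_and_power_off, allow_rejoin), the "&lt;node&gt;-bmc" hostname convention, the default IPMI user "admin" and the DRY_RUN switch are all assumptions made for illustration. Only the mmexpelnode calls come from the thread itself; the exact ipmitool invocation depends on your BMC setup.</font><br>

```shell
#!/bin/sh
# Sketch of the expel / power-off / reset sequence from this thread.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Expel the node from the cluster first, then cut its power from another node.
# Assumptions: the BMC answers as "<node>-bmc" unless given as $2, the IPMI
# user comes from IPMI_USER, and ipmitool -E reads the password from the
# IPMI_PASSWORD environment variable.
expel_and_power_off() {
    node="$1"
    bmc="${2:-${node}-bmc}"
    run mmexpelnode -N "$node" &&
        run ipmitool -I lanplus -H "$bmc" -U "${IPMI_USER:-admin}" -E chassis power off
}

# Once the node is repaired, clear the expel so GPFS can rejoin on next startup.
allow_rejoin() {
    node="$1"
    run mmexpelnode -r -N "$node"
}
```

<font size=2 face="sans-serif">Sourcing the file and calling "expel_and_power_off $nodename" prints the two commands it would run; with DRY_RUN=0 they are executed for real, and the power-off is skipped if the expel fails. Call "allow_rejoin $nodename" before starting GPFS on the repaired node.</font><br>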