<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


<meta name="Generator" content="Microsoft Word 14 (filtered medium)">


<style><!--


/* Font Definitions */


@font-face


        {font-family:Calibri;


        panose-1:2 15 5 2 2 2 4 3 2 4;}


@font-face


        {font-family:Tahoma;


        panose-1:2 11 6 4 3 5 4 4 2 4;}


/* Style Definitions */


p.MsoNormal, li.MsoNormal, div.MsoNormal


        {margin:0in;


        margin-bottom:.0001pt;


        font-size:12.0pt;


        font-family:"Times New Roman","serif";}


a:link, span.MsoHyperlink


        {mso-style-priority:99;


        color:blue;


        text-decoration:underline;}


a:visited, span.MsoHyperlinkFollowed


        {mso-style-priority:99;


        color:purple;


        text-decoration:underline;}


p.m-4981009086590633322inbox-inbox-p1, li.m-4981009086590633322inbox-inbox-p1, div.m-4981009086590633322inbox-inbox-p1


        {mso-style-name:m_-4981009086590633322inbox-inbox-p1;


        mso-margin-top-alt:auto;


        margin-right:0in;


        mso-margin-bottom-alt:auto;


        margin-left:0in;


        font-size:12.0pt;


        font-family:"Times New Roman","serif";}


span.m-4981009086590633322inbox-inbox-apple-converted-space


        {mso-style-name:m_-4981009086590633322inbox-inbox-apple-converted-space;}


span.EmailStyle19


        {mso-style-type:personal-reply;


        font-family:"Calibri","sans-serif";


        color:#1F497D;}


.MsoChpDefault


        {mso-style-type:export-only;


        font-family:"Calibri","sans-serif";}


@page WordSection1


        {size:8.5in 11.0in;


        margin:1.0in 1.0in 1.0in 1.0in;}


div.WordSection1


        {page:WordSection1;}


--></style>


</head>


<body lang="EN-US" link="blue" vlink="purple">


<div class="WordSection1">


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I’ve also used the “panic stripe group everywhere” trick on a test cluster for a large FPO filesystem solution. With FPO it’s not very hard to get a filesystem into an unmountable state due to missing disks. Sometimes the best answer, especially in a scratch use case, may be to throw the filesystem away and start again empty so that research can resume (even though some users will lose work and have to repeat effort). But stuck mounts can turn this into a long-lived problem. In my case, I just repeatedly panic any nodes that still have the filesystem mounted and retry mmdelfs until it works (it usually takes a few attempts).<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">In this case, I really don’t want or need the filesystem to be recovered. I just want the cluster to forget about it as quickly as possible. So far, in testing, the panic/destroy times aren’t bad, but I don’t have heavy user workloads running against it yet. It would be interesting to know whether there are any shortcuts to skip SG manager reassignment and recovery attempts.<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Thx<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Paul


<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>


<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> gpfsug-discuss-bounces@spectrumscale.org [mailto:gpfsug-discuss-bounces@spectrumscale.org]


<b>On Behalf Of </b>Sven Oehme<br>


<b>Sent:</b> Monday, January 23, 2017 12:28 AM<br>


<b>To:</b> gpfsug main discussion list<br>


<b>Subject:</b> Re: [gpfsug-discuss] forcibly panic stripegroup everywhere?<o:p></o:p></span></p>


<p class="MsoNormal"><o:p> </o:p></p>


<div>


<p class="MsoNormal">Aaron, <o:p></o:p></p>


<div>


<p class="MsoNormal"><o:p> </o:p></p>


</div>


<div>


<p class="MsoNormal">Hold off on the upgrade for a bit. I just got word that while 4.2.1+ most likely addresses the issues I mentioned, there was a defect in the initial release of the parallel log recovery code. I will find out the exact minimum version you need to deploy and send another update to this thread.<o:p></o:p></p>


</div>


<div>


<p class="MsoNormal"><o:p> </o:p></p>


</div>


<div>


<p class="MsoNormal">sven<o:p></o:p></p>


</div>


</div>


<p class="MsoNormal"><o:p> </o:p></p>


<div>


<div>


<p class="MsoNormal">On Mon, Jan 23, 2017 at 5:03 AM Sven Oehme &lt;<a href="mailto:oehmes@gmail.com">oehmes@gmail.com</a>&gt; wrote:<o:p></o:p></p>


</div>


<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">


<div>


<p class="MsoNormal">Then I would suggest moving up to at least 4.2.1.LATEST; there is a good chance your problem is already fixed.<o:p></o:p></p>


<div>


<p class="MsoNormal"><o:p> </o:p></p>


</div>


<div>


<p class="MsoNormal">I see two areas that got significant improvements, Token Manager recovery and Log Recovery; both are enabled in the latest 4.2.1 code:<o:p></o:p></p>


</div>


<div>


<p class="MsoNormal"><o:p> </o:p></p>


</div>


<div>


<p class="MsoNormal">Two significant improvements to Token Recovery in 4.2.1:<o:p></o:p></p>


</div>


<div>


<p class="m-4981009086590633322inbox-inbox-p1"><span class="m-4981009086590633322inbox-inbox-apple-converted-space"> </span>1. Extendible hashing for the token hash table.<span class="m-4981009086590633322inbox-inbox-apple-converted-space"> </span>This speeds up token lookup and thereby reduces tcMutex hold times for configurations with a large ratio of clients to token servers.<br>
<span class="m-4981009086590633322inbox-inbox-apple-converted-space"> </span>2. Cleaning up tokens held by failed nodes used to make multiple passes over the whole token table, one for each failed node.<span class="m-4981009086590633322inbox-inbox-apple-converted-space"> </span>The loops are now inverted, so it makes a single pass over the table and, for each token found, does cleanup for all failed nodes.<o:p></o:p></p>
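<p class="m-4981009086590633322inbox-inbox-p1">A minimal sketch of the loop inversion described in item 2, written in Python rather than taken from the GPFS source. It assumes a toy token table (a dict mapping a token id to the set of nodes holding it); the names token_table, cleanup_per_node and cleanup_single_pass are hypothetical and only illustrate the change in loop order.</p>
<pre>
```python
# Illustrative model only: the real token table and cleanup logic live inside
# GPFS. token_table, cleanup_per_node and cleanup_single_pass are hypothetical.

def cleanup_per_node(token_table, failed_nodes):
    """Old shape: one full pass over the table for each failed node."""
    visits = 0
    for node in failed_nodes:
        for holders in token_table.values():
            visits += 1
            holders.discard(node)
    return visits


def cleanup_single_pass(token_table, failed_nodes):
    """Inverted loops: one pass, cleaning up all failed nodes per token."""
    failed = set(failed_nodes)
    visits = 0
    for holders in token_table.values():
        visits += 1
        holders.difference_update(failed)
    return visits


def make_table():
    # Three tokens, each held by a small set of node names.
    return {
        "tok1": {"n1", "n2"},
        "tok2": {"n2", "n3"},
        "tok3": {"n1", "n3", "n4"},
    }


old_table, new_table = make_table(), make_table()
old_visits = cleanup_per_node(old_table, ["n1", "n2"])
new_visits = cleanup_single_pass(new_table, ["n1", "n2"])
```
</pre>
<p class="m-4981009086590633322inbox-inbox-p1">In this toy model both versions leave the same table behind, but the per-node version touches every entry once per failed node, while the single-pass version touches each entry once regardless of how many nodes failed.</p>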


<p class="m-4981009086590633322inbox-inbox-p1">There are multiple smaller enhancements beyond 4.2.1, but that's the minimum level you want to be at. I have seen token recovery times of tens of minutes, similar to what you described, go down to a minute with this change.<o:p></o:p></p>


<p class="m-4981009086590633322inbox-inbox-p1">On Log Recovery: after an unclean unmount or shutdown of a node, prior to 4.2.1 the filesystem manager would recover only one log file at a time, using a single thread. With 4.2.1 this is now done with multiple threads and multiple log files in parallel.<o:p></o:p></p>
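<p class="m-4981009086590633322inbox-inbox-p1">To illustrate the difference between the two recovery shapes, here is a small Python sketch. It is not how the filesystem manager is implemented: recover_log is a hypothetical placeholder for replaying one log file, the log names are made up, and the worker count of 4 is an arbitrary assumption.</p>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


def recover_log(log_name):
    # Hypothetical placeholder for replaying one log file; it only
    # returns a status tuple so the two strategies can be compared.
    return (log_name, "recovered")


def recover_serial(log_names):
    """One log file at a time, on a single thread."""
    return [recover_log(name) for name in log_names]


def recover_parallel(log_names, workers=4):
    """Several log files at once, spread over a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input names.
        return list(pool.map(recover_log, log_names))


logs = ["log.node1", "log.node2", "log.node3"]
serial_result = recover_serial(logs)
parallel_result = recover_parallel(logs)
```
</pre>
<p class="m-4981009086590633322inbox-inbox-p1">Both functions produce the same results; the parallel one simply stops the total time from growing with each additional log file that has to be replayed one after another.</p>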


</div>


</div>


<div>


<div>


<p class="m-4981009086590633322inbox-inbox-p1">Sven<o:p></o:p></p>


</div>


</div>


<p class="MsoNormal"><o:p> </o:p></p>


<div>


<div>


<p class="MsoNormal">On Mon, Jan 23, 2017 at 4:22 AM Aaron Knister &lt;<a href="mailto:aaron.s.knister@nasa.gov" target="_blank">aaron.s.knister@nasa.gov</a>&gt; wrote:<o:p></o:p></p>


</div>


<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">


<p class="MsoNormal">It's at 4.1.1.10.<br>


<br>


On 1/22/17 11:12 PM, Sven Oehme wrote:<br>


> What version of Scale/GPFS code is this cluster on?<br>


><br>


> ------------------------------------------<br>


> Sven Oehme<br>


> Scalable Storage Research<br>


> email: <a href="mailto:oehmes@us.ibm.com" target="_blank">oehmes@us.ibm.com</a><br>


> Phone: <a href="tel:(408)%20824-8904" target="_blank">+1 (408) 824-8904</a><br>


> IBM Almaden Research Lab<br>


> ------------------------------------------<br>


><br>




><br>


> From: Aaron Knister &lt;<a href="mailto:aaron.s.knister@nasa.gov" target="_blank">aaron.s.knister@nasa.gov</a>&gt;<br>


> To: &lt;<a href="mailto:gpfsug-discuss@spectrumscale.org" target="_blank">gpfsug-discuss@spectrumscale.org</a>&gt;<br>


> Date: 01/23/2017 01:31 AM<br>


> Subject: Re: [gpfsug-discuss] forcibly panic stripegroup everywhere?<br>


> Sent by: <a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">


gpfsug-discuss-bounces@spectrumscale.org</a><br>


><br>


> ------------------------------------------------------------------------<br>


><br>


><br>


><br>


> I was afraid someone would ask :)<br>


><br>


> One possible use would be testing how monitoring reacts to and/or<br>


> corrects stale filesystems.<br>


><br>


> The use in my case is there's an issue we see quite often where a<br>


> filesystem won't unmount when trying to shut down gpfs. Linux insists<br>


> it's still busy despite every process being killed on the node just about<br>


> except init. It's a real pain because it complicates maintenance,<br>


> requiring a reboot of some nodes prior to patching for example.<br>


><br>


> I dug into it and it appears as though when this happens the<br>


> filesystem's mnt_count is ridiculously high (300,000+ in one case). I'm<br>


> trying to debug it further but I need to actually be able to make the<br>


> condition happen a few more times to debug it. A stripegroup panic isn't<br>


> a surefire way but it's the only way I've found so far to trigger this<br>


> behavior somewhat on demand.<br>


><br>


> One way I've found to trigger a mass stripegroup panic is to induce what<br>


> I call a  "301 error":<br>


><br>


> loremds07: Sun Jan 22 00:30:03.367 2017: [X] File System ttest unmounted<br>


> by the system with return code 301 reason code 0<br>


> loremds07: Sun Jan 22 00:30:03.368 2017: Invalid argument<br>


><br>


> and tickle a known race condition between nodes being expelled from the<br>


> cluster and a manager node joining the cluster. When this happens it<br>


> seems to cause a mass stripe group panic that's over in a few minutes.<br>


> The trick there is that it doesn't happen every time I go through the<br>


> exercise and when it does there's no guarantee the filesystem that<br>


> panics is the one in use. If it's not an fs in use then it doesn't help<br>


> me reproduce the error condition. I was trying to use the "mmfsadm test<br>


> panic" command to try a more direct approach.<br>


><br>


> Hope that helps shed some light.<br>


><br>


> -Aaron<br>


><br>


> On 1/22/17 8:16 PM, Andrew Beattie wrote:<br>


>> Out of curiosity -- why would you want to?<br>


>> Andrew Beattie<br>


>> Software Defined Storage  - IT Specialist<br>


>> Phone: 614-2133-7927<br>


>> E-mail: <a href="mailto:abeattie@au1.ibm.com" target="_blank">abeattie@au1.ibm.com</a><br>


>><br>


>><br>


>><br>


>>     ----- Original message -----<br>


>>     From: Aaron Knister &lt;<a href="mailto:aaron.s.knister@nasa.gov" target="_blank">aaron.s.knister@nasa.gov</a>&gt;<br>


>>     Sent by: <a href="mailto:gpfsug-discuss-bounces@spectrumscale.org" target="_blank">


gpfsug-discuss-bounces@spectrumscale.org</a><br>


>>     To: gpfsug main discussion list &lt;<a href="mailto:gpfsug-discuss@spectrumscale.org" target="_blank">gpfsug-discuss@spectrumscale.org</a>&gt;<br>


>>     Cc:<br>


>>     Subject: [gpfsug-discuss] forcibly panic stripegroup everywhere?<br>


>>     Date: Mon, Jan 23, 2017 11:11 AM<br>


>><br>


>>     This is going to sound like a ridiculous request, but, is there a way to<br>


>>     cause a filesystem to panic everywhere in one "swell foop"? I'm assuming<br>


>>     the answer will come with an appropriate disclaimer of "don't ever do<br>


>>     this, we don't support it, it might eat your data, summon Cthulhu, etc.".<br>


>>     I swear I've seen the fs manager initiate this type of operation before.<br>


>><br>


>>     I can seem to do it on a per-node basis with "mmfsadm test panic &lt;fs&gt;<br>


>>     &lt;error code&gt;" but if I do that over all 1k nodes in my test cluster at<br>


>>     once it results in about 45 minutes of almost total deadlock while each<br>


>>     panic is processed by the fs manager.<br>


>><br>


>>     -Aaron<br>


>><br>


>>     --<br>


>>     Aaron Knister<br>


>>     NASA Center for Climate Simulation (Code 606.2)<br>


>>     Goddard Space Flight Center<br>


>>     <a href="tel:(301)%20286-2776" target="_blank">(301) 286-2776</a><br>


>>     _______________________________________________<br>


>>     gpfsug-discuss mailing list<br>


>>     gpfsug-discuss at <a href="http://spectrumscale.org" target="_blank">spectrumscale.org</a><br>


>>     <a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss" target="_blank">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</a><br>


>><br>


>><br>


>><br>


>><br>


>><br>


>><br>




>><br>


><br>


> --<br>


> Aaron Knister<br>


> NASA Center for Climate Simulation (Code 606.2)<br>


> Goddard Space Flight Center<br>


> <a href="tel:(301)%20286-2776" target="_blank">(301) 286-2776</a><br>




><br>


><br>


><br>


><br>


><br>


><br>




><br>


<br>


--<br>


Aaron Knister<br>


NASA Center for Climate Simulation (Code 606.2)<br>


Goddard Space Flight Center<br>


<a href="tel:(301)%20286-2776" target="_blank">(301) 286-2776</a><br>


<o:p></o:p></p>


</blockquote>


</div>


</blockquote>


</div>


</div>


</body>


</html>