<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
tt
        {mso-style-priority:99;
        font-family:"Courier New";}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:8.0pt;
        font-family:"Tahoma","sans-serif";}
span.BalloonTextChar
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Tahoma","sans-serif";}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:Consolas;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">Sven, output below:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">--<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">[root@mmmnsd5 ~]# /var/mmfs/etc/nsddevices<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31a_lun0 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31a_lun10 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31a_lun2 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31a_lun4 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31a_lun6 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31a_lun8 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31b_lun1 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31b_lun11 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31b_lun3 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31b_lun5 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31b_lun7 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">mapper/dcs3800u31b_lun9 dmm<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">[root@mmmnsd5 ~]#<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">--<o:p></o:p></span></p>
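<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">A small cross-check of the listing above, as a sketch only: it parses the "device type" pairs that nsddevices prints and reports which expected NSDs are not listed. It assumes the multipath aliases match the NSD names, as they do in the mmlsnsd listing quoted further down. parse_nsddevices and the sample text are illustrative names, not part of GPFS.<o:p></o:p></span></p>
<pre>
```python
def parse_nsddevices(output):
    """Return a dict mapping device path (relative to /dev) to device type."""
    devices = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, shell prompts and anything that is not "name type".
        if len(fields) != 2 or fields[0].startswith("["):
            continue
        devices[fields[0]] = fields[1]
    return devices


# NSD names as they appear in the mmlsnsd listing quoted in this thread.
expected = ["dcs3800u31a_lun%d" % n for n in range(0, 12, 2)]
expected += ["dcs3800u31b_lun%d" % n for n in range(1, 12, 2)]

# Shortened, hypothetical sample; paste the real nsddevices output here.
sample = """mapper/dcs3800u31a_lun0 dmm
mapper/dcs3800u31b_lun1 dmm
[root@mmmnsd5 ~]#"""

found = parse_nsddevices(sample)
missing = [name for name in expected if "mapper/" + name not in found]
print(len(found), "listed,", len(missing), "missing")
```
</pre>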
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">--<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">[root@mmmnsd5 /]# dd if=/dev/dm-0 bs=1k count=32 | strings<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">32+0 records in<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">32+0 records out<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">32768 bytes (33 kB) copied, 0.000739083 s, 44.3 MB/s<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">EFI PART<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">system<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">[root@mmmnsd5 /]#<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">--<o:p></o:p></span></p>
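<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">For reading the dd output above, a rough Python sketch: "EFI PART" is the signature that begins a GPT header, so finding it in the first 32 KiB suggests the LUN currently carries a GPT label. The "NSD descriptor" marker is an assumption about the text a healthy NSD header would show, and classify_header and read_header are hypothetical helper names.<o:p></o:p></span></p>
<pre>
```python
GPT_SIGNATURE = b"EFI PART"      # first 8 bytes of a GPT header
NSD_MARKER = b"NSD descriptor"   # assumed text inside a GPFS NSD descriptor


def classify_header(data):
    """Classify the first bytes of a disk as 'gpt', 'nsd', 'both' or 'unknown'."""
    has_gpt = GPT_SIGNATURE in data
    has_nsd = NSD_MARKER in data
    if has_gpt and has_nsd:
        return "both"
    if has_gpt:
        return "gpt"
    if has_nsd:
        return "nsd"
    return "unknown"


def read_header(path, size=32 * 1024):
    """Read the same 32 KiB that 'dd bs=1k count=32' reads (needs root)."""
    with open(path, "rb") as dev:
        return dev.read(size)


# Synthetic example: a zeroed first sector followed by a GPT header signature.
fake = bytes(512) + GPT_SIGNATURE + bytes(32 * 1024 - 520)
print(classify_header(fake))
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">On a real server this would be called as classify_header(read_header("/dev/dm-0")); the read is non-destructive.<o:p></o:p></span></p>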
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D">Thanks, Jared<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Consolas;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> gpfsug-discuss-bounces@gpfsug.org [mailto:gpfsug-discuss-bounces@gpfsug.org]
<b>On Behalf Of </b>Sven Oehme<br>
<b>Sent:</b> Wednesday, October 29, 2014 1:41 PM<br>
<b>To:</b> gpfsug main discussion list<br>
<b>Subject:</b> Re: [gpfsug-discuss] Server lost NSD mappings<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">can you please post the content of your nsddevices script ?
</span><br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">also please run  </span>
<br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">dd if=/dev/dm-0 bs=1k count=32 |strings</span>
<br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">and post the output</span>
<br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">thx. Sven</span> <br>
<br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">------------------------------------------<br>
Sven Oehme <br>
Scalable Storage Research <br>
email: <a href="mailto:oehmes@us.ibm.com">oehmes@us.ibm.com</a> <br>
Phone: +1 (408) 824-8904 <br>
IBM Almaden Research Lab <br>
------------------------------------------</span> <br>
<br>
<br>
<br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">From:        </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">Jared David Baker <<a href="mailto:Jared.Baker@uwyo.edu">Jared.Baker@uwyo.edu</a>></span>
<br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">To:        </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">gpfsug main discussion list <<a href="mailto:gpfsug-discuss@gpfsug.org">gpfsug-discuss@gpfsug.org</a>></span>
<br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Date:        </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">10/29/2014 12:27 PM</span>
<br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Subject:        </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">Re: [gpfsug-discuss] Server lost NSD mappings</span>
<br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Sent by:        </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif""><a href="mailto:gpfsug-discuss-bounces@gpfsug.org">gpfsug-discuss-bounces@gpfsug.org</a></span>
<o:p></o:p></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" noshade="" style="color:#A0A0A0" align="center">
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
<br>
<br>
<tt><span style="font-size:10.0pt">Thanks Ed,</span></tt><span style="font-size:10.0pt;font-family:"Courier New""><br>
<br>
<tt>I can see the multipath devices inside the OS after reboot. The storage is all SAS attached. Two servers can see the multipath LUNs for failover, and they export the GPFS filesystem to the compute cluster.</tt><br>
<br>
<tt>--</tt><br>
<tt>[root@mmmnsd5 ~]# multipath -l</tt><br>
<tt>dcs3800u31a_lun8 (360080e500029600c000001e953cf8291) dm-4 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:0:8  sdi 8:128  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:1:8  sdu 65:64  active undef running</tt><br>
<tt>dcs3800u31b_lun9 (360080e5000295c68000001c253cf8221) dm-9 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:1:9  sdv 65:80  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:0:9  sdj 8:144  active undef running</tt><br>
<tt>dcs3800u31a_lun6 (360080e500029600c000001e653cf8210) dm-3 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:0:6  sdg 8:96   active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:1:6  sds 65:32  active undef running</tt><br>
<tt>mpathm (3600605b007ca57d01b1b8a7a1a107bdd) dm-12 IBM,ServeRAID M1115</tt><br>
<tt>size=558G features='0' hwhandler='0' wp=rw</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt> `- 1:2:0:0  sdy 65:128 active undef running</tt><br>
<tt>dcs3800u31b_lun7 (360080e5000295c68000001bd53cf81a9) dm-8 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:1:7  sdt 65:48  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:0:7  sdh 8:112  active undef running</tt><br>
<tt>dcs3800u31a_lun10 (360080e500029600c000001ec53cf8301) dm-5 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:0:10 sdk 8:160  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:1:10 sdw 65:96  active undef running</tt><br>
<tt>dcs3800u31a_lun4 (360080e500029600c000001e353cf8189) dm-1 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:0:4  sde 8:64   active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:1:4  sdq 65:0   active undef running</tt><br>
<tt>dcs3800u31b_lun5 (360080e5000295c68000001b853cf8125) dm-10 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:1:5  sdr 65:16  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:0:5  sdf 8:80   active undef running</tt><br>
<tt>dcs3800u31a_lun2 (360080e500029600c000001e053cf80f9) dm-2 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:0:2  sdc 8:32   active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:1:2  sdo 8:224  active undef running</tt><br>
<tt>dcs3800u31b_lun11 (360080e5000295c68000001c753cf828e) dm-11 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:1:11 sdx 65:112 active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:0:11 sdl 8:176  active undef running</tt><br>
<tt>dcs3800u31b_lun3 (360080e5000295c68000001b353cf8097) dm-6 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:1:3  sdp 8:240  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:0:3  sdd 8:48   active undef running</tt><br>
<tt>dcs3800u31a_lun0 (360080e500029600c000001da53cf7ec1) dm-0 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:0:0  sda 8:0    active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:1:0  sdm 8:192  active undef running</tt><br>
<tt>dcs3800u31b_lun1 (360080e5000295c68000001ac53cf7e8d) dm-7 IBM,1813      FAStT</tt><br>
<tt>size=29T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw</tt><br>
<tt>|-+- policy='round-robin 0' prio=0 status=active</tt><br>
<tt>| `- 0:0:1:1  sdn 8:208  active undef running</tt><br>
<tt>`-+- policy='round-robin 0' prio=0 status=enabled</tt><br>
<tt> `- 0:0:0:1  sdb 8:16   active undef running</tt><br>
<tt>[root@mmmnsd5 ~]#</tt><br>
<tt>--</tt><br>
<br>
<tt>--</tt><br>
<tt>[root@mmmnsd5 ~]# cat /proc/partitions</tt><br>
<tt>major minor  #blocks  name</tt><br>
<br>
<tt>  8       48 31251951616 sdd</tt><br>
<tt>  8       32 31251951616 sdc</tt><br>
<tt>  8       80 31251951616 sdf</tt><br>
<tt>  8       16 31251951616 sdb</tt><br>
<tt>  8      128 31251951616 sdi</tt><br>
<tt>  8      112 31251951616 sdh</tt><br>
<tt>  8       96 31251951616 sdg</tt><br>
<tt>  8      192 31251951616 sdm</tt><br>
<tt>  8      240 31251951616 sdp</tt><br>
<tt>  8      208 31251951616 sdn</tt><br>
<tt>  8      144 31251951616 sdj</tt><br>
<tt>  8       64 31251951616 sde</tt><br>
<tt>  8      224 31251951616 sdo</tt><br>
<tt>  8      160 31251951616 sdk</tt><br>
<tt>  8      176 31251951616 sdl</tt><br>
<tt> 65        0 31251951616 sdq</tt><br>
<tt> 65       48 31251951616 sdt</tt><br>
<tt> 65       16 31251951616 sdr</tt><br>
<tt> 65      128  584960000 sdy</tt><br>
<tt> 65       80 31251951616 sdv</tt><br>
<tt> 65       96 31251951616 sdw</tt><br>
<tt> 65       64 31251951616 sdu</tt><br>
<tt> 65      112 31251951616 sdx</tt><br>
<tt> 65       32 31251951616 sds</tt><br>
<tt>  8        0 31251951616 sda</tt><br>
<tt>253        0 31251951616 dm-0</tt><br>
<tt>253        1 31251951616 dm-1</tt><br>
<tt>253        2 31251951616 dm-2</tt><br>
<tt>253        3 31251951616 dm-3</tt><br>
<tt>253        4 31251951616 dm-4</tt><br>
<tt>253        5 31251951616 dm-5</tt><br>
<tt>253        6 31251951616 dm-6</tt><br>
<tt>253        7 31251951616 dm-7</tt><br>
<tt>253        8 31251951616 dm-8</tt><br>
<tt>253        9 31251951616 dm-9</tt><br>
<tt>253       10 31251951616 dm-10</tt><br>
<tt>253       11 31251951616 dm-11</tt><br>
<tt>253       12  584960000 dm-12</tt><br>
<tt>253       13     524288 dm-13</tt><br>
<tt>253       14   16777216 dm-14</tt><br>
<tt>253       15  567657472 dm-15</tt><br>
<tt>[root@mmmnsd5 ~]#</tt><br>
<tt>--</tt><br>
<br>
<tt>The NSDs had no failure group defined on creation.</tt><br>
<br>
<tt>Regards,</tt><br>
<br>
<tt>Jared</tt><br>
<br>
<br>
<br>
<br>
<tt>-----Original Message-----</tt><br>
<tt>From: <a href="mailto:gpfsug-discuss-bounces@gpfsug.org">gpfsug-discuss-bounces@gpfsug.org</a> [</tt></span><a href="mailto:gpfsug-discuss-bounces@gpfsug.org"><tt><span style="font-size:10.0pt">mailto:gpfsug-discuss-bounces@gpfsug.org</span></tt></a><tt><span style="font-size:10.0pt">]
 On Behalf Of Ed Wahl</span></tt><span style="font-size:10.0pt;font-family:"Courier New""><br>
<tt>Sent: Wednesday, October 29, 2014 1:08 PM</tt><br>
<tt>To: gpfsug main discussion list</tt><br>
<tt>Subject: Re: [gpfsug-discuss] Server lost NSD mappings</tt><br>
<br>
<tt>Can you see the block devices from inside the OS after the reboot? I don't see where you mention this. How is the storage attached to the server? As a DCS37|800 can be FC/SAS/IB, which is yours? Do the nodes share the storage? Are all NSDs in the same failure group? This quickly brought to mind a failed SRP_DAEMON lookup to IB storage from a badly updated IB card, but I would hope you'd notice the lack of block devices.</tt><br>
<br>
<br>
<tt>cat /proc/partitions ?</tt><br>
<tt>multipath -l ?</tt><br>
<br>
<br>
<tt>Our GPFS changes device mapper multipath names all the time (dm-127 one day, dm-something else another), so that is no problem. But whacking the volume label is a pain.</tt><br>
<tt>When hardware dies, if you have NSDs sharing the same LUNs, you can just transfer /var/mmfs/gen/mmsdrfs from another node and Bob's your uncle.</tt><br>
<br>
<tt>Ed Wahl</tt><br>
<tt>OSC</tt><br>
<br>
<br>
<tt>________________________________________</tt><br>
<tt>From: <a href="mailto:gpfsug-discuss-bounces@gpfsug.org">gpfsug-discuss-bounces@gpfsug.org</a> [gpfsug-discuss-bounces@gpfsug.org] on behalf of Jared David Baker [Jared.Baker@uwyo.edu]</tt><br>
<tt>Sent: Wednesday, October 29, 2014 11:31 AM</tt><br>
<tt>To: <a href="mailto:gpfsug-discuss@gpfsug.org">gpfsug-discuss@gpfsug.org</a></tt><br>
<tt>Subject: [gpfsug-discuss] Server lost NSD mappings</tt><br>
<br>
<tt>Hello all,</tt><br>
<br>
<tt>I'm hoping that somebody can shed some light on a problem that I experienced yesterday. I've been working with GPFS for a couple months as an admin now, but I've come across a problem that I'm unable to see the answer to. Hopefully the solution is not listed
 somewhere blatantly on the web, but I spent a fair amount of time looking last night. Here is the situation: yesterday, I needed to update some firmware on a Mellanox HCA FDR14 card and reboot one of our GPFS servers and repeat for the sister node (IBM x3550
 and DCS3850) as HPSS for our main campus cluster. However, upon reboot, the server seemed to lose the path mappings to the multipath devices for the NSDs. Output below:</tt><br>
<br>
<tt>--</tt><br>
<tt>[root@mmmnsd5 ~]# mmlsnsd -m -f gscratch</tt><br>
<br>
<tt>Disk name    NSD volume ID      Device         Node name                Remarks</tt><br>
<tt>---------------------------------------------------------------------------------------</tt><br>
<tt>dcs3800u31a_lun0 0A62001B54235577   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun0 0A62001B54235577   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun10 0A62001C542355AA   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun10 0A62001C542355AA   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun2 0A62001C54235581   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun2 0A62001C54235581   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun4 0A62001B5423558B   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun4 0A62001B5423558B   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun6 0A62001C54235595   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun6 0A62001C54235595   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun8 0A62001B5423559F   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun8 0A62001B5423559F   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun1 0A62001B5423557C   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun1 0A62001B5423557C   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun11 0A62001C542355AF   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun11 0A62001C542355AF   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun3 0A62001C54235586   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun3 0A62001C54235586   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun5 0A62001B54235590   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun5 0A62001B54235590   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun7 0A62001C5423559A   -              mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun7 0A62001C5423559A   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun9 0A62001B542355A4   -              mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun9 0A62001B542355A4   -              mminsd6.infini           (not found) server node</tt><br>
<br>
<tt>[root@mmmnsd5 ~]#</tt><br>
<tt>--</tt><br>
<br>
<tt>Also, the system was working fantastically before the reboot, but now I'm unable to mount the GPFS filesystem. The disk names look like they are there and mapped to the NSD volume IDs, but no Device is listed. I've created the /var/mmfs/etc/nsddevices script; it returns 0 and has the following output:</tt><br>
<br>
<tt>--</tt><br>
<tt>[root@mmmnsd5 ~]# /var/mmfs/etc/nsddevices</tt><br>
<tt>mapper/dcs3800u31a_lun0 dmm</tt><br>
<tt>mapper/dcs3800u31a_lun10 dmm</tt><br>
<tt>mapper/dcs3800u31a_lun2 dmm</tt><br>
<tt>mapper/dcs3800u31a_lun4 dmm</tt><br>
<tt>mapper/dcs3800u31a_lun6 dmm</tt><br>
<tt>mapper/dcs3800u31a_lun8 dmm</tt><br>
<tt>mapper/dcs3800u31b_lun1 dmm</tt><br>
<tt>mapper/dcs3800u31b_lun11 dmm</tt><br>
<tt>mapper/dcs3800u31b_lun3 dmm</tt><br>
<tt>mapper/dcs3800u31b_lun5 dmm</tt><br>
<tt>mapper/dcs3800u31b_lun7 dmm</tt><br>
<tt>mapper/dcs3800u31b_lun9 dmm</tt><br>
<tt>[root@mmmnsd5 ~]#</tt><br>
<tt>--</tt><br>
<br>
<tt>That output looks correct to me based on the documentation. So I went digging in the GPFS log file and found this relevant information:</tt><br>
<br>
<tt>--</tt><br>
<tt>Tue Oct 28 23:44:48.405 2014: I/O to NSD disk, dcs3800u31a_lun0, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.481 2014: I/O to NSD disk, dcs3800u31b_lun1, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.555 2014: I/O to NSD disk, dcs3800u31a_lun2, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.629 2014: I/O to NSD disk, dcs3800u31b_lun3, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.703 2014: I/O to NSD disk, dcs3800u31a_lun4, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.775 2014: I/O to NSD disk, dcs3800u31b_lun5, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.844 2014: I/O to NSD disk, dcs3800u31a_lun6, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.919 2014: I/O to NSD disk, dcs3800u31b_lun7, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:48.989 2014: I/O to NSD disk, dcs3800u31a_lun8, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:49.060 2014: I/O to NSD disk, dcs3800u31b_lun9, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:49.128 2014: I/O to NSD disk, dcs3800u31a_lun10, fails. No such NSD locally found.</tt><br>
<tt>Tue Oct 28 23:44:49.199 2014: I/O to NSD disk, dcs3800u31b_lun11, fails. No such NSD locally found.</tt><br>
<tt>--</tt><br>
<br>
<tt>Okay, so the NSDs can't be found locally, so I attempted to rediscover them by running mmnsddiscover:</tt><br>
<br>
<tt>--</tt><br>
<tt>[root@mmmnsd5 ~]# mmnsddiscover</tt><br>
<tt>mmnsddiscover:  Attempting to rediscover the disks.  This may take a while ...</tt><br>
<tt>mmnsddiscover:  Finished.</tt><br>
<tt>[root@mmmnsd5 ~]#</tt><br>
<tt>--</tt><br>
<br>
<tt>I was hoping that would fix it, but after restarting GPFS there was still no success. Verifying with mmlsnsd -X -f gscratch:</tt><br>
<br>
<tt>--</tt><br>
<tt>[root@mmmnsd5 ~]# mmlsnsd -X -f gscratch</tt><br>
<br>
<tt>Disk name    NSD volume ID      Device         Devtype  Node name                Remarks</tt><br>
<tt>---------------------------------------------------------------------------------------------------</tt><br>
<tt>dcs3800u31a_lun0 0A62001B54235577   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun0 0A62001B54235577   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun10 0A62001C542355AA   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun10 0A62001C542355AA   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun2 0A62001C54235581   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun2 0A62001C54235581   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun4 0A62001B5423558B   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun4 0A62001B5423558B   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun6 0A62001C54235595   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun6 0A62001C54235595   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun8 0A62001B5423559F   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31a_lun8 0A62001B5423559F   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun1 0A62001B5423557C   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun1 0A62001B5423557C   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun11 0A62001C542355AF   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun11 0A62001C542355AF   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun3 0A62001C54235586   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun3 0A62001C54235586   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun5 0A62001B54235590   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun5 0A62001B54235590   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun7 0A62001C5423559A   -              -        mminsd6.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun7 0A62001C5423559A   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun9 0A62001B542355A4   -              -        mminsd5.infini           (not found) server node</tt><br>
<tt>dcs3800u31b_lun9 0A62001B542355A4   -              -        mminsd6.infini           (not found) server node</tt><br>
<br>
<tt>[root@mmmnsd5 ~]#</tt><br>
<tt>--</tt><br>
<br>
<tt>Has anybody seen this type of issue before? Will recreating my NSDs destroy the filesystem? I think all the data is intact. There is no crucial data on this file system yet, so I could recreate it, but I would like to learn how to solve a problem like this. Thanks for all help and information.</tt><br>
<br>
<tt>Regards,</tt><br>
<br>
<tt>Jared</tt><br>
<br>
<tt>_______________________________________________</tt><br>
<tt>gpfsug-discuss mailing list</tt><br>
<tt>gpfsug-discuss at gpfsug.org</tt><br>
</span><a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss"><tt><span style="font-size:10.0pt">http://gpfsug.org/mailman/listinfo/gpfsug-discuss</span></tt></a><span style="font-size:10.0pt;font-family:"Courier New""><br>
<br>
</span><o:p></o:p></p>
</div>
</body>
</html>