<font size=2 face="sans-serif">What is your unmountOnDiskFail configuration
setting on the cluster? You need to set unmountOnDiskFail
to meta if you only have metadata replication.</font><br><font size=2 face="sans-serif"><br>Steve Y. Xiao</font><tt><font size=2><br><br>> ----------------------------------------------------------------------<br>> <br>> Message: 1<br>> Date: Fri, 13 Apr 2018 20:05:53 +0000<br>> From: "Simon Thompson (IT Research Support)" <S.J.Thompson@bham.ac.uk><br>> To: "gpfsug-discuss@spectrumscale.org"<br>> <gpfsug-discuss@spectrumscale.org><br>> Subject: [gpfsug-discuss] Replicated and non replicated data<br>> Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC@bham.ac.uk><br>> Content-Type: text/plain; charset="utf-8"<br>> <br>> I have a question about file-systems with replicated and non-replicated
data.<br>> <br>> We have a file-system where metadata is set to copies=2 and data <br>> copies=2; we then use a placement policy to selectively replicate
<br>> some data only once based on file-set. We also place the non-<br>> replicated data into a specific pool (6tnlsas) to ensure we know <br>> where it is placed.<br>> <br>> My understanding was that in doing this, if we took the disks with
<br>> the non-replicated data offline, we'd still have the FS available
<br>> for users, as the metadata is replicated. Sure, accessing a non-<br>> replicated data file would give an IO error, but the rest of the FS
<br>> should be up.<br>> <br>> We had a situation today where we wanted to take stg01 offline, <br>> so we tried using mmchdisk stop -d …. Once we got to about disk
<br>> stg01-01_12_12, GPFS would refuse to stop any more disks and <br>> complain about too many disks; similarly, if we shut down the NSD <br>> servers hosting the disks, the filesystem would have an SGPanic and
<br>> force unmount.<br>> <br>> First, am I correct in thinking that a FS with non-replicated data <br>> but replicated metadata should still be accessible (apart from the non-<br>> replicated data) when the LUNs hosting it are down?<br>> <br>> If so, any suggestions why my FS is panicking when we take down <br>> the one set of disks?<br>> <br>> I thought at first we had some non-replicated metadata, so I tried <br>> mmrestripefs -R --metadata-only to force it to ensure 2 replicas, but<br>> this didn't help.<br>> <br>> Running 5.0.0.2 on the NSD server nodes.<br>> <br>> (First time we went round this we didn't have a FS descriptor disk,
<br>> but you can see below that we added this.)<br>> <br>> Thanks<br>> <br>> Simon<br>> <br>> [root@nsd01 ~]# mmlsdisk castles -L<br>
> disk                     driver sector failure holds    holds<br>
> name                     type   size   group   metadata data  status availability disk id pool           remarks<br>
> ------------------------ ------ ------ ------- -------- ----- ------ ------------ ------- -------------- -------<br>
> CASTLES_GPFS_DESCONLY01  nsd    512    310     no       no    ready  up           1       system         desc<br>
> stg01-01_3_3             nsd    4096   210     no       yes   ready  down         4       6tnlsas<br>
> stg01-01_4_4             nsd    4096   210     no       yes   ready  down         5       6tnlsas<br>
> stg01-01_5_5             nsd    4096   210     no       yes   ready  down         6       6tnlsas<br>
> stg01-01_6_6             nsd    4096   210     no       yes   ready  down         7       6tnlsas<br>
> stg01-01_7_7             nsd    4096   210     no       yes   ready  down         8       6tnlsas<br>
> stg01-01_8_8             nsd    4096   210     no       yes   ready  down         9       6tnlsas<br>
> stg01-01_9_9             nsd    4096   210     no       yes   ready  down         10      6tnlsas<br>
> stg01-01_10_10           nsd    4096   210     no       yes   ready  down         11      6tnlsas<br>
> stg01-01_11_11           nsd    4096   210     no       yes   ready  down         12      6tnlsas<br>
> stg01-01_12_12           nsd    4096   210     no       yes   ready  down         13      6tnlsas<br>
> stg01-01_13_13           nsd    4096   210     no       yes   ready  down         14      6tnlsas<br>
> stg01-01_14_14           nsd    4096   210     no       yes   ready  down         15      6tnlsas<br>
> stg01-01_15_15           nsd    4096   210     no       yes   ready  down         16      6tnlsas<br>
> stg01-01_16_16           nsd    4096   210     no       yes   ready  down         17      6tnlsas<br>
> stg01-01_17_17           nsd    4096   210     no       yes   ready  down         18      6tnlsas<br>
> stg01-01_18_18           nsd    4096   210     no       yes   ready  down         19      6tnlsas<br>
> stg01-01_19_19           nsd    4096   210     no       yes   ready  down         20      6tnlsas<br>
> stg01-01_20_20           nsd    4096   210     no       yes   ready  down         21      6tnlsas<br>
> stg01-01_21_21           nsd    4096   210     no       yes   ready  down         22      6tnlsas<br>
> stg01-01_ssd_54_54       nsd    4096   210     yes      no    ready  down         23      system<br>
> stg01-01_ssd_56_56       nsd    4096   210     yes      no    ready  down         24      system<br>
> stg02-01_0_0             nsd    4096   110     no       yes   ready  up           25      6tnlsas<br>
> stg02-01_1_1             nsd    4096   110     no       yes   ready  up           26      6tnlsas<br>
> stg02-01_2_2             nsd    4096   110     no       yes   ready  up           27      6tnlsas<br>
> stg02-01_3_3             nsd    4096   110     no       yes   ready  up           28      6tnlsas<br>
> stg02-01_4_4             nsd    4096   110     no       yes   ready  up           29      6tnlsas<br>
> stg02-01_5_5             nsd    4096   110     no       yes   ready  up           30      6tnlsas<br>
> stg02-01_6_6             nsd    4096   110     no       yes   ready  up           31      6tnlsas<br>
> stg02-01_7_7             nsd    4096   110     no       yes   ready  up           32      6tnlsas<br>
> stg02-01_8_8             nsd    4096   110     no       yes   ready  up           33      6tnlsas<br>
> stg02-01_9_9             nsd    4096   110     no       yes   ready  up           34      6tnlsas<br>
> stg02-01_10_10           nsd    4096   110     no       yes   ready  up           35      6tnlsas<br>
> stg02-01_11_11           nsd    4096   110     no       yes   ready  up           36      6tnlsas<br>
> stg02-01_12_12           nsd    4096   110     no       yes   ready  up           37      6tnlsas<br>
> stg02-01_13_13           nsd    4096   110     no       yes   ready  up           38      6tnlsas<br>
> stg02-01_14_14           nsd    4096   110     no       yes   ready  up           39      6tnlsas<br>
> stg02-01_15_15           nsd    4096   110     no       yes   ready  up           40      6tnlsas<br>
> stg02-01_16_16           nsd    4096   110     no       yes   ready  up           41      6tnlsas<br>
> stg02-01_17_17           nsd    4096   110     no       yes   ready  up           42      6tnlsas<br>
> stg02-01_18_18           nsd    4096   110     no       yes   ready  up           43      6tnlsas<br>
> stg02-01_19_19           nsd    4096   110     no       yes   ready  up           44      6tnlsas<br>
> stg02-01_20_20           nsd    4096   110     no       yes   ready  up           45      6tnlsas<br>
> stg02-01_21_21           nsd    4096   110     no       yes   ready  up           46      6tnlsas<br>
> stg02-01_ssd_22_22       nsd    4096   110     yes      no    ready  up           47      system         desc<br>
> stg02-01_ssd_23_23       nsd    4096   110     yes      no    ready  up           48      system<br>
> stg02-01_ssd_24_24       nsd    4096   110     yes      no    ready  up           49      system<br>
> stg02-01_ssd_25_25       nsd    4096   110     yes      no    ready  up           50      system<br>
> stg01-01_22_22           nsd    4096   210     no       yes   ready  up           51      6tnlsasnonrepl desc<br>
> stg01-01_23_23           nsd    4096   210     no       yes   ready  up           52      6tnlsasnonrepl<br>
> stg01-01_24_24           nsd    4096   210     no       yes   ready  up           53      6tnlsasnonrepl<br>
> stg01-01_25_25           nsd    4096   210     no       yes   ready  up           54      6tnlsasnonrepl<br>
> stg01-01_26_26           nsd    4096   210     no       yes   ready  up           55      6tnlsasnonrepl<br>
> stg01-01_27_27           nsd    4096   210     no       yes   ready  up           56      6tnlsasnonrepl<br>
> stg01-01_31_31           nsd    4096   210     no       yes   ready  up           58      6tnlsasnonrepl<br>
> stg01-01_32_32           nsd    4096   210     no       yes   ready  up           59      6tnlsasnonrepl<br>
> stg01-01_33_33           nsd    4096   210     no       yes   ready  up           60      6tnlsasnonrepl<br>
> stg01-01_34_34           nsd    4096   210     no       yes   ready  up           61      6tnlsasnonrepl<br>
> stg01-01_35_35           nsd    4096   210     no       yes   ready  up           62      6tnlsasnonrepl<br>
> stg01-01_36_36           nsd    4096   210     no       yes   ready  up           63      6tnlsasnonrepl<br>
> stg01-01_37_37           nsd    4096   210     no       yes   ready  up           64      6tnlsasnonrepl<br>
> stg01-01_38_38           nsd    4096   210     no       yes   ready  up           65      6tnlsasnonrepl<br>
> stg01-01_39_39           nsd    4096   210     no       yes   ready  up           66      6tnlsasnonrepl<br>
> stg01-01_40_40           nsd    4096   210     no       yes   ready  up           67      6tnlsasnonrepl<br>
> stg01-01_41_41           nsd    4096   210     no       yes   ready  up           68      6tnlsasnonrepl<br>
> stg01-01_42_42           nsd    4096   210     no       yes   ready  up           69      6tnlsasnonrepl<br>
> stg01-01_43_43           nsd    4096   210     no       yes   ready  up           70      6tnlsasnonrepl<br>
> stg01-01_44_44           nsd    4096   210     no       yes   ready  up           71      6tnlsasnonrepl<br>
> stg01-01_45_45           nsd    4096   210     no       yes   ready  up           72      6tnlsasnonrepl<br>
> stg01-01_46_46           nsd    4096   210     no       yes   ready  up           73      6tnlsasnonrepl<br>
> stg01-01_47_47           nsd    4096   210     no       yes   ready  up           74      6tnlsasnonrepl<br>
> stg01-01_48_48           nsd    4096   210     no       yes   ready  up           75      6tnlsasnonrepl<br>
> stg01-01_49_49           nsd    4096   210     no       yes   ready  up           76      6tnlsasnonrepl<br>
> stg01-01_50_50           nsd    4096   210     no       yes   ready  up           77      6tnlsasnonrepl<br>
> stg01-01_51_51           nsd    4096   210     no       yes   ready  up           78      6tnlsasnonrepl<br>
> Number of quorum disks: 3<br>> Read quorum value: 2<br>> Write quorum value: 2<br>> <br>
> _______________________________________________<br>> gpfsug-discuss mailing list<br>> gpfsug-discuss at spectrumscale.org<br>> <br>> End of gpfsug-discuss Digest, Vol 75, Issue 23<br>> **********************************************<br></font></tt><BR>
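For the selective replication Simon describes, a placement policy of roughly this shape would do it. The pool names below come from the mmlsdisk listing; the fileset name 'scratch' is purely illustrative, since the actual fileset is not named in the thread:

```sql
/* Hypothetical placement rules: files created in fileset 'scratch' get a
   single data copy in the non-replicated pool; everything else keeps the
   file-system default replication (copies=2) in the replicated pool. */
RULE 'nonrepl' SET POOL '6tnlsasnonrepl' REPLICATE (1)
    FOR FILESET ('scratch')
RULE 'default' SET POOL '6tnlsas'
```

Keeping the single-copy data in its own pool, as Simon does, means you always know which disks hold unprotected data when planning maintenance.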
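To check and apply the setting Steve describes, something like the following should work; this is a sketch, so verify the valid values and whether you want to scope the change with -N for your Scale release:

```shell
# Show the current value of unmountOnDiskFail (blank if the default applies)
mmlsconfig unmountOnDiskFail

# With replicated metadata but some non-replicated data, "meta" unmounts the
# file system only on metadata disk failure, so data-only disks can go down
# while the FS stays mounted. Valid values are yes, no, and meta.
mmchconfig unmountOnDiskFail=meta
```

With the default (no) on NSD servers, losing a whole set of data disks can trip the panic behaviour Simon saw, which is why the digest reply points at this setting first.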