[gpfsug-discuss] Replicated and non replicated data
Steve Xiao
sxiao at us.ibm.com
Sat Apr 14 02:42:28 BST 2018
What is your unmountOnDiskFail configuration setting on the cluster? You
need to set unmountOnDiskFail to meta if you only have metadata
replication.
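
For example, something along these lines should report the current value and
change it cluster-wide (a sketch only; the -i flag applies the change
immediately, but check mmchconfig for your release):

   mmlsconfig unmountOnDiskFail
   mmchconfig unmountOnDiskFail=meta -i
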
Steve Y. Xiao
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 13 Apr 2018 20:05:53 +0000
> From: "Simon Thompson (IT Research Support)" <S.J.Thompson at bham.ac.uk>
> To: "gpfsug-discuss at spectrumscale.org"
> <gpfsug-discuss at spectrumscale.org>
> Subject: [gpfsug-discuss] Replicated and non replicated data
> Message-ID: <98F781F7-7063-4293-A5BC-1E8F5A0C98EC at bham.ac.uk>
> Content-Type: text/plain; charset="utf-8"
>
> I have a question about file-systems with replicated and non-replicated
data.
>
> We have a file-system where metadata is set to copies=2 and data
> copies=2. We then use a placement policy to selectively replicate
> some data only once, based on fileset. We also place the non-
> replicated data into a specific pool (6tnlsas) to ensure we know
> where it is placed.
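>
> For reference, the placement rule doing this is roughly of the following
> shape (a sketch; the fileset name below is illustrative, not our real one):
>
>    RULE 'oneCopy' SET POOL '6tnlsas' REPLICATE (1) FOR FILESET ('scratch')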
>
> My understanding was that in doing this, if we took the disks with
> the non-replicated data offline, we'd still have the FS available
> for users as the metadata is replicated. Sure, accessing a non-
> replicated data file would give an IO error, but the rest of the FS
> should be up.
>
> We had a situation today where we wanted to take stg01 offline,
> so tried using mmchdisk stop -d …. Once we got to about disk
> stg01-01_12_12, GPFS would refuse to stop any more disks and
> complain about too many disks. Similarly, if we shut down the NSD
> servers hosting the disks, the filesystem would SG panic and
> force unmount.
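>
> The stop commands were of this general form (disk names here are just
> for illustration):
>
>    mmchdisk castles stop -d "stg01-01_3_3;stg01-01_4_4"   # example disks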
>
> First, am I correct in thinking that a FS with non-replicated data,
> but replicated metadata, should still be accessible (apart from the
> non-replicated data) when the LUNs hosting it are down?
>
> If so, any suggestions why my FS is panicking when we take down the
> one set of disks?
>
> I thought at first we had some non-replicated metadata, so tried a
> mmrestripefs -R --metadata-only to force it to ensure 2 replicas, but
> this didn't help.
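>
> i.e. roughly (device name as in the mmlsdisk output below):
>
>    mmrestripefs castles -R --metadata-only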
>
> Running 5.0.0.2 on the NSD server nodes.
>
> (The first time we went round this we didn't have a FS descriptor disk,
> but you can see below that we added one.)
>
> Thanks
>
> Simon
>
> [root at nsd01 ~]# mmlsdisk castles -L
> disk               driver   sector     failure holds    holds                                    storage
> name               type       size       group metadata data  status        availability disk id pool           remarks
> ------------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- -------------- ---------
> CASTLES_GPFS_DESCONLY01 nsd       512         310 no       no    ready         up                 1 system         desc
> stg01-01_3_3       nsd      4096         210 no       yes   ready         down               4 6tnlsas
> stg01-01_4_4       nsd      4096         210 no       yes   ready         down               5 6tnlsas
> stg01-01_5_5       nsd      4096         210 no       yes   ready         down               6 6tnlsas
> stg01-01_6_6       nsd      4096         210 no       yes   ready         down               7 6tnlsas
> stg01-01_7_7       nsd      4096         210 no       yes   ready         down               8 6tnlsas
> stg01-01_8_8       nsd      4096         210 no       yes   ready         down               9 6tnlsas
> stg01-01_9_9       nsd      4096         210 no       yes   ready         down              10 6tnlsas
> stg01-01_10_10     nsd      4096         210 no       yes   ready         down              11 6tnlsas
> stg01-01_11_11     nsd      4096         210 no       yes   ready         down              12 6tnlsas
> stg01-01_12_12     nsd      4096         210 no       yes   ready         down              13 6tnlsas
> stg01-01_13_13     nsd      4096         210 no       yes   ready         down              14 6tnlsas
> stg01-01_14_14     nsd      4096         210 no       yes   ready         down              15 6tnlsas
> stg01-01_15_15     nsd      4096         210 no       yes   ready         down              16 6tnlsas
> stg01-01_16_16     nsd      4096         210 no       yes   ready         down              17 6tnlsas
> stg01-01_17_17     nsd      4096         210 no       yes   ready         down              18 6tnlsas
> stg01-01_18_18     nsd      4096         210 no       yes   ready         down              19 6tnlsas
> stg01-01_19_19     nsd      4096         210 no       yes   ready         down              20 6tnlsas
> stg01-01_20_20     nsd      4096         210 no       yes   ready         down              21 6tnlsas
> stg01-01_21_21     nsd      4096         210 no       yes   ready         down              22 6tnlsas
> stg01-01_ssd_54_54 nsd      4096         210 yes      no    ready         down              23 system
> stg01-01_ssd_56_56 nsd      4096         210 yes      no    ready         down              24 system
> stg02-01_0_0       nsd      4096         110 no       yes   ready         up                25 6tnlsas
> stg02-01_1_1       nsd      4096         110 no       yes   ready         up                26 6tnlsas
> stg02-01_2_2       nsd      4096         110 no       yes   ready         up                27 6tnlsas
> stg02-01_3_3       nsd      4096         110 no       yes   ready         up                28 6tnlsas
> stg02-01_4_4       nsd      4096         110 no       yes   ready         up                29 6tnlsas
> stg02-01_5_5       nsd      4096         110 no       yes   ready         up                30 6tnlsas
> stg02-01_6_6       nsd      4096         110 no       yes   ready         up                31 6tnlsas
> stg02-01_7_7       nsd      4096         110 no       yes   ready         up                32 6tnlsas
> stg02-01_8_8       nsd      4096         110 no       yes   ready         up                33 6tnlsas
> stg02-01_9_9       nsd      4096         110 no       yes   ready         up                34 6tnlsas
> stg02-01_10_10     nsd      4096         110 no       yes   ready         up                35 6tnlsas
> stg02-01_11_11     nsd      4096         110 no       yes   ready         up                36 6tnlsas
> stg02-01_12_12     nsd      4096         110 no       yes   ready         up                37 6tnlsas
> stg02-01_13_13     nsd      4096         110 no       yes   ready         up                38 6tnlsas
> stg02-01_14_14     nsd      4096         110 no       yes   ready         up                39 6tnlsas
> stg02-01_15_15     nsd      4096         110 no       yes   ready         up                40 6tnlsas
> stg02-01_16_16     nsd      4096         110 no       yes   ready         up                41 6tnlsas
> stg02-01_17_17     nsd      4096         110 no       yes   ready         up                42 6tnlsas
> stg02-01_18_18     nsd      4096         110 no       yes   ready         up                43 6tnlsas
> stg02-01_19_19     nsd      4096         110 no       yes   ready         up                44 6tnlsas
> stg02-01_20_20     nsd      4096         110 no       yes   ready         up                45 6tnlsas
> stg02-01_21_21     nsd      4096         110 no       yes   ready         up                46 6tnlsas
> stg02-01_ssd_22_22 nsd      4096         110 yes      no    ready         up                47 system         desc
> stg02-01_ssd_23_23 nsd      4096         110 yes      no    ready         up                48 system
> stg02-01_ssd_24_24 nsd      4096         110 yes      no    ready         up                49 system
> stg02-01_ssd_25_25 nsd      4096         110 yes      no    ready         up                50 system
> stg01-01_22_22     nsd      4096         210 no       yes   ready         up                51 6tnlsasnonrepl desc
> stg01-01_23_23     nsd      4096         210 no       yes   ready         up                52 6tnlsasnonrepl
> stg01-01_24_24     nsd      4096         210 no       yes   ready         up                53 6tnlsasnonrepl
> stg01-01_25_25     nsd      4096         210 no       yes   ready         up                54 6tnlsasnonrepl
> stg01-01_26_26     nsd      4096         210 no       yes   ready         up                55 6tnlsasnonrepl
> stg01-01_27_27     nsd      4096         210 no       yes   ready         up                56 6tnlsasnonrepl
> stg01-01_31_31     nsd      4096         210 no       yes   ready         up                58 6tnlsasnonrepl
> stg01-01_32_32     nsd      4096         210 no       yes   ready         up                59 6tnlsasnonrepl
> stg01-01_33_33     nsd      4096         210 no       yes   ready         up                60 6tnlsasnonrepl
> stg01-01_34_34     nsd      4096         210 no       yes   ready         up                61 6tnlsasnonrepl
> stg01-01_35_35     nsd      4096         210 no       yes   ready         up                62 6tnlsasnonrepl
> stg01-01_36_36     nsd      4096         210 no       yes   ready         up                63 6tnlsasnonrepl
> stg01-01_37_37     nsd      4096         210 no       yes   ready         up                64 6tnlsasnonrepl
> stg01-01_38_38     nsd      4096         210 no       yes   ready         up                65 6tnlsasnonrepl
> stg01-01_39_39     nsd      4096         210 no       yes   ready         up                66 6tnlsasnonrepl
> stg01-01_40_40     nsd      4096         210 no       yes   ready         up                67 6tnlsasnonrepl
> stg01-01_41_41     nsd      4096         210 no       yes   ready         up                68 6tnlsasnonrepl
> stg01-01_42_42     nsd      4096         210 no       yes   ready         up                69 6tnlsasnonrepl
> stg01-01_43_43     nsd      4096         210 no       yes   ready         up                70 6tnlsasnonrepl
> stg01-01_44_44     nsd      4096         210 no       yes   ready         up                71 6tnlsasnonrepl
> stg01-01_45_45     nsd      4096         210 no       yes   ready         up                72 6tnlsasnonrepl
> stg01-01_46_46     nsd      4096         210 no       yes   ready         up                73 6tnlsasnonrepl
> stg01-01_47_47     nsd      4096         210 no       yes   ready         up                74 6tnlsasnonrepl
> stg01-01_48_48     nsd      4096         210 no       yes   ready         up                75 6tnlsasnonrepl
> stg01-01_49_49     nsd      4096         210 no       yes   ready         up                76 6tnlsasnonrepl
> stg01-01_50_50     nsd      4096         210 no       yes   ready         up                77 6tnlsasnonrepl
> stg01-01_51_51     nsd      4096         210 no       yes   ready         up                78 6tnlsasnonrepl
> Number of quorum disks: 3
> Read quorum value: 2
> Write quorum value: 2
>
> ------------------------------
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> End of gpfsug-discuss Digest, Vol 75, Issue 23
> **********************************************
>