[gpfsug-discuss] Replicated and non replicated data
Simon Thompson (IT Research Support)
S.J.Thompson at bham.ac.uk
Fri Apr 13 21:05:53 BST 2018
I have a question about file systems with replicated and non-replicated data.
We have a file system where metadata is set to copies=2 and data to copies=2; we then use a placement policy to selectively replicate some data only once, based on fileset. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed.
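For context, the placement policy is roughly of the following form (the fileset name 'scratchdata' and the pool 'replicatedpool' in the default rule are placeholders here, not our actual rules):

/* One data copy for files in the non-replicated fileset, placed in 6tnlsas */
RULE 'oneCopy' SET POOL '6tnlsas' REPLICATE (1) FOR FILESET ('scratchdata')
/* Default placement for everything else, which keeps the file-system default of 2 copies */
RULE 'default' SET POOL 'replicatedpool'

The rules file is then installed with mmchpolicy, e.g. mmchpolicy castles placement.rules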
My understanding was that, in doing this, if we took the disks with the non-replicated data offline, we would still have the FS available for users, as the metadata is replicated. Sure, accessing a non-replicated data file would give an I/O error, but the rest of the FS should stay up.
We had a situation today where we wanted to take stg01 offline, so we tried using mmchdisk stop -d …. Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complained that too many disks were unavailable; similarly, if we shut down the NSD servers hosting the disks, the filesystem would SGPanic and force-unmount.
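(For clarity, the stop was done per-disk with mmchdisk, roughly like this, working through the stg01 NSDs in batches; the disk list below is abbreviated:)

[root@nsd01 ~]# mmchdisk castles stop -d "stg01-01_3_3;stg01-01_4_4;stg01-01_5_5"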
First, am I correct in thinking that an FS with non-replicated data but replicated metadata should still be accessible (apart from the non-replicated data itself) when the LUNs hosting that data are down?
If so, any suggestions as to why my FS is panicking when we take down that one set of disks?
I thought at first we had some non-replicated metadata, so I tried an mmrestripefs -R --metadata-only to force it to ensure 2 replicas, but this didn't help.
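(In case it is useful to anyone reproducing this: individual files can be checked with mmlsattr, which prints the current and maximum metadata/data replication factors, and with -L also shows the storage pool the file sits in. The path below is just an example:)

[root@nsd01 ~]# mmlsattr /castles/projects/somefile
[root@nsd01 ~]# mmlsattr -L /castles/projects/somefile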
Running 5.0.0.2 on the NSD server nodes.
(The first time we went around this we didn't have an FS descriptor-only disk, but you can see below that we have since added one.)
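(For reference, the descriptor-only disk was added with an NSD stanza along these lines; the device path and server names below are placeholders:)

# desconly.stanza -- placeholder device and servers
%nsd:
  nsd=CASTLES_GPFS_DESCONLY01
  device=/dev/sdX
  servers=nsd01,nsd02
  usage=descOnly
  failureGroup=310
  pool=system

[root@nsd01 ~]# mmcrnsd -F desconly.stanza
[root@nsd01 ~]# mmadddisk castles -F desconly.stanza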
Thanks
Simon
[root@nsd01 ~]# mmlsdisk castles -L
disk name       driver type  sector size  failure group  holds metadata  holds data  status  availability  disk id  storage pool  remarks
------------------------------------------------------------------------------------------------------------------------------------------
CASTLES_GPFS_DESCONLY01 nsd 512 310 no no ready up 1 system desc
stg01-01_3_3 nsd 4096 210 no yes ready down 4 6tnlsas
stg01-01_4_4 nsd 4096 210 no yes ready down 5 6tnlsas
stg01-01_5_5 nsd 4096 210 no yes ready down 6 6tnlsas
stg01-01_6_6 nsd 4096 210 no yes ready down 7 6tnlsas
stg01-01_7_7 nsd 4096 210 no yes ready down 8 6tnlsas
stg01-01_8_8 nsd 4096 210 no yes ready down 9 6tnlsas
stg01-01_9_9 nsd 4096 210 no yes ready down 10 6tnlsas
stg01-01_10_10 nsd 4096 210 no yes ready down 11 6tnlsas
stg01-01_11_11 nsd 4096 210 no yes ready down 12 6tnlsas
stg01-01_12_12 nsd 4096 210 no yes ready down 13 6tnlsas
stg01-01_13_13 nsd 4096 210 no yes ready down 14 6tnlsas
stg01-01_14_14 nsd 4096 210 no yes ready down 15 6tnlsas
stg01-01_15_15 nsd 4096 210 no yes ready down 16 6tnlsas
stg01-01_16_16 nsd 4096 210 no yes ready down 17 6tnlsas
stg01-01_17_17 nsd 4096 210 no yes ready down 18 6tnlsas
stg01-01_18_18 nsd 4096 210 no yes ready down 19 6tnlsas
stg01-01_19_19 nsd 4096 210 no yes ready down 20 6tnlsas
stg01-01_20_20 nsd 4096 210 no yes ready down 21 6tnlsas
stg01-01_21_21 nsd 4096 210 no yes ready down 22 6tnlsas
stg01-01_ssd_54_54 nsd 4096 210 yes no ready down 23 system
stg01-01_ssd_56_56 nsd 4096 210 yes no ready down 24 system
stg02-01_0_0 nsd 4096 110 no yes ready up 25 6tnlsas
stg02-01_1_1 nsd 4096 110 no yes ready up 26 6tnlsas
stg02-01_2_2 nsd 4096 110 no yes ready up 27 6tnlsas
stg02-01_3_3 nsd 4096 110 no yes ready up 28 6tnlsas
stg02-01_4_4 nsd 4096 110 no yes ready up 29 6tnlsas
stg02-01_5_5 nsd 4096 110 no yes ready up 30 6tnlsas
stg02-01_6_6 nsd 4096 110 no yes ready up 31 6tnlsas
stg02-01_7_7 nsd 4096 110 no yes ready up 32 6tnlsas
stg02-01_8_8 nsd 4096 110 no yes ready up 33 6tnlsas
stg02-01_9_9 nsd 4096 110 no yes ready up 34 6tnlsas
stg02-01_10_10 nsd 4096 110 no yes ready up 35 6tnlsas
stg02-01_11_11 nsd 4096 110 no yes ready up 36 6tnlsas
stg02-01_12_12 nsd 4096 110 no yes ready up 37 6tnlsas
stg02-01_13_13 nsd 4096 110 no yes ready up 38 6tnlsas
stg02-01_14_14 nsd 4096 110 no yes ready up 39 6tnlsas
stg02-01_15_15 nsd 4096 110 no yes ready up 40 6tnlsas
stg02-01_16_16 nsd 4096 110 no yes ready up 41 6tnlsas
stg02-01_17_17 nsd 4096 110 no yes ready up 42 6tnlsas
stg02-01_18_18 nsd 4096 110 no yes ready up 43 6tnlsas
stg02-01_19_19 nsd 4096 110 no yes ready up 44 6tnlsas
stg02-01_20_20 nsd 4096 110 no yes ready up 45 6tnlsas
stg02-01_21_21 nsd 4096 110 no yes ready up 46 6tnlsas
stg02-01_ssd_22_22 nsd 4096 110 yes no ready up 47 system desc
stg02-01_ssd_23_23 nsd 4096 110 yes no ready up 48 system
stg02-01_ssd_24_24 nsd 4096 110 yes no ready up 49 system
stg02-01_ssd_25_25 nsd 4096 110 yes no ready up 50 system
stg01-01_22_22 nsd 4096 210 no yes ready up 51 6tnlsasnonrepl desc
stg01-01_23_23 nsd 4096 210 no yes ready up 52 6tnlsasnonrepl
stg01-01_24_24 nsd 4096 210 no yes ready up 53 6tnlsasnonrepl
stg01-01_25_25 nsd 4096 210 no yes ready up 54 6tnlsasnonrepl
stg01-01_26_26 nsd 4096 210 no yes ready up 55 6tnlsasnonrepl
stg01-01_27_27 nsd 4096 210 no yes ready up 56 6tnlsasnonrepl
stg01-01_31_31 nsd 4096 210 no yes ready up 58 6tnlsasnonrepl
stg01-01_32_32 nsd 4096 210 no yes ready up 59 6tnlsasnonrepl
stg01-01_33_33 nsd 4096 210 no yes ready up 60 6tnlsasnonrepl
stg01-01_34_34 nsd 4096 210 no yes ready up 61 6tnlsasnonrepl
stg01-01_35_35 nsd 4096 210 no yes ready up 62 6tnlsasnonrepl
stg01-01_36_36 nsd 4096 210 no yes ready up 63 6tnlsasnonrepl
stg01-01_37_37 nsd 4096 210 no yes ready up 64 6tnlsasnonrepl
stg01-01_38_38 nsd 4096 210 no yes ready up 65 6tnlsasnonrepl
stg01-01_39_39 nsd 4096 210 no yes ready up 66 6tnlsasnonrepl
stg01-01_40_40 nsd 4096 210 no yes ready up 67 6tnlsasnonrepl
stg01-01_41_41 nsd 4096 210 no yes ready up 68 6tnlsasnonrepl
stg01-01_42_42 nsd 4096 210 no yes ready up 69 6tnlsasnonrepl
stg01-01_43_43 nsd 4096 210 no yes ready up 70 6tnlsasnonrepl
stg01-01_44_44 nsd 4096 210 no yes ready up 71 6tnlsasnonrepl
stg01-01_45_45 nsd 4096 210 no yes ready up 72 6tnlsasnonrepl
stg01-01_46_46 nsd 4096 210 no yes ready up 73 6tnlsasnonrepl
stg01-01_47_47 nsd 4096 210 no yes ready up 74 6tnlsasnonrepl
stg01-01_48_48 nsd 4096 210 no yes ready up 75 6tnlsasnonrepl
stg01-01_49_49 nsd 4096 210 no yes ready up 76 6tnlsasnonrepl
stg01-01_50_50 nsd 4096 210 no yes ready up 77 6tnlsasnonrepl
stg01-01_51_51 nsd 4096 210 no yes ready up 78 6tnlsasnonrepl
Number of quorum disks: 3
Read quorum value: 2
Write quorum value: 2