[gpfsug-discuss] data integrity documentation

Stijn De Weirdt stijn.deweirdt at ugent.be
Wed Aug 2 22:53:50 BST 2017


hi steve,

> The nsdChksum settings for non-GNR/ESS based systems are not officially
> supported. They perform checksums on data transfers over the network
> only and can be used to help debug data corruption when the network is a
> suspect.
i'll take not officially supported over silent bitrot any day.

> 
> Did any of those "Encountered XYZ checksum errors on network I/O to NSD
> Client disk" warning messages result in a disk being changed to "down"
> state due to an IO error?
no.

> If no disk IO error was reported in GPFS log,
> that means data was retransmitted successfully on retry. 
we suspected as much. as sven already asked, mmfsck now reports a clean
filesystem.
i have an ibdump from 2 of the involved nsds, taken while the checksum
errors were reported; i'll have a closer look to see if i can spot these
retries.
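
to tally those warnings from a log, a minimal sketch in python could look like this. it assumes the "XYZ" in the quoted message is a decimal count and ignores everything else on the line; the function name and the sample lines are made up for illustration and are not taken from a real GPFS log.

```python
import re

# Assumption: the warning text matches the message quoted above, with a
# decimal number where "XYZ" stands. The real log line layout may differ.
PATTERN = re.compile(r"Encountered (\d+) checksum errors on network I/O")


def count_checksum_errors(lines):
    """Return (number of warning lines, sum of the reported error counts)."""
    warnings = 0
    total = 0
    for line in lines:
        match = PATTERN.search(line)
        if match:
            warnings += 1
            total += int(match.group(1))
    return warnings, total


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the call.
    sample = [
        "Encountered 3 checksum errors on network I/O to NSD Client disk",
        "some unrelated log line",
        "Encountered 12 checksum errors on network I/O to NSD Client disk",
    ]
    print(count_checksum_errors(sample))
```

feeding it the lines of each node's log separately would give a rough per-node picture of where the warnings cluster.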

> 
> As sven said, only GNR/ESS provides the full end-to-end data integrity.
so with the silent network error, there is a high probability that the
data is corrupted.

we are now looking for a test to find out what adapters are affected. we
hoped that nsdperf with verify=on would tell us, but it doesn't.
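
the kind of test we have in mind can be sketched as follows: push known data through a link and compare a checksum computed before sending with one computed after receiving. this is only an illustration of the idea in python, not how nsdperf works; `transport` is a hypothetical callable standing in for whatever actually moves the bytes, and a real test would have to exercise the suspect adapter's own data path.

```python
import os
import zlib


def verify_transfer(transport, blocks=100, block_size=4096):
    """Send random blocks through `transport` and count CRC32 mismatches."""
    mismatches = 0
    for _ in range(blocks):
        payload = os.urandom(block_size)
        expected = zlib.crc32(payload)   # checksum on the sending side
        received = transport(payload)    # hypothetical link under test
        if zlib.crc32(received) != expected:
            mismatches += 1
    return mismatches


def flip_first_bit(data):
    """Hypothetical faulty link: flips one bit in every block it carries."""
    return bytes([data[0] ^ 0x01]) + data[1:]


if __name__ == "__main__":
    print(verify_transfer(lambda d: d, blocks=10))   # clean link
    print(verify_transfer(flip_first_bit, blocks=10))  # corrupting link
```

running the same check pairwise between hosts would, in principle, point at the adapters that show mismatches.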

> 
> Steve Y. Xiao


