[gpfsug-discuss] NSD network checksums (nsdCksumTraditional)
Kumaran Rajaram
kums at us.ibm.com
Mon Oct 29 21:29:33 GMT 2018
In a non-GNR setup, nsdCksumTraditional=yes enables data-integrity checking
between a traditional NSD client node and its NSD server, at the network
level only.
ESS storage supports end-to-end checksums: from the NSD client to the ESS IO
servers (at the network level) as well as from the ESS IO servers to the
disk/storage. This is detailed further in the docs (link below):
https://www.ibm.com/support/knowledgecenter/en/SSYSP8_5.3.1/com.ibm.spectrum.scale.raid.v5r01.adm.doc/bl1adv_introe2echecksum.htm
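To illustrate what a network-level check between an NSD client and its NSD server amounts to, here is a minimal Python sketch. It is not the GPFS implementation: the function names are hypothetical, and CRC32 (via zlib) is an assumed stand-in, since this thread does not say which checksum algorithm is used.

```python
import zlib
from typing import Tuple


def client_send(payload: bytes) -> Tuple[bytes, int]:
    """Hypothetical client side: compute a checksum before the data goes on the wire."""
    return payload, zlib.crc32(payload)


def server_receive(payload: bytes, checksum: int) -> bytes:
    """Hypothetical server side: recompute the checksum and reject a mismatching block."""
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch: block corrupted in transit")
    return payload


# An intact transfer passes verification.
block = b"x" * 4096
data, cksum = client_send(block)
assert server_receive(data, cksum) == block

# Flip a single bit "on the wire"; the server-side check catches it.
corrupted = bytearray(data)
corrupted[100] ^= 0x01
try:
    server_receive(bytes(corrupted), cksum)
    detected = False
except ValueError:
    detected = True
```

A check like this covers only the hop between client and server. It says nothing about what happens to the block between the server and the disk, which is the part the ESS documentation linked above describes separately.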
Best,
-Kums
From: Stephen Ulmer <ulmer at ulmer.org>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 10/29/2018 04:52 PM
Subject: Re: [gpfsug-discuss] NSD network checksums (nsdCksumTraditional)
Sent by: gpfsug-discuss-bounces at spectrumscale.org
So the ESS checksums that are highly touted as "protecting all the way to
the disk surface" completely ignore the transfer between the client and
the NSD server? It sounds like you are saying that all of the checksumming
done for GNR is internal to GNR and only protects against bit-flips on the
disk (and in staging buffers, etc.).
I’m asking because your explanation completely ignores calculating
anything on the NSD client and implies that the client could not
participate, given that it does not know about the structure of the vdisks
under the NSD. But that has to be a performance factor for both types if
the transfer is protected starting at the client, which it is in the case
of nsdCksumTraditional, which is what we are comparing to ESS checksumming.
If ESS checksumming doesn’t protect on the wire I’d say that marketing has
run amok, because that has *definitely* been implied in meetings for which
I’ve been present. In fact, when asked if Spectrum Scale provides
checksumming for data in-flight, IBM sales has used it as an ESS up-sell
opportunity.
--
Stephen
On Oct 29, 2018, at 3:56 PM, Kumaran Rajaram <kums at us.ibm.com> wrote:
Hi,
>>How can it be that the I/O performance degradation warning only seems to
accompany the nsdCksumTraditional setting and not GNR?
>>Why is there such a penalty for "traditional" environments?
In GNR IO/NSD servers (ESS IO nodes), the checksums for an NSD (storage
volume/vdisk) are computed in parallel across the threads handling
each pdisk/drive that constitutes the vdisk/volume. This is possible
because the GNR software on the ESS IO servers is tightly integrated with
the underlying storage and is aware of the vdisk DRAID configuration
(strip size, the pdisks constituting the vdisk, etc.), so it can perform
the checksum operations in parallel.
In the non-GNR + external storage model, the GPFS software on the NSD
server(s) does not manage the underlying storage volume (this is done by
the storage RAID controllers) and the checksum is computed serially. This
contributes to increased CPU usage and I/O performance degradation
(depending on I/O access patterns, I/O load, etc.).
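The difference between the two models can be sketched in Python as follows. This only illustrates how the work is structured, not GNR or NSD server internals: the strip size, worker count, function names, and the use of CRC32 are all assumptions, and the sketch makes no claim about actual performance.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def serial_checksum(block: bytes) -> int:
    """Non-GNR model (assumed): one pass over the whole block, with no layout knowledge."""
    return zlib.crc32(block)


def split_into_strips(block: bytes, strip_size: int) -> List[bytes]:
    """Cut a block into fixed-size strips; possible only if the layout is known."""
    return [block[i:i + strip_size] for i in range(0, len(block), strip_size)]


def parallel_strip_checksums(block: bytes, strip_size: int, workers: int = 4) -> List[int]:
    """GNR-like model (assumed): one checksum per strip, computed by a pool of threads."""
    strips = split_into_strips(block, strip_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so result i belongs to strip i.
        return list(pool.map(zlib.crc32, strips))


block = bytes(range(256)) * 64            # 16 KiB of sample data
whole = serial_checksum(block)            # a single value for the whole block
per_strip = parallel_strip_checksums(block, strip_size=4096)
assert len(per_strip) == 4                # 16 KiB / 4 KiB strips
```

The point of the sketch is that splitting the work requires knowing where the strip boundaries are. That is the layout knowledge described above as available to GNR but not to an NSD server sitting in front of an external RAID controller.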
My two cents.
Regards,
-Kums
From: Aaron Knister <aaron.s.knister at nasa.gov>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 10/29/2018 12:34 PM
Subject: [gpfsug-discuss] NSD network checksums (nsdCksumTraditional)
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Flipping through the slides from the recent SSUG meeting I noticed that
in 5.0.2 one of the features mentioned was the nsdCksumTraditional flag.
Reading up on it, it seems as though it comes with a warning about
significant I/O performance degradation and increase in CPU usage. I
also recall that data integrity checking is performed by default with
GNR. How can it be that the I/O performance degradation warning only
seems to accompany the nsdCksumTraditional setting and not GNR? As
someone who knows exactly 0 of the implementation details, I'm just
naively assuming that the checksums are being generated (in the same
way?) in both cases and transferred to the NSD server. Why is there such
a penalty for "traditional" environments?
-Aaron
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss