[gpfsug-discuss] GPFS and replication.. not a mirror?
Uwe Falke
UWEFALKE at de.ibm.com
Fri Apr 29 10:22:10 BST 2016
Zach,
GPFS replication does not automatically include a comparison of the
replica copies.
It protects against one part of the storage (i.e. one failure group (FG),
or two with 3-fold replication) being down.
How should GPFS know what version is the good one if both replica copies
are readable?
There are tools in 4.x to compare the replicas, but use them only from
4.2 onward (there were problems with prior versions). Even then you need
to decide which is the "good" copy (there is a consistency check on
metadata (MD) replicas, but correct/incorrect data blocks cannot be
auto-detected for obvious reasons). End-to-end (E2E) checksumming (as in
GNR) would of course help here.
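The difference between plain replication and checksummed replication can be illustrated with a toy model. This is only a sketch, not GPFS code: the function names, the use of SHA-256, and the convention that `None` stands for a replica whose disk is down are all assumptions made for the illustration.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Hypothetical per-block checksum, stored separately from the data."""
    return hashlib.sha256(block).hexdigest()


def read_without_checksum(replicas):
    """Plain replication: return the first readable replica, no comparison.

    None models a replica whose disk (failure group) is down.
    """
    for block in replicas:
        if block is not None:
            return block
    raise IOError("no replica is readable")


def replicas_agree(replicas):
    """Comparing replicas detects a mismatch but cannot say which copy is good."""
    readable = [block for block in replicas if block is not None]
    return len(set(readable)) <= 1


def read_with_checksum(replicas, expected):
    """With a stored checksum, the reader can skip a corrupt but readable copy."""
    for block in replicas:
        if block is not None and checksum(block) == expected:
            return block
    raise IOError("no replica matches the stored checksum")


good = b"payload written by the application"
bad = b"payload wr\x00tten by the application"  # silently corrupted copy
expected = checksum(good)

disk_down = [None, good]            # an outage: replication helps
silently_corrupted = [bad, good]    # both copies readable: it does not
```

In this model `read_without_checksum(disk_down)` returns the good copy, while `read_without_checksum(silently_corrupted)` returns the corrupt one. `replicas_agree(silently_corrupted)` reports the mismatch, and only `read_with_checksum` can pick the good copy, because it has an independent reference to check against.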
Kind regards
Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services /
Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
From: Zachary Giles <zgiles at gmail.com>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 04/29/2016 06:22 AM
Subject: [gpfsug-discuss] GPFS and replication.. not a mirror?
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Fellow GPFS Users,
I have a silly question about file replicas... I've been playing around
with copies=2 (or 3) and hoping that this would protect against data
corruption on poor-quality RAID controllers.. but it seems that if I
purposefully corrupt blocks on a LUN used by GPFS, the "replica" doesn't
take over; rather, GPFS just returns corrupt data. This happens if I just
"dd" onto the disk, or if I break the RAID controller somehow by yanking a
whole chassis so that the controller responds poorly for a few seconds.
Originally my thinking was that replicas were for mirroring and GPFS would
somehow return whichever is the "good" copy of your data, but now I'm
thinking it's just intended for better file placement.. such as having a
near replica and a far replica so you don't have to cross buildings for
access, etc. That, and / or, disk outages where the outage is not
corruption, just simply outage either by failure or for disk-moves, SAN
rewiring, etc. In those cases you wouldn't have to "move" all the data
since you already have a second copy. I can see how that would make
sense..
Somehow I guess I always knew this.. but it seems many people say they
will just turn on copies=2 and be "safe".. but it's not the case..
Which way is the intended one?
Has anyone else had experience with this realization?
Thanks,
-Zach
--
Zach Giles
zgiles at gmail.com