[gpfsug-discuss] GPFS and replication.. not a mirror?

Uwe Falke UWEFALKE at de.ibm.com
Fri Apr 29 10:22:10 BST 2016


Zach, 
GPFS replication does not automatically include a comparison of the 
replica copies. 
It protects against one part of the storage (i.e. one failure group, or 
two with 3-fold replication) being down. 
How should GPFS know which version is the good one if both replica copies 
are readable?

There are tools in 4.x to compare the replicas, but use them only from 
4.2 onward (there were problems with prior versions). Even then you need to 
decide which is the "good" copy (there is a consistency check on metadata 
replicas, but correct vs. incorrect data blocks cannot be auto-detected, for 
obvious reasons). End-to-end checksumming (as in GNR) would of course help here.
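
A minimal sketch of what that could look like on the command line (the 
file system name gpfs0 is only an example, and you should check the 
documentation of your release for the exact options):

  mmlsconfig minReleaseLevel       # make sure the cluster really runs 4.2 or later

  # compare the data and metadata replicas of the whole file system
  mmrestripefs gpfs0 -c

  # show the replication settings of an individual file
  mmlsattr -L /gpfs0/path/to/file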

 
Mit freundlichen Grüßen / Kind regards

 
Dr. Uwe Falke
 
IT Specialist
High Performance Computing Services / Integrated Technology Services / 
Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Management: 
Frank Hammer, Thorsten Moehring
Registered office: Ehningen / Court of registration: Amtsgericht Stuttgart, 
HRB 17122 




From:   Zachary Giles <zgiles at gmail.com>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   04/29/2016 06:22 AM
Subject:        [gpfsug-discuss] GPFS and replication.. not a mirror?
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Fellow GPFS Users,

I have a silly question about file replicas... I've been playing around 
with copies=2 (or 3) and hoping that this would protect against data 
corruption on poor-quality RAID controllers, but it seems that if I 
purposefully corrupt blocks on a LUN used by GPFS, the "replica" doesn't 
take over; rather, GPFS just returns the corrupt data. This happens whether 
I just "dd" into the disk, or break the RAID controller somehow by yanking 
a whole chassis so that the controller responds poorly for a few seconds.
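
For the record, the kind of test I mean is roughly the following (device 
and file names are made up, and it is obviously destructive, so scratch 
file systems only):

  md5sum /gpfs0/testfile                   # checksum while the data is still good

  # overwrite a region of one LUN behind an NSD, bypassing GPFS entirely
  # (assuming that region holds one replica of the test file)
  dd if=/dev/urandom of=/dev/mapper/lun7 bs=1M count=16 seek=4096 oflag=direct

  echo 3 > /proc/sys/vm/drop_caches        # make sure the next read really hits disk

  md5sum /gpfs0/testfile                   # GPFS can now happily return the corrupted copy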

Originally my thinking was that replicas were for mirroring and GPFS would 
somehow return whichever is the "good" copy of your data, but now I'm 
thinking it's just intended for better file placement, such as having a 
near replica and a far replica so you don't have to cross buildings for 
access, etc. That, and/or disk outages where the outage is not corruption 
but simply an outage, either by failure or for disk moves, SAN rewiring, 
etc. In those cases you wouldn't have to "move" all the data since you 
already have a second copy. I can see how that would make sense.
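
Something like the following is what I now assume the feature is really 
for: NSD stanzas that put the two copies into different failure groups 
(all names below are illustrative):

  # nsd stanza file: two disks in different failure groups, so the two
  # replicas of every block land on independent storage
  %nsd: nsd=nsd_siteA_01 device=/dev/dm-10 servers=nsdserver1 usage=dataAndMetadata failureGroup=1
  %nsd: nsd=nsd_siteB_01 device=/dev/dm-20 servers=nsdserver2 usage=dataAndMetadata failureGroup=2

  # two copies of data and metadata by default ...
  mmchfs gpfs0 -m 2 -r 2

  # ... and re-replicate existing files to match the new defaults
  mmrestripefs gpfs0 -R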

Somehow I guess I always knew this.. but it seems many people say they 
will just turn on copies=2 and assume they're "safe".. and that's not the case.

Which is the intended use?
Has anyone else come to the same realization?

Thanks,
-Zach


-- 
Zach Giles
zgiles at gmail.com






