[gpfsug-discuss] RAID config for SSD's - potential pitfalls

Buterbaugh, Kevin L Kevin.Buterbaugh at Vanderbilt.Edu
Wed Apr 19 22:12:35 BST 2017


Hi Marc,

But the limitation of GPFS replication is that while I can set replication separately for metadata and data, no matter whether I have one data pool or ten, they must all use the same data replication setting, correct?

And believe me, I *love* GPFS replication … I would hope / imagine that I am one of the few people on this mailing list who has actually gotten to experience a “fire scenario” … electrical fire, chemical suppressant did its thing, and everything in the data center had a nice layer of soot, ash, and chemical suppressant on and in it and therefore had to be professionally cleaned.  Insurance bought us enough disk space that we could (temporarily) turn on GPFS data replication and clean the storage arrays one at a time!

But in my current hypothetical scenario I’m stretching the budget just to get that one storage array with 12 x 1.8 TB SSDs in it.  Two are out of the question.

My current metadata that I’ve got on SSDs is on RAID 1 mirrors and has GPFS replication set to 2.  I thought the multiple RAID 1 mirrors approach was the way to go for SSDs for data as well, as opposed to one big RAID 6 LUN, but wanted to get the advice of those more knowledgeable than me.
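For concreteness, here is a back-of-the-envelope sketch of the usable-capacity trade-off for that 12 x 1.8 TB array, before any GPFS replication is layered on top. The 10+2 RAID 6 geometry is an assumption for illustration, not something stated in the thread; the actual LUN layout depends on the array.

```python
# Hypothetical capacity comparison for 12 x 1.8 TB SSDs:
# six RAID 1 mirror pairs vs. one 10+2 RAID 6 LUN (assumed geometry).

SSD_COUNT = 12
SSD_TB = 1.8

# RAID 1: every byte is stored twice, so half the drives hold data.
raid1_usable = round((SSD_COUNT // 2) * SSD_TB, 1)

# RAID 6 (10+2): two drives' worth of capacity goes to parity.
raid6_usable = round((SSD_COUNT - 2) * SSD_TB, 1)

print(raid1_usable)  # → 10.8 (TB usable from mirrors)
print(raid6_usable)  # → 18.0 (TB usable from one RAID 6 LUN)
```

Turning on GPFS data replication of 2 on top of either layout would halve the effective capacity again, which is part of why the choice is not obvious on a tight budget.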

Thanks!

Kevin

On Apr 19, 2017, at 3:49 PM, Marc A Kaplan <makaplan at us.ibm.com<mailto:makaplan at us.ibm.com>> wrote:

As I've mentioned before, RAID choices for GPFS are not so simple.  Here are a couple of points to consider; I'm sure there are more.  And if I'm wrong, someone will please correct me - but I believe the two biggest pitfalls are:

  *   Some RAID configurations (classically 5 and 6) work best with large, full-block writes.  When the file system does a partial block write, the RAID layer may have to read a full "stripe" from several devices, compute the differences, and then write the modified data back to several devices.  This is true of any RAID that is striped over several storage devices with error-correcting codes.  SO, you do NOT want to put GPFS metadata (system pool!) on RAID configured with large stripes and error correction. This is the Read-Modify-Write RAID pitfall.
  *   GPFS has built-in replication features - consider using those instead of RAID replication (classically RAID-1).  GPFS replication can work with storage devices that are in different racks, separated by significant physical space, and from different manufacturers.  This can be more robust than RAID in a single box or single rack.  Consider a fire scenario, or an exploding power supply, or a similar physical disaster.  Consider also that storage devices and controllers from the same manufacturer may share the same bugs, defects, and failure modes.

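The read-modify-write pitfall above can be quantified with a rough I/O-amplification sketch. This assumes the classic small-write path for parity RAID: read the old data strip and old parity strips, then write the new data strip and new parity strips; real controllers with write caches may do better.

```python
# Rough device-I/O cost of updating ONE data strip in parity RAID.

def rmw_ios(parity_devices):
    """I/Os for a partial-stripe (read-modify-write) update:
    read old data + old parities, then write new data + new parities."""
    return 2 * (1 + parity_devices)

def full_stripe_ios(data_devices, parity_devices):
    """I/Os to write one full stripe: no reads, just write everything."""
    return data_devices + parity_devices

print(rmw_ios(1))                    # → 4  (RAID 5: the classic 4-I/O small write)
print(rmw_ios(2))                    # → 6  (RAID 6: small writes cost 6 device I/Os)
print(full_stripe_ios(10, 2) / 10)   # → 1.2 (10+2 RAID 6: per-strip cost when writing full stripes)
```

Small, random writes - exactly the pattern GPFS metadata generates - pay the 6x cost on RAID 6, while large full-block writes amortize parity down to 1.2 I/Os per data strip, which is why the system pool is the wrong place for wide parity stripes.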

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org<http://spectrumscale.org>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu<mailto:Kevin.Buterbaugh at vanderbilt.edu> - (615)875-9633




