[gpfsug-discuss] Bad disk but not failed in DSS-G

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Mon Jun 24 10:41:50 BST 2024


On 20/06/2024 23:32, Achim Rehor wrote:

[SNIP]

> Fred is most probably correct here. the two errors are not necessarily 
> the same.
> 

Turns out Fred was incorrect: having pushed the bad disk out of the file 
system, the backups magically started working again. Not that that 
should come as the slightest surprise to anyone.

However, finding out I have a bad disk because the backups are failing is 
not good at all, because it means I can't rely on GPFS's health monitoring 
to accurately report the state of the file system :-(

It also raises the question: with hundreds of I/O errors on a disk, why 
was it not failed by GPFS? What criteria does GPFS use for deciding if a 
disk is bad? Clearly they are not accurate.


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG



