[gpfsug-discuss] Bad disk but not failed in DSS-G
Jonathan Buzzard
jonathan.buzzard at strath.ac.uk
Mon Jun 24 10:41:50 BST 2024
On 20/06/2024 23:32, Achim Rehor wrote:
[SNIP]
> Fred is most probably correct here. the two errors are not necessarily
> the same.
>
Turns out Fred was incorrect: having pushed the bad disk out of the file
system, the backups magically started working again. Not that this
should come as the slightest surprise to anyone.
However, finding out that I have a bad disk because the backups are
failing is not good at all, because it means I can't rely on GPFS's
health monitoring to accurately report the state of the file system :-(
It also raises the question: with hundreds of I/O errors on a disk, why
was it not failed by GPFS? What criteria does GPFS use for deciding if a
disk is bad? Clearly they are not accurate.
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG