[gpfsug-discuss] How to simulate an NSD failure?

John Hearns john.hearns at asml.com
Fri Oct 13 13:56:18 BST 2017


I have set up a small testbed consisting of three nodes. Two of the nodes each have a disk which is being used as an NSD.
This is preparation for some fun and games with some whizzy new servers. The testbed has spinning drives.
I have created two NSDs and have set the data replication to 1 (this is deliberate).
I am trying to fail an NSD and find which files have parts on the failed NSD.
A first test with 'mmdeldisk' didn't have much effect, as Spectrum Scale is smart enough to copy the data off the drive.

I now take the drive offline and delete it with:
echo offline > /sys/block/sda/device/state
echo 1 > /sys/block/sda/device/delete
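
For anyone repeating this on their own testbed, here is a minimal sketch of the same two steps wrapped in a guarded helper. The function name offline_disk and the DRY_RUN and SYSFS_ROOT variables are hypothetical, introduced only for illustration. The sketch assumes a SCSI-attached disk, for which Linux exposes both the state and delete attributes under the device/ subdirectory of the block device in sysfs. It defaults to a dry run, so nothing is written unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: offline_disk, DRY_RUN and SYSFS_ROOT are hypothetical names.
# Assumes a SCSI disk whose sysfs attributes live under /sys/block/<dev>/device/.

offline_disk() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys}"          # overridable so the helper can be tried on a fake tree
    devdir="$root/block/$dev/device"

    # Refuse to do anything if the device directory is not there.
    if [ -z "$dev" ] || [ ! -d "$devdir" ]; then
        echo "no such SCSI block device: $dev" >&2
        return 1
    fi

    # Default to a dry run: only print what would be written.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: echo offline > $devdir/state"
        echo "would run: echo 1 > $devdir/delete"
        return 0
    fi

    # Mark the device offline, then ask the kernel to remove it.
    echo offline > "$devdir/state" && echo 1 > "$devdir/delete"
}

# Example (as root, on a disposable test node only):
#   offline_disk sda              # dry run, prints the two writes
#   DRY_RUN=0 offline_disk sda    # actually takes the disk away
```

The dry-run default is a deliberate choice here: the real writes remove the disk from the kernel's view until the SCSI host is rescanned or the node is rebooted, so it seems safer to make the destructive path opt-in.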

Short of going to the data centre and physically pulling the drive, that's a pretty final way of stopping access to a drive.
I then wrote 100 files to the filesystem. The node with the NSD did log "rejecting I/O to offline device".
However, 'mmlsdisk <filesystem>' says that this disk has status 'ready'.

I am going to stop that NSD and run an mmdeldisk, at which point I do expect things to go south rapidly.
I just do not understand at what point a failed write would be detected. Or, once a write fails, are all the subsequent writes
routed off to the active NSD(s)?

Sorry if I am asking an idiot question.

Inspector.clouseau at surete.fr