[gpfsug-discuss] GPFS Remote Cluster Co-existence with CTDB/NFS Re-exporting

Howard, Stewart Jameson sjhoward at iu.edu
Mon Dec 7 17:23:34 GMT 2015


Hi All,


Thanks to Doug and Kevin for the replies.  In answer to Kevin's question about our choice of clustering solution for NFS:  the choice was made hoping to maintain some simplicity by not using more than one HA solution at a time.  However, it seems that this choice might have introduced more wrinkles than it's ironed out.


An update on our situation:  we have uncovered another clue since my last posting.  One thing that is now known to be correlated *very* closely with instability in the NFS layer is running `mmcrsnapshot`.  We had noticed that flapping happened like clockwork at midnight every night.  This happens to be the same time at which our crontab was running `mmcrsnapshot` so, as an experiment, we moved the snapshot to 1 a.m.


After this change, the late-night flapping moved to 1 a.m. and now happens reliably every night at that time.  I saw a post on this list from 2013 stating that `mmcrsnapshot` was known to hang the filesystem through race conditions that result in deadlocks, and I am wondering whether that is still a problem with the `mmcrsnapshot` command.  Running the snapshots had not been an obvious problem before, but seems to have become one since we deployed ~300 additional GPFS clients in a remote cluster configuration about a week ago.


Can anybody comment on the safety of running `mmcrsnapshot` with a ~300 node remote cluster accessing the filesystem?


Also, I would comment that this is not the only condition under which we see instability in the NFS layer.  We continue to see intermittent instability through the day.  The creation of a snapshot is simply the one well-correlated condition that we've discovered so far.
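
For anyone wanting to quantify this kind of correlation, here is a minimal sketch in Python. It assumes the flapping events and snapshot start times have already been pulled out of the logs as timestamps; the function name `flaps_near_snapshots`, the 10-minute window, and the sample timestamps are all hypothetical and purely illustrative, not taken from our logs.

```python
from datetime import datetime, timedelta


def flaps_near_snapshots(flap_times, snapshot_times, window=timedelta(minutes=10)):
    """Split flap events into those that began within `window` after
    any snapshot start, and all the others."""
    near, other = [], []
    for flap in flap_times:
        # A flap counts as "near" if it started at or after a snapshot
        # and no later than `window` after that snapshot.
        if any(timedelta(0) <= flap - snap <= window for snap in snapshot_times):
            near.append(flap)
        else:
            other.append(flap)
    return near, other


# Hypothetical timestamps for illustration only.
snapshots = [datetime(2015, 12, 6, 1, 0), datetime(2015, 12, 7, 1, 0)]
flaps = [
    datetime(2015, 12, 6, 1, 2),
    datetime(2015, 12, 6, 14, 30),
    datetime(2015, 12, 7, 1, 1),
]

near, other = flaps_near_snapshots(flaps, snapshots)
print(f"{len(near)} of {len(flaps)} flaps began within 10 minutes after a snapshot")
```

Flaps that land in `other` would be the intermittent daytime instability that the snapshot schedule does not explain.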


Thanks so much to everyone for your help  :)


Stewart