[gpfsug-discuss] mmunlinkfileset problems

Jonathan Buzzard jonathan at buzzard.me.uk
Fri Mar 27 12:49:38 GMT 2015


On Fri, 2015-03-27 at 12:16 +0000, Luke Raimbach wrote:
> Hi All,
> 
> I'm having a problem where unlinking a fileset is taking 7 or 8 minutes
> to complete. We are running 3.5.0.22 and the cluster is small (9 nodes:
> 8 Linux plus 1 Windows node), one file system remote mounted by a
> 3-node Linux cluster, and generally not very busy.
> 
> I wouldn't expect mmunlinkfileset to require the file system to quiesce
> (and even if it did 8 minutes would seem like an awfully long time).
> Previous unlink commands (on version 3.5.0.19) have returned within
> milliseconds.

If there is a waiter around, mmunlinkfileset can block even with an
empty fileset and leave your file system unresponsive.
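
As a rough sketch of checking for waiters first, something like the
following could gate the unlink. It assumes that "mmdiag --waiters"
prints lines containing "waiting <seconds> seconds", which is how I
remember the output but is not guaranteed across releases. The helper
names, the one-second threshold, and the file system and fileset names
are all made up for illustration. Note that mmdiag only reports waiters
on the node it runs on, so a clean result here does not prove the whole
cluster is quiet.

```shell
#!/bin/sh
# Print the longest wait time (in seconds) found in waiter output on stdin.
# Assumes each waiter line contains "waiting <seconds> seconds".
# Prints 0.000 if no such line is seen.
longest_waiter() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "waiting" && $(i + 1) + 0 > max) max = $(i + 1) + 0
    } END { printf "%.3f\n", max + 0 }'
}

# Succeed only if no waiter on stdin is older than the threshold (seconds).
# The default threshold of 1 second is an arbitrary illustrative choice.
fs_is_quiet() {
    threshold=${1:-1}
    longest=$(longest_waiter)
    awk -v l="$longest" -v t="$threshold" 'BEGIN { exit !(l <= t) }'
}

# Example use (gpfs0 and myfileset are hypothetical names):
# if mmdiag --waiters | fs_is_quiet 1; then
#     mmunlinkfileset gpfs0 myfileset
# else
#     echo "waiters present, not unlinking" >&2
# fi
```

Even with a check like this there is a window between looking at the
waiters and the unlink actually running, so it reduces the risk rather
than removing it.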

> 
> This very long running command appeared to pause IO for a sufficiently
> long time that cNFS clients timed out too and various services
> elsewhere crashed.
> 

Yep, lesson for the day: only unlink filesets when the file system is
quiet, i.e. in a maintenance window.

> Is there any reason why unlinking a fileset would take this long? Also,
> I can't find anywhere a list of commands which might require the FS to
> quiesce - is this information available somewhere?
> 

Deleting snapshots, at least in the past. It might be that guidance from
IBM has changed, but once you have been stung by this kind of thing you
don't like to try it again any time soon.

JAB.


-- 
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.




