[gpfsug-discuss] Best of spectrum scale

IBM Spectrum Scale scale at us.ibm.com
Tue Sep 8 18:37:59 BST 2020


I think it is incorrect to assume that a command that continues after 
detecting the working directory has been removed is going to cause damage 
to the file system.  Further, there is no a priori means to confirm if the 
lack of a working directory will cause the command to fail.  I will agree 
that there may be admins that would prefer the command fail fast and allow 
them to restart the command anew, but I suspect there are admins that 
prefer the command press ahead in hopes that it can complete successfully 
and not require another execution.  I'm sure we can conjure scenarios that 
support both points of view.  Perhaps what is desired is a message that 
more clearly describes what is being undertaken.  For example, "The 
current working directory, <directory_name>, no longer exists.  Execution 
continues."
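
As a rough illustration of the two behaviours discussed here, the following Python sketch detects a removed working directory and either presses ahead with the clearer message or exits. It is not how the Spectrum Scale commands are implemented; the function name check_working_directory, the fail_fast flag, and the use of the PWD environment variable to recover the directory name are all assumptions made for the example.

```python
import os
import sys


def check_working_directory(fail_fast=False, stream=sys.stderr):
    """Return True if the current working directory still exists.

    Hypothetical helper: if the directory has been removed, either report
    it and continue (return False) or report it and exit (fail_fast=True).
    """
    try:
        os.getcwd()
        return True
    except FileNotFoundError:
        # Once the directory is gone the kernel can no longer name it, so
        # fall back to PWD (usually exported by the shell) as a best guess.
        name = os.environ.get("PWD", "<unknown>")

    if fail_fast:
        print(f"The current working directory, {name}, no longer exists.  "
              "Exiting.", file=stream)
        raise SystemExit(1)

    print(f"The current working directory, {name}, no longer exists.  "
          "Execution continues.", file=stream)
    return False
```

Making the policy a flag would let an admin choose between "press ahead" and "fail fast" per invocation, rather than the command choosing for them.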

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at 
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
. 

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries. 

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:   Jonathan Buzzard <jonathan.buzzard at strath.ac.uk>
To:     gpfsug-discuss at spectrumscale.org
Date:   09/08/2020 12:10 PM
Subject:        [EXTERNAL] Re: [gpfsug-discuss] Best of spectrum scale
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



On 08/09/2020 14:04, IBM Spectrum Scale wrote:
> I think a better metaphor is that the bridge we just crossed has 
> collapsed and as long as we do not need to cross it again our journey 
> should reach its intended destination :-)  As I understand it, the intent of 
> this message is to alert the user (and our support teams) that the 
> directory from which a command was executed no longer exists.  Should 
> that be of consequence to the execution of the command then failure is 
> not unexpected, however, many commands do not make use of the current 
> directory so they likely will succeed.  If you consider the view point 
> of a command failing because the working directory was removed, but not 
> knowing that was the root cause, I think you can see why this message 
> was added into the administration infrastructure.  It allows this odd 
> failure scenario to be quickly recognized saving time for both the user 
> and IBM support, in tracking down the root cause.
> 

I think the issue being taken is that you get an error message of

     The command may fail in an unexpected way.  Processing continues ..

Now to my mind that is an instant WTF, and if your description is 
correct the command should IMHO have exited, saying something like

     Working directory vanished, exiting command

If there is any chance of the command failing then it should not be 
executed IMHO. I would rather issue it again from a directory that exists.
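
A minimal Python sketch of "issue it again from a directory that exists", assuming a wrapper script is acceptable: it starts the child process from a known-good directory whenever the current one has vanished. The name run_from_safe_directory and the choice of "/" as the fallback are assumptions for illustration, not anything provided by GPFS.

```python
import os
import subprocess


def run_from_safe_directory(argv, safe_dir="/"):
    """Run argv, starting it in safe_dir if the current directory is gone.

    Hypothetical wrapper; returns the subprocess.CompletedProcess.
    """
    try:
        os.getcwd()
        cwd = None  # Working directory still exists: inherit it unchanged.
    except FileNotFoundError:
        cwd = safe_dir  # Working directory was removed: start elsewhere.
    return subprocess.run(argv, cwd=cwd, check=False)
```

Note that this only helps commands that do not depend on relative paths; anything given a relative path argument would resolve it against the fallback directory instead.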

The way I look at it is that file systems have "state", that is if 
something goes wrong then you could be looking at extended downtime as 
you break the backup out and start restoring. GPFS file systems have a 
tendency to be large, so even if you have a backup it is not a pleasant 
process and could easily take weeks to get things back to rights.

Consequently, most system admins would prefer that the command not 
continue if there is any possibility of it failing and messing up the 
"state" of their file system.

That's unlike, say, the configuration on a network switch, which can 
quickly be put back with minimal interruption.

JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss 





