[gpfsug-discuss] CES log files

Simon Thompson (Research Computing - IT Services) S.J.Thompson at bham.ac.uk
Wed Jan 11 14:29:39 GMT 2017


What did the SMB log on the nodes say? It should be in /var/adm/ras. For example, if SMB failed, then I could see CES marking the node as degraded.
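
For anyone who wants to script that check, here is a minimal Python sketch that scans a log directory for lines containing given keywords. Only the directory /var/adm/ras comes from this thread. The function name grep_logs, the "log.*" file-name pattern and the keywords are assumptions for illustration, so adjust them to whatever files actually exist on your nodes.

```python
from pathlib import Path


def grep_logs(log_dir, keywords, pattern="log.*"):
    """Return (file name, line number, line) for every line that contains
    any of the keywords, compared case-insensitively.

    The default pattern "log.*" is an assumption, not a documented naming rule.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log holds odd bytes
        with path.open(errors="replace") as fh:
            for number, line in enumerate(fh, start=1):
                lowered = line.lower()
                if any(k in lowered for k in wanted):
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

On a CES node, something like grep_logs("/var/adm/ras", ["failed", "error"]) would then give a quick list of suspicious lines to read through by hand.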

Simon

From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of "Sobey, Richard A" <r.sobey at imperial.ac.uk>
Reply-To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Date: Wednesday, 11 January 2017 at 13:59
To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] CES log files

Thanks. Some of the nodes would just say “failed” or “degraded” with the DCs offline. Of those that thought they were happy to host a CES IP address, none responded, and the winbindd process would take up 100% CPU (as seen through top) with no users on the node.

Interesting that even though all CES nodes had the same configuration, three of them never had a problem at all.

JF – I’ll look at the protocol tracing next time this happens. It’s a rare thing that three DCs go offline at once but even so there should have been enough resiliency to cope.

Thanks
Richard

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Andrew Beattie
Sent: 11 January 2017 09:55
To: gpfsug-discuss at spectrumscale.org
Cc: gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] CES log files

mmhealth might be a good place to start

CES should probably throw a message along the lines of the following:

mmhealth shows something is wrong with the AD server:
...
CES                      DEGRADED                 ads_down
...
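
If you capture that output as text, a short Python sketch like the one below can pick out the problem rows. It assumes the whitespace-separated component / status / event layout shown in the snippet above, and it only looks for the two states mentioned in this thread (DEGRADED and FAILED). The function name problem_rows is made up for illustration.

```python
# The two states mentioned in this thread; other states are not handled here.
PROBLEM_STATES = {"DEGRADED", "FAILED"}


def problem_rows(report):
    """Return (component, status, event) for rows whose second column
    is one of PROBLEM_STATES. Lines with fewer than two fields are skipped."""
    rows = []
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].upper() in PROBLEM_STATES:
            event = " ".join(fields[2:])  # empty string if no event column
            rows.append((fields[0], fields[1].upper(), event))
    return rows


sample = """...
CES                      DEGRADED                 ads_down
...
"""

for component, status, event in problem_rows(sample):
    print(f"{component}: {status} ({event})")
```

Matching on the status column rather than the event name means the check does not depend on knowing every possible event in advance.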
Andrew Beattie
Software Defined Storage  - IT Specialist
Phone: 614-2133-7927
E-mail: abeattie at au1.ibm.com


----- Original message -----
From: "Sobey, Richard A" <r.sobey at imperial.ac.uk>
Sent by: gpfsug-discuss-bounces at spectrumscale.org
To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Cc:
Subject: [gpfsug-discuss] CES log files
Date: Wed, Jan 11, 2017 7:27 PM


Which files do I need to look in to determine what’s happening with CES? Suppose, for example, that a load of domain controllers were shut down, CES had no clue how to handle this, and it stopped working until the DCs were switched back on again.

mmfs.log.latest said everything was fine, btw.

Thanks

Richard
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

