Hi,

that message is still in memory. "mmhealth node eventlog --clear" deletes all old events, but events which are currently active are not affected.

I think this is related to multiple collector nodes; I will dig deeper into that code to find out whether some issue lurks there.
As a stop-gap measure one could execute "mmsysmoncontrol restart" on the affected node(s); this restarts the monitoring process, which clears the stale event from memory.
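Untested, but the stop-gap would look roughly like this (just a sketch; the two node names are taken from your output below):

    # run on each affected node, e.g. arnsd3-vtc and arproto2-isb
    mmsysmoncontrol restart      # restarts the health monitor, dropping the stale in-memory event

    # then verify the event is gone
    mmhealth node show --unhealthy
    mmhealth cluster show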
The data used for the event comes from mmlspool (which should be close to or identical to mmdf).

Mit freundlichen Grüßen / Kind regards

Norbert Schuld

From:    valdis.kletnieks@vt.edu
To:      gpfsug-discuss@spectrumscale.org
Date:    20/07/2018 00:15
Subject: [gpfsug-discuss] mmhealth - where is the info hiding?
Sent by: gpfsug-discuss-bounces@spectrumscale.org

----------------------------------------------------------------------

So I'm trying to tidy up things like 'mmhealth' etc. Got most of it fixed, but stuck on one thing..

Note: I already did a 'mmhealth node eventlog --clear -N all' yesterday, which cleaned out a bunch of other long-past events that were "stuck" as failed / degraded even though they were corrected days/weeks ago - keep this in mind as you read on....

# mmhealth cluster show

Component      Total  Failed  Degraded  Healthy  Other
-------------------------------------------------------
NODE              10       0         0       10      0
GPFS              10       0         0       10      0
NETWORK           10       0         0       10      0
FILESYSTEM         1       0         1        0      0
DISK             102       0         0      102      0
CES                4       0         0        4      0
GUI                1       0         0        1      0
PERFMON           10       0         0       10      0
THRESHOLD         10       0         0       10      0

Great. One hit for 'degraded' filesystem.

# mmhealth node show --unhealthy -N all
(skipping all the nodes that show healthy)

Node name:      arnsd3-vtc.nis.internal
Node status:    HEALTHY
Status Change:  21 hours ago

Component    Status    Status Change  Reasons
---------------------------------------------------------------------
FILESYSTEM   FAILED    24 days ago    pool-data_high_error(archive/system)
(...)
Node name:      arproto2-isb.nis.internal
Node status:    HEALTHY
Status Change:  21 hours ago

Component    Status    Status Change  Reasons
---------------------------------------------------------------------
FILESYSTEM   DEGRADED  6 days ago     pool-data_high_warn(archive/system)

mmdf tells me:
nsd_isb_01  13103005696  1  No   Yes  1747905536 ( 13%)  111667200 ( 1%)
nsd_isb_02  13103005696  1  No   Yes  1748245504 ( 13%)  111724384 ( 1%)
(94 more LUNs all within 0.2% of these for usage - data is striped out pretty well)

There's also 6 SSD LUNs for metadata:
nsd_isb_flash_01  2956984320  1  Yes  No  2116091904 ( 72%)  26996992 ( 1%)
(again, evenly striped)

So who is remembering that status, and how to clear it?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss