[gpfsug-discuss] mmhealth with 4.2.3-5 gives many false alarms ib_rdma_nic_unrecognized
Mathias Dietz
MDIETZ at de.ibm.com
Tue Jan 9 09:43:58 GMT 2018
Hello Heiner,
With 4.2.3-5, mmhealth always monitors all ports of a configured IB
adapter, even ports that are not specified in verbsPorts.
Development has implemented a fix which is planned to be part of 4.2.3-7
(February).
To get rid of the false alarm in the meantime you could disable the
Infiniband monitoring altogether.
To disable Infiniband monitoring on a node:
1. Open the file /var/mmfs/mmsysmon/mmsysmonitor.conf
2. Locate the [network] section
3. Add the following line below it: ib_rdma_enable_monitoring=False
4. Save the file and run "mmsysmoncontrol restart"
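If you need to apply this on many nodes, steps 2 and 3 can be scripted. The following is only a minimal sketch, not an IBM-provided tool: it assumes mmsysmonitor.conf is a plain INI-style text file, and the function name disable_ib_monitoring is made up for illustration. It edits the text line by line so that existing comments and settings are left untouched.

```python
import re

SECTION = "[network]"                 # section named in step 2
KEY = "ib_rdma_enable_monitoring"     # option named in step 3


def disable_ib_monitoring(conf_text: str) -> str:
    """Return conf_text with KEY=False set inside the [network] section.

    Assumption: the file is INI-style, with section headers on their
    own lines. Raises ValueError if no [network] section is present.
    """
    lines = conf_text.splitlines()

    # Find the [network] header line.
    header = None
    for i, line in enumerate(lines):
        if line.strip().lower() == SECTION:
            header = i
            break
    if header is None:
        raise ValueError("no [network] section found")

    # The section ends at the next header line or at end of file.
    end = len(lines)
    for j in range(header + 1, len(lines)):
        if lines[j].lstrip().startswith("["):
            end = j
            break

    # Overwrite an existing setting, otherwise add one below the header.
    for j in range(header + 1, end):
        if re.match(rf"\s*{KEY}\s*=", lines[j]):
            lines[j] = f"{KEY}=False"
            break
    else:
        lines.insert(header + 1, f"{KEY}=False")

    return "\n".join(lines) + "\n"
```

To use it, read /var/mmfs/mmsysmon/mmsysmonitor.conf, pass the contents through the function, write the result back (keeping a backup copy is sensible), and then run "mmsysmoncontrol restart" as in step 4.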
If you have questions feel free to contact me directly by email.
Mit freundlichen Grüßen / Kind regards
Mathias Dietz
Spectrum Scale RAS Architect & Release Lead Architect (4.2.3/5.0)
---------------------------------------------------------------------------
IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Phone: +49 70342744105
Mobile: +49-15152801035
E-Mail: mdietz at de.ibm.com
-----------------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Koederitz, Management: Dirk
Wittkopp
Registered office: Böblingen / Registration court: Amtsgericht
Stuttgart, HRB 243294
From: "Billich Heinrich Rainer (PSI)" <heiner.billich at psi.ch>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 01/09/2018 09:31 AM
Subject: [gpfsug-discuss] mmhealth with 4.2.3-5 gives many false
alarms ib_rdma_nic_unrecognized
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hello,
I just upgraded to 4.2.3-5 and now see many failures
"ib_rdma_nic_unrecognized" in mmhealth, like
Component     Status      Status Change          Reasons
------------------------------------------------------------------------------------------
NETWORK       DEGRADED    2018-01-06 15:57:21    ib_rdma_nic_unrecognized(mlx4_0/1)
  mlx4_0/1    FAILED      2018-01-06 15:57:21    ib_rdma_nic_unrecognized
I didn't see these messages with 4.2.3-4. The relevant lines in
/usr/lpp/mmfs/lib/mmsysmon/NetworkService.py changed between -4 and -5.
What seems to happen: I have Mellanox VPI cards with one port Infiniband
and one port Ethernet. mmhealth complains about the Ethernet port. Hmm ...
I specified only the active Infiniband ports in verbsPorts, so I don't see
why mmhealth cares about any other ports when it checks RDMA.
So this is probably a bug. I'll open a PMR unless somebody points me to a
different solution. I tried, but I can't hide this event in mmhealth.
Cheers,
Heiner
--
Paul Scherrer Institut
Science IT
Heiner Billich
WHGA 106
CH 5232 Villigen PSI
056 310 36 02
https://www.psi.ch
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss