[gpfsug-discuss] mmhealth with 4.2.3-5 gives many false alarms ib_rdma_nic_unrecognized
Bryan Banister
bbanister at jumptrading.com
Tue Jan 9 15:51:03 GMT 2018
I can't help but comment that it's amazing that GPFS is using a txt config file instead of requiring a command run that stores config data into a non-editable (but still editable) flat file database... Wow 2018!!
Hahahahaha!
-Bryan
From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Mathias Dietz
Sent: Tuesday, January 09, 2018 3:44 AM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] mmhealth with 4.2.3-5 gives many false alarms ib_rdma_nic_unrecognized
Hello Heiner,
With 4.2.3-5, mmhealth always monitors all ports of a configured IB adapter, even if a port is not specified in verbsPorts.
Development has implemented a fix which is planned to be part of 4.2.3-7 (February).
To get rid of the false alarm in the meantime, you could disable the InfiniBand monitoring altogether.
To disable InfiniBand monitoring on a node:
1. Open the file /var/mmfs/mmsysmon/mmsysmonitor.conf
2. Locate the [network] section
3. Add the following line below it: ib_rdma_enable_monitoring=False
4. Save the file and run "mmsysmoncontrol restart"
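For anyone who has to apply this on many nodes, steps 1 to 3 could be scripted roughly as below. This is only a sketch, not an IBM-provided tool: the function name is made up for illustration, and it assumes the file uses plain "[section]" headers and "key=value" lines. The path, the section name and the setting are the ones given in the steps above.

```python
CONF_PATH = "/var/mmfs/mmsysmon/mmsysmonitor.conf"  # path from step 1
SETTING = "ib_rdma_enable_monitoring=False"          # line from step 3
KEY = SETTING.split("=")[0]


def disable_ib_monitoring(text):
    """Return the config text with SETTING placed directly below [network].

    Hypothetical helper: assumes "[section]" headers and "key=value" lines.
    """
    out = []
    in_network = False
    inserted = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section starts; remember whether it is [network].
            in_network = stripped.lower() == "[network]"
            out.append(line)
            if in_network and not inserted:
                out.append(SETTING)
                inserted = True
            continue
        # Drop an existing uncommented entry for the same key inside
        # [network] so the setting is not duplicated on repeated runs.
        if in_network and stripped.split("=")[0].strip() == KEY:
            continue
        out.append(line)
    if not inserted:
        raise ValueError("no [network] section found")
    return "\n".join(out) + "\n"
```

To use it, read CONF_PATH, pass the contents through disable_ib_monitoring(), write the result back (keeping a backup copy of the original is advisable), and then run "mmsysmoncontrol restart" as in step 4.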
If you have questions feel free to contact me directly by email.
Kind regards
Mathias Dietz
Spectrum Scale RAS Architect & Release Lead Architect (4.2.3/5.0)
---------------------------------------------------------------------------
IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Phone: +49 70342744105
Mobile: +49-15152801035
E-Mail: mdietz at de.ibm.com
-----------------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Koederitz, Management: Dirk Wittkopp
Registered office: Böblingen / Court of registration: Amtsgericht Stuttgart, HRB 243294
From: "Billich Heinrich Rainer (PSI)" <heiner.billich at psi.ch>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 01/09/2018 09:31 AM
Subject: [gpfsug-discuss] mmhealth with 4.2.3-5 gives many false alarms ib_rdma_nic_unrecognized
Sent by: gpfsug-discuss-bounces at spectrumscale.org
________________________________
Hello,
I just upgraded to 4.2.3-5 and now see many 'ib_rdma_nic_unrecognized' failures in mmhealth, for example:
Component    Status      Status Change          Reasons
------------------------------------------------------------------------------------------
NETWORK      DEGRADED    2018-01-06 15:57:21    ib_rdma_nic_unrecognized(mlx4_0/1)
mlx4_0/1     FAILED      2018-01-06 15:57:21    ib_rdma_nic_unrecognized
I didn't see these messages with 4.2.3-4. The relevant lines in /usr/lpp/mmfs/lib/mmsysmon/NetworkService.py changed between -4 and -5.
What seems to happen: I have Mellanox VPI cards with one port configured for InfiniBand and one port for Ethernet, and mmhealth complains about the Ethernet port. I specified only the active InfiniBand ports in verbsPorts, so I don't see why mmhealth cares about any other ports when it checks RDMA.
So this is probably a bug; I'll open a PMR unless somebody can point me to a different solution. I tried to hide this event in mmhealth but could not find a way.
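To make the expected behaviour concrete, here is a small sketch of the filtering I would expect: only ports named in verbsPorts get checked. This is not the code in NetworkService.py. The function names and sample values are hypothetical, and it assumes verbsPorts entries are space-separated and look like "device/port", optionally followed by "/fabric". Entries without a port number are simply ignored here.

```python
def parse_verbs_ports(value):
    """Return the set of (device, port) pairs named in a verbsPorts string."""
    selected = set()
    for entry in value.split():
        fields = entry.split("/")
        # Assumed entry format: device/port, optionally followed by /fabric.
        if len(fields) >= 2:
            selected.add((fields[0], int(fields[1])))
    return selected


def ports_to_monitor(detected, verbs_ports):
    """Keep only the detected (device, port) pairs that verbsPorts lists."""
    selected = parse_verbs_ports(verbs_ports)
    return [port for port in detected if port in selected]


# Hypothetical VPI card: port 1 is Ethernet, port 2 is InfiniBand.
detected = [("mlx4_0", 1), ("mlx4_0", 2)]
print(ports_to_monitor(detected, "mlx4_0/2"))  # prints [('mlx4_0', 2)]
```

With a filter like this, the Ethernet port would never reach the RDMA check and could not raise ib_rdma_nic_unrecognized.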
Cheers,
Heiner
--
Paul Scherrer Institut
Science IT
Heiner Billich
WHGA 106
CH 5232 Villigen PSI
056 310 36 02
https://www.psi.ch