[gpfsug-discuss] nosmap parameter for RHEL7 x86_64 on Haswell/Broadwell?

Bryan Banister bbanister at jumptrading.com
Tue Jul 5 15:58:35 BST 2016


Wanted to comment that we also hit this issue. I agree with Paul that it would be nice for the FAQ to at least have something like the vertical bars that mark changed or added lines in the GPFS Admin guides.  That would make it easy to see what has changed.

It would also be nice to be able to "Follow this page" from my IBM Knowledge Center account and get notified when the FAQ changes... or maybe the person who publishes the changes could announce the update on the GPFS - Announce Developer Works page.

https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000001606

Cheers,
-Bryan

From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Sanchez, Paul
Sent: Friday, June 03, 2016 2:38 PM
To: gpfsug main discussion list (gpfsug-discuss at spectrumscale.org) <gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] nosmap parameter for RHEL7 x86_64 on Haswell/Broadwell?

After some puzzling debugging on our new Broadwell servers, all of which slowly became brick-like after getting stuck starting GPFS, we discovered that this was already a known issue in the FAQ.  Adding "nosmap" to the kernel command line in grub prevents SMAP from seeing the kernel-userspace memory interactions of GPFS as a reason to slowly grind all cores to a standstill, apparently spinning on stuck locks(?).  (Big thanks go to Red Hat for turning us on to the answer when we opened a case.)

From https://www.ibm.com/support/knowledgecenter/STXKQY/gpfsclustersfaq.html, section 3.2:

Note:  In order for IBM Spectrum Scale on RHEL 7 to run on the Haswell processor
*        Disable the Supervisor Mode Access Prevention (smap) kernel parameter
*        Reboot the RHEL 7 node before using GPFS
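
To tell whether a node is in the state this note warns about, something like the following could be run before starting GPFS. It is only a sketch: the function names are made up for illustration, and it assumes that an active SMAP shows up as the "smap" flag in /proc/cpuinfo and that a kernel booted with "nosmap" shows that word in /proc/cmdline. It is not anything shipped with Spectrum Scale.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def smap_needs_disabling(cpuinfo_text, cmdline_text):
    """True when the CPU reports 'smap' and 'nosmap' is absent from the kernel command line."""
    has_smap = "smap" in cpu_flags(cpuinfo_text)
    has_nosmap = "nosmap" in cmdline_text.split()
    return has_smap and not has_nosmap


def check_host():
    """Hypothetical pre-start check: read the live /proc files and report."""
    cpuinfo = Path("/proc/cpuinfo").read_text()
    cmdline = Path("/proc/cmdline").read_text()
    if smap_needs_disabling(cpuinfo, cmdline):
        return "SMAP appears active: add nosmap to the kernel command line and reboot"
    return "no SMAP conflict detected"
```

Calling check_host() on the node prints nothing by itself; it returns a message that a wrapper script could log or use to decide whether to start GPFS.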


Some observations worth noting:

1.      We've been running for a year with Haswell processors and have hundreds of Haswell RHEL7 nodes which do not exhibit this problem.  So maybe this only really affects Broadwell CPUs?
2.      It would be very nice for Spectrum Scale to take a peek at /proc/cpuinfo and /proc/cmdline before starting up, and refuse to start (rather than break the host) when it finds an affected processor and a kernel booted without "nosmap".  An error message describing the fix would have made my day.
3.      I'm going to have to start using a script to diff the FAQ for these gotchas, unless anyone knows of a better way to subscribe just to updates to this doc.
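
For the third point, a minimal sketch of such a diff script might look like this. It assumes the FAQ URL quoted above returns the FAQ text in its raw HTML (it may not, if the page is rendered dynamically), and the function names and the idea of keeping a saved copy of the previous fetch are illustrative choices only.

```python
import difflib
import urllib.request

# URL taken from the message above; whether it serves static text is an assumption.
FAQ_URL = "https://www.ibm.com/support/knowledgecenter/STXKQY/gpfsclustersfaq.html"


def fetch(url=FAQ_URL):
    """Download the page and return it as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def changed_lines(old_text, new_text):
    """Return only the added ('+') and removed ('-') lines between two snapshots."""
    diff = list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    ))
    # The first two entries are the '---' / '+++' file headers; skip them.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

A cron job could save each fetch() result to a file, compare it with the previous one using changed_lines(), and mail the result whenever the list is not empty.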

Thanks,
Paul Sanchez

