[gpfsug-discuss] GPFS autoload - wait for IB ports to become active
Frederick Stock
stockf at us.ibm.com
Fri Mar 16 12:05:29 GMT 2018
I have my doubts that mmdiag can be used in this script. In general the
guidance is to avoid or be very careful with mm* commands in a callback
due to the potential for deadlock.
Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
stockf at us.ibm.com
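One way to act on the caution above is to keep mm* commands out of the
callback entirely and pass the port list to the script as arguments. What
follows is a minimal sketch under that assumption, not a tested callback.
The function name ports_active, the IB_SYSFS_ROOT override (present only so
the check can be exercised against a scratch directory) and the example
port names are all hypothetical.

```shell
#!/bin/bash
# Sketch: check IB port state from sysfs only, without any mm* command.
# IB_SYSFS_ROOT is a hypothetical override, intended for testing.
IB_SYSFS_ROOT="${IB_SYSFS_ROOT:-/sys/class/infiniband}"

# ports_active DEVICE/PORT [DEVICE/PORT ...]
# Returns 0 only if every listed port reports ACTIVE,
# 1 if any port is not ACTIVE, 2 if no port was given.
ports_active() {
    local spec dev port
    (( $# > 0 )) || return 2
    for spec in "$@"; do
        # "mlx5_0/1" -> dev=mlx5_0, port=1; further "/" fields are ignored
        IFS=/ read -r dev port _ <<< "$spec"
        grep -q ACTIVE "$IB_SYSFS_ROOT/$dev/ports/$port/state" 2>/dev/null || return 1
    done
    return 0
}
```

A callback script built on this could end with
"ports_active mlx5_0/1 mlx5_1/1 || exit 1", where the port names are
placeholders for whatever verbsPorts is set to on the node. The trade-off
is that the list must be kept in step with the GPFS configuration by hand.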
From: Jan-Frode Myklebust <janfrode at tanso.net>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 03/16/2018 04:30 AM
Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Thanks Olaf, but we don't use NetworkManager on this cluster.
I now created this simple script:
-------------------------------------------------------------------------------------------------------------------------------------------------------------
#! /bin/bash -
#
# Fail mmstartup if not all configured IB ports are active.
#
# Install with:
#
# mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail --event preStartup --sync --onerror shutdown
#
for port in $(/usr/lpp/mmfs/bin/mmdiag --config | grep verbsPorts | cut -f 4- -d " ")
do
    grep -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state || exit 1
done
-------------------------------------------------------------------------------------------------------------------------------------------------------------
which I haven't tested, but assume should work. Suggestions for
improvements would be much appreciated!
-jf
On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <olaf.weiser at de.ibm.com>
wrote:
You can try:
systemctl enable NetworkManager-wait-online
ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service'
In many cases it helps.
From: Jan-Frode Myklebust <janfrode at tanso.net>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 03/15/2018 06:18 PM
Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
Sent by: gpfsug-discuss-bounces at spectrumscale.org
I found some discussion on this at
https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25
and there it's claimed that none of the callback events are early enough
to resolve this, and that we would need a pre-preStartup trigger. Any idea
whether this has changed, or is the only callback option then to do a
"--onerror shutdown" if the IB connection has failed?
On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <stockf at us.ibm.com> wrote:
You could also use the GPFS preStartup callback (mmaddcallback) to execute
a script synchronously that waits for the IB ports to become available
before returning and allowing GPFS to continue. Not systemd integrated
but it should work.
Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
stockf at us.ibm.com
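The waiting approach described above could be sketched roughly as below: a
function that polls sysfs until every listed port is ACTIVE or a timeout
expires, so that a script run synchronously from the callback returns only
then. This is an illustration under assumptions, not a tested callback.
The name wait_for_ports, the IB_SYSFS_ROOT override (for testing against a
scratch directory) and the port names in the usage note are hypothetical,
and the port list is passed in as arguments rather than read from GPFS.

```shell
#!/bin/bash
# Sketch: block until all given IB ports are ACTIVE, or give up on timeout.
# IB_SYSFS_ROOT is a hypothetical override, intended for testing.
IB_SYSFS_ROOT="${IB_SYSFS_ROOT:-/sys/class/infiniband}"

# wait_for_ports TIMEOUT_SECONDS INTERVAL_SECONDS DEVICE/PORT [DEVICE/PORT ...]
# Returns 0 once every port is ACTIVE, 1 if the timeout expires first.
wait_for_ports() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout ))
    local spec dev port pending
    while :; do
        pending=0
        for spec in "$@"; do
            # "mlx5_0/1" -> dev=mlx5_0, port=1
            IFS=/ read -r dev port _ <<< "$spec"
            grep -q ACTIVE "$IB_SYSFS_ROOT/$dev/ports/$port/state" 2>/dev/null || pending=1
        done
        (( pending == 0 )) && return 0
        (( SECONDS >= deadline )) && return 1
        sleep "$interval"
    done
}
```

A callback script could then end with
"wait_for_ports 300 5 mlx5_0/1 || exit 1". The 300-second limit is an
arbitrary placeholder; a finite limit is used so that a node with a dead
link eventually fails the callback instead of hanging in startup.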
From: david_johnson at brown.edu
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 03/08/2018 07:34 AM
Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to
become active
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Until IBM provides a solution, here is my workaround. Add it so it runs
before the gpfs script; I call it from our custom xCAT diskless boot
scripts. Based on RHEL 7, not fully systemd integrated. YMMV!
Regards,
— ddj
——-
[ddj at storage041 ~]$ cat /etc/init.d/ibready
#! /bin/bash
#
# chkconfig: 2345 06 94
# /etc/rc.d/init.d/ibready
# written in 2016 David D Johnson (ddj <at> brown.edu)
#
### BEGIN INIT INFO
# Provides: ibready
# Required-Start:
# Required-Stop:
# Default-Stop:
# Description: Block until infiniband is ready
# Short-Description: Block until infiniband is ready
### END INIT INFO
RETVAL=0
if [[ -d /sys/class/infiniband ]]
then
    IBDEVICE=$(dirname $(grep -il infiniband /sys/class/infiniband/*/ports/1/link* | head -n 1))
fi
# See how we were called.
case "$1" in
    start)
        if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
        then
            echo -n "Polling for InfiniBand link up: "
            for (( count = 60; count > 0; count-- ))
            do
                if grep -q ACTIVE $IBDEVICE/state
                then
                    echo ACTIVE
                    break
                fi
                echo -n "."
                sleep 5
            done
            if (( count <= 0 ))
            then
                echo DOWN - $0 timed out
            fi
        fi
        ;;
    stop|restart|reload|force-reload|condrestart|try-restart)
        ;;
    status)
        if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
        then
            echo "$IBDEVICE is $(< $IBDEVICE/state) $(< $IBDEVICE/rate)"
        else
            echo "No IBDEVICE found"
        fi
        ;;
    *)
        echo "Usage: ibready {start|stop|status|restart|reload|force-reload|condrestart|try-restart}"
        exit 2
esac
exit ${RETVAL}
————
-- ddj
Dave Johnson
On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) <marc.caubet at psi.ch> wrote:
Hi all,
with autoload = yes there is no guarantee that GPFS starts only after the
IB link is up. Is there a way to make GPFS wait to start until the IB
ports are up? This could probably be done by adding something like
After=network-online.target and Wants=network-online.target to the systemd
unit file, but I would like to know whether this is natively possible from
the GPFS configuration.
Thanks a lot,
Marc
_________________________________________
Paul Scherrer Institut
High Performance Computing
Marc Caubet Serrabou
WHGA/036
5232 Villigen PSI
Switzerland
Telephone: +41 56 310 46 67
E-Mail: marc.caubet at psi.ch
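The systemd idea raised in the question above could be tried with a drop-in
rather than by editing the shipped unit file. The sketch below writes such
a drop-in. It assumes the GPFS unit is named gpfs.service, and the file
name wait-online.conf and the function name write_gpfs_dropin are
hypothetical. Note that network-online.target only reflects what the
node's network management service considers "online", so it may not track
the state of the IB ports that GPFS uses.

```shell
#!/bin/bash
# Sketch: add After=/Wants= ordering to the GPFS unit via a systemd drop-in.
# Assumption: the unit is called gpfs.service; adjust the path if it is not.

# write_gpfs_dropin [DROPIN_DIR]
# Creates DROPIN_DIR if needed and writes wait-online.conf into it.
write_gpfs_dropin() {
    local dir="${1:-/etc/systemd/system/gpfs.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/wait-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

After running write_gpfs_dropin as root, "systemctl daemon-reload" is
needed for systemd to pick up the new file.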
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss