[gpfsug-discuss] SS 4.2.1.0 upgrade pain

Greg.Lehmann at csiro.au Greg.Lehmann at csiro.au
Wed Aug 3 06:06:32 BST 2016


On Debian I am seeing this when trying to upgrade:

mmshutdown
dpkg -i gpfs.base_4.2.1-0_amd64.deb gpfs.docs_4.2.1-0_all.deb gpfs.ext_4.2.1-0_amd64.deb gpfs.gpl_4.2.1-0_all.deb gpfs.gskit_8.0.50-57_amd64.deb gpfs.msg.en-us_4.2.1-0_all.deb
(Reading database ... 65194 files and directories currently installed.)
Preparing to replace gpfs.base 4.1.0-6 (using gpfs.base_4.2.1-0_amd64.deb) ...
Unpacking replacement gpfs.base ...
Preparing to replace gpfs.docs 4.1.0-6 (using gpfs.docs_4.2.1-0_all.deb) ...
Unpacking replacement gpfs.docs ...
Preparing to replace gpfs.ext 4.1.0-6 (using gpfs.ext_4.2.1-0_amd64.deb) ...
Unpacking replacement gpfs.ext ...

Etc.

Unpacking replacement gpfs.gpl ...
Preparing to replace gpfs.gskit 8.0.50-32 (using gpfs.gskit_8.0.50-57_amd64.deb) ...
Unpacking replacement gpfs.gskit ...
Preparing to replace gpfs.msg.en-us 4.1.0-6 (using gpfs.msg.en-us_4.2.1-0_all.deb) ...
Unpacking replacement gpfs.msg.en-us ...
Setting up gpfs.base (4.2.1-0) ...

At which point it hangs. A ps shows this:
ps -ef | grep mm
root     21269     1  0 14:18 pts/0    00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15
root     21276 21150  1 14:18 pts/0    00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmsysmoncontrol start
root     21363     1  0 14:18 ?        00:00:00 /usr/lpp/mmfs/bin/mmsdrserv 1191 10 10 /var/adm/ras/mmsdrserv.log 128 yes
root     22485 21276  0 14:18 pts/0    00:00:00 python /usr/lpp/mmfs/bin/mmsysmon.py
root     22486 22485  0 14:18 pts/0    00:00:00 /bin/sh -c /usr/lpp/mmfs/bin/mmlsmgr -c
root     22488 22486  1 14:18 pts/0    00:00:03 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmlsmgr -c
root     24420 22488  0 14:18 pts/0    00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon linkCommand hadoop1-12-cdc-ib2.it.csiro.au /var/mmfs/tmp/nodefile.mmlsmgr.22488 mmlsmgr -c
root     24439 24420  0 14:18 pts/0    00:00:00 /usr/bin/perl /usr/lpp/mmfs/bin/mmdsh -svL gpfs-07-cdc-ib2.san.csiro.au /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c
root     24446 24439  0 14:18 pts/0    00:00:00 /usr/bin/ssh gpfs-07-cdc-ib2.san.csiro.au -n -l root /bin/ksh -c ' LANG=en_US.UTF-8 LC_ALL= LC_COLLATE= LC_TYPE= LC_MONETARY= LC_NUMERIC= LC_TIME= LC_MESSAGES= MMMODE=lc environmentType=lc2 GPFS_rshPath=/usr/bin/ssh GPFS_rcpPath=/usr/bin/scp mmScriptTrace= GPFSCMDPORTRANGE=0 GPFS_CIM_MSG_FORMAT=  /usr/lpp/mmfs/bin/mmremote mmrpc:1:1:1510:mmrc_mmlsmgr_hadoop1-12-cdc-ib2.it.csiro.au_24420_1470197923_: runCmd _NO_FILE_COPY_ _NO_MOUNT_CHECK_ NULL _LINK_ mmlsmgr -c '
root     24546 21269  0 14:23 pts/0    00:00:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmccrmonitor 15
root     24548 24455  0 14:23 pts/1    00:00:00 grep mm

It appears to be trying to connect via ssh to one of my NSD servers, which this node does not have permission to log in to. I am guessing that is where the hang is. Has anybody else seen this? I have a workaround, which is to remove the node from the cluster before the update, but that is extra work I could do without. I have not had to do this for previous versions, starting with 4.1.0.0.
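
One way to test the guess about ssh, sketched here under assumptions: run the same kind of non-interactive root login that appears in the ps output, but with OpenSSH's BatchMode and ConnectTimeout options so that it fails quickly instead of waiting on a prompt. The function name check_ssh and the 10-second timeout are made up for illustration and are not part of GPFS.

```shell
# Hypothetical helper (not a GPFS tool): test whether non-interactive
# root ssh to a node works, failing fast instead of hanging.
check_ssh() {
    node="$1"
    # BatchMode=yes disables password/passphrase prompts;
    # ConnectTimeout bounds how long the connection attempt may take.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 -n -l root "$node" true 2>/dev/null; then
        echo "ok: $node"
    else
        echo "FAILED: $node"
        return 1
    fi
}
```

Calling check_ssh gpfs-07-cdc-ib2.san.csiro.au from the node being upgraded should print "FAILED: ..." if the login is refused or times out, which would be consistent with the install hanging on the mmlsmgr call.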

Greg
