[gpfsug-discuss] Wrong nodename after server restart
Michal Zacek
zacekm at img.cas.cz
Tue Sep 12 10:40:35 BST 2017
Hi,
I had to restart two of my GPFS servers (gpfs-n4 and gpfs-quorum), and
afterwards I was unable to move the CES IP address back. The command
failed with a strange error: "mmces address move: GPFS is down on this
node". After double-checking that the GPFS state is active on all nodes,
I dug deeper and I think I found the problem, but I don't really know
how it could have happened.
Look at the node names:
[root@gpfs-n2 ~]# mmlscluster # Looks good
GPFS cluster information
========================
  GPFS cluster name:         gpfscl1.img.local
  GPFS cluster id:           17792677515884116443
  GPFS UID domain:           img.local
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR
 Node  Daemon node name       IP address       Admin node name        Designation
----------------------------------------------------------------------------------
   1   gpfs-n4.img.local      192.168.20.64    gpfs-n4.img.local      quorum-manager
   2   gpfs-quorum.img.local  192.168.20.60    gpfs-quorum.img.local  quorum
   3   gpfs-n3.img.local      192.168.20.63    gpfs-n3.img.local      quorum-manager
   4   tau.img.local          192.168.1.248    tau.img.local
   5   gpfs-n1.img.local      192.168.20.61    gpfs-n1.img.local      quorum-manager
   6   gpfs-n2.img.local      192.168.20.62    gpfs-n2.img.local      quorum-manager
   8   whale.img.cas.cz       147.231.150.108  whale.img.cas.cz
[root@gpfs-n2 ~]# mmlsmount gpfs01 -L # not so good
File system gpfs01 is mounted on 7 nodes:
  192.168.20.63    gpfs-n3
  192.168.20.61    gpfs-n1
  192.168.20.62    gpfs-n2
  192.168.1.248    tau
  192.168.20.64    gpfs-n4.img.local
  192.168.20.60    gpfs-quorum.img.local
  147.231.150.108  whale.img.cas.cz
[root@gpfs-n2 ~]# tsctl shownodes up | tr ',' '\n' # very wrong
whale.img.cas.cz.img.local
tau.img.local
gpfs-quorum.img.local.img.local
gpfs-n1.img.local
gpfs-n2.img.local
gpfs-n3.img.local
gpfs-n4.img.local.img.local
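To make the mismatch easier to see, here is a rough sketch that flags every name in the "tsctl shownodes up" list that is not a daemon node name known to mmlscluster. The helper name unknown_nodes is made up for illustration (it is not a GPFS command), and the sample data is pasted from the output above rather than read from the live commands:

```shell
#!/bin/sh
# Sketch only: unknown_nodes is a hypothetical helper, not part of GPFS.
# $1 = comma-separated node list, as printed by "tsctl shownodes up"
# $2 = daemon node names from mmlscluster, separated by single spaces
unknown_nodes() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r node; do
        [ -n "$node" ] || continue
        # Exact match against the space-delimited list of known names.
        case " $2 " in
            *" $node "*) ;;                  # known daemon node name
            *) printf '%s\n' "$node" ;;      # not in mmlscluster
        esac
    done
}

# Sample data copied from the output above.
up_nodes='whale.img.cas.cz.img.local,tau.img.local,gpfs-quorum.img.local.img.local,gpfs-n1.img.local,gpfs-n2.img.local,gpfs-n3.img.local,gpfs-n4.img.local.img.local'
cluster_nodes='gpfs-n4.img.local gpfs-quorum.img.local gpfs-n3.img.local tau.img.local gpfs-n1.img.local gpfs-n2.img.local whale.img.cas.cz'

unknown_nodes "$up_nodes" "$cluster_nodes"
```

With the sample data this prints the three names that carry the extra suffix. On a live cluster the first argument could presumably be "$(tsctl shownodes up)"; the second list would have to be collected from the mmlscluster output.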
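The "tsctl shownodes up" output is the reason why I'm not able to move
the CES address back to the gpfs-n4 node, but the real problem is the
inconsistent node names: gpfs-n4, gpfs-quorum and whale are listed with
an extra ".img.local" suffix. I think the OS is configured correctly: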
[root@gpfs-n4 /]# hostname
gpfs-n4
[root@gpfs-n4 /]# hostname -f
gpfs-n4.img.local
[root@gpfs-n4 /]# cat /etc/resolv.conf
nameserver 192.168.20.30
nameserver 147.231.150.2
search img.local
domain img.local
[root@gpfs-n4 /]# cat /etc/hosts | grep gpfs-n4
192.168.20.64 gpfs-n4.img.local gpfs-n4
[root@gpfs-n4 /]# host gpfs-n4
gpfs-n4.img.local has address 192.168.20.64
[root@gpfs-n4 /]# host 192.168.20.64
64.20.168.192.in-addr.arpa domain name pointer gpfs-n4.img.local.
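The comparison I am making by eye can be written down as a small sketch. The helper same_name is hypothetical, and the values are copied from the output above instead of being queried live; it only compares names, ignoring case and the trailing dot that the reverse lookup prints:

```shell
#!/bin/sh
# Sketch only: same_name is a hypothetical helper.
# Succeeds when two host names are equal, ignoring case and a single
# trailing dot (as printed for PTR records).
same_name() {
    a=$(printf '%s' "${1%.}" | tr 'A-Z' 'a-z')
    b=$(printf '%s' "${2%.}" | tr 'A-Z' 'a-z')
    [ "$a" = "$b" ]
}

# Values copied from the output above.
fqdn='gpfs-n4.img.local'               # from: hostname -f
ptr='gpfs-n4.img.local.'               # from: host 192.168.20.64
daemon='gpfs-n4.img.local.img.local'   # from: tsctl shownodes up

same_name "$fqdn" "$ptr" && echo "OS forward and reverse names agree"
same_name "$fqdn" "$daemon" || echo "GPFS reports a different name: $daemon"
```

So the name the OS resolves is consistent in both directions, and only the name reported by GPFS differs.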
Can someone help me with this?
Thanks,
Michal
P.S. GPFS version: 4.2.3-2 (CentOS 7)