[gpfsug-discuss] Node expulsion from GPFS Cluster
mattw at vpac.org
Wed Jan 16 21:12:28 GMT 2013
I might be stating the obvious, but do make sure you check the logs on the node that requested the expulsion.
Frequently the master nodes don't get the full details of why a node is to be expelled; they just get the expulsion request and log the action. The node that made the request often logs exactly why it made it, usually a reachability issue. From that node, copy and paste the hostname it asked to have expelled and do a host lookup to make sure it resolves to the right IP or interface.
Usually at that point it turns out that someone added the expelled node with the wrong interface.
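For what it's worth, the lookup check can be scripted so you can run it on whichever node made the request. This is only a rough sketch: the hostname and subnet in the usage comment are made-up placeholders, the helper names are mine, and it assumes the GPFS interfaces sit in a single known IPv4 subnet. It uses only the Python standard library, nothing GPFS-specific.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def addresses_outside_subnet(addresses, expected_subnet):
    """Return the addresses that do NOT fall inside expected_subnet (CIDR)."""
    network = ipaddress.ip_network(expected_subnet, strict=False)
    return [a for a in addresses if ipaddress.ip_address(a) not in network]


def check_host(hostname, expected_subnet):
    """Resolve hostname and report which addresses are off the expected subnet."""
    addresses = resolve_ipv4(hostname)
    return addresses, addresses_outside_subnet(addresses, expected_subnet)


# Example (hypothetical hostname and subnet, substitute your own):
#   addresses, wrong = check_host("client042", "10.10.0.0/16")
#   if wrong:
#       print("resolves outside the GPFS subnet:", wrong)
```

Run it on the requesting node rather than on a master, since what matters is what that node's own resolver (DNS or /etc/hosts) returns for the expelled host.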
On 17/01/2013, at 3:00 AM, Linda Dewar <linda at epcc.ed.ac.uk> wrote:
> Just wondering if anyone else has seen anything like this.
> I am using GPFS 220.127.116.11.
> I have a multicluster setup with NSD servers and TSM servers (running RHEL 6.2) in a Server cluster and clients (some running SUSE 11 SP1 and some running RHEL 6.2) in a Client cluster.
> All clients in both clusters are connected to an IBM BNT G8200 switch.
> Nodes in the Client cluster are regularly expelled from the cluster, either at the request of other nodes in the Client cluster or of nodes in the Server cluster. This means that GPFS is shut down on the expelled node and its filesystems are unmounted, which is obviously inconvenient for the users.
> Basic connectivity (i.e. response to ping) between nodes is unaffected at the times of the node expulsions. We are looking at using Wireshark to investigate what appears to be some sort of network connectivity problem.
> Does any of this sound familiar to anyone else, and does anyone have any suggestions as to what we should be looking for?
> Many thanks,