[gpfsug-discuss] subnets confusion
Brian Marshall
mimarsh2 at vt.edu
Tue Nov 8 13:40:43 GMT 2016
All,
I have a tricky (at least to me) subnets question.
I have 2 NSD Server clusters:
Serv1 -> daemon on 10.51 with high speed network on 10.82
Serv2 -> daemon on 10.42, which is a high speed network
and 2 client clusters:
Cli1 -> daemon on 10.81 with high speed network on 10.82
Cli2 -> daemon on 10.41 with high speed network on 10.42
Serv1 has the following subnets operand:
subnets 10.82.0.0/Serv1;Cli1 10.41.0.0/Cli2
Cli1 has the following subnets operand:
subnets 10.82.0.0/Serv1;Cli1
Cli2 has the following subnets operand:
subnets 10.51.0.0/Serv1 10.41.0.0/Cli2 10.42.0.0/Serv2
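
To make the three operands easier to compare side by side, here is a minimal Python sketch that parses them and reports which subnets each cluster names for a given peer. The helper names (parse_subnets, subnets_for_peer) are hypothetical and not part of any GPFS tooling; the sketch only restates the configuration quoted above and does not model how the daemon actually picks an address.

```python
# Subnets operands exactly as quoted above, keyed by the cluster they are set on.
SUBNETS = {
    "Serv1": "10.82.0.0/Serv1;Cli1 10.41.0.0/Cli2",
    "Cli1": "10.82.0.0/Serv1;Cli1",
    "Cli2": "10.51.0.0/Serv1 10.41.0.0/Cli2 10.42.0.0/Serv2",
}


def parse_subnets(operand):
    """Return an ordered list of (subnet, [cluster names]) pairs (hypothetical helper)."""
    entries = []
    for item in operand.split():
        subnet, _, clusters = item.partition("/")
        entries.append((subnet, clusters.split(";") if clusters else []))
    return entries


def subnets_for_peer(local, peer):
    """Subnets that `local` lists for `peer`, in the order they appear."""
    return [s for s, names in parse_subnets(SUBNETS[local]) if peer in names]


for local, peer in [("Serv1", "Cli2"), ("Cli2", "Serv1")]:
    print(local, "lists for", peer, ":", subnets_for_peer(local, peer))
```

Laid out this way, Serv1 names 10.41 for Cli2 while Cli2 names 10.51 for Serv1, and 10.42 appears only against Serv2.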
Problem:
Sometimes Serv1 will try to contact Cli2 nodes on their 10.42 addresses,
which the Serv1 nodes don't have access to. I get errors like:
Close connection to 10.42.10.1 hs001.cluster.ib (Connection timed out)
Cli2 nodes can connect/re-connect to Serv1 once the server cluster kicks
them out.
Serv1 has Cli2 listed on its 10.41 subnets operand, so I don't fully
understand why Serv1 does not use 10.41 to connect.
Possible Solution??
I think to fix this I need to either add Serv1 to the 10.41 subnet of Cli2
OR move the 10.42 operand on Cli2 to the front of the list.
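
For concreteness, the two candidate changes can be written out as the Cli2 operand strings they would produce. This is only a sketch under one assumption: that "add Serv1 to the 10.41 subnet" means appending Serv1 to the cluster list of the 10.41.0.0 entry. The helper names (add_cluster, move_to_front) are hypothetical, and neither result has been tested against GPFS.

```python
# Current Cli2 operand, as quoted earlier in this message.
CLI2_CURRENT = "10.51.0.0/Serv1 10.41.0.0/Cli2 10.42.0.0/Serv2"


def add_cluster(operand, subnet, cluster):
    """Option 1: append `cluster` to the cluster list of the `subnet` entry."""
    out = []
    for item in operand.split():
        net, _, names = item.partition("/")
        if net == subnet:
            names = ";".join(filter(None, [names, cluster]))
        out.append(f"{net}/{names}" if names else net)
    return " ".join(out)


def move_to_front(operand, subnet):
    """Option 2: move the entry for `subnet` to the start of the list."""
    items = operand.split()
    front = [i for i in items if i.partition("/")[0] == subnet]
    rest = [i for i in items if i.partition("/")[0] != subnet]
    return " ".join(front + rest)


print("option 1:", add_cluster(CLI2_CURRENT, "10.41.0.0", "Serv1"))
print("option 2:", move_to_front(CLI2_CURRENT, "10.42.0.0"))
```

Either string would then be the value given for the subnets operand on Cli2; whether either one changes which address Serv1 picks is exactly the open question.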
I am working from this link:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/GPFS+Network+Communication+Overview
Please let me know if you need more info. I have tried to strip this down
to the bare minimum and in doing so may have left out good details.
Thank you,
Brian Marshall