[gpfsug-discuss] Odd networking/name resolution issue

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Sat May 9 23:22:15 BST 2020


On 09/05/2020 12:06, Jaime Pinto wrote:
> DNS shouldn't be relied upon on a GPFS cluster for internal 
> communication/management or data.
> 

The 1980s have called and want their lack of IP resolution protocols 
back :-)

I would kindly disagree. If your DNS is not working then your cluster is 
fubar anyway and a zillion other things will also break very rapidly. 
For us, at least half of the running jobs would be dead in a few minutes 
as failure to contact license servers would cause the software to stop. 
All authentication and account lookup is going to fail as well.

You could distribute a hosts file, but outside of a storage-only 
cluster (as opposed to one with hundreds if not thousands of compute 
nodes) that is frankly madness and will inevitably come to bite you in 
the ass because the copies *will* get out of sync. The only hosts entry 
we have is for the Salt Stack host, because it tries to do things before 
the DNS resolvers have been set up and consequently breaks otherwise. 
Which IMHO is duff on its part.
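
As a rough illustration of the drift problem, here is a minimal Python 
sketch that compares the entries in a hosts file against what DNS 
returns and reports any disagreement. The function names (parse_hosts, 
find_drift) and the resolve callable are hypothetical, made up for this 
example; it assumes IPv4-style "address name [alias ...]" lines and that 
you supply a resolver which really queries DNS. Note that 
socket.gethostbyname normally goes through the system resolver, which on 
most nodes consults /etc/hosts first, so it would not detect drift on 
the node that holds the stale file.

```python
def parse_hosts(text):
    """Return {hostname: ip} from hosts-file text.

    Comments (anything after '#') and blank lines are ignored.
    Every name and alias on a line maps to that line's address.
    """
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            entries[name] = ip
    return entries


def find_drift(entries, resolve):
    """Return {hostname: (hosts_ip, dns_ip)} for names that disagree.

    `resolve` is a caller-supplied function (an assumption of this
    sketch) that takes a hostname, returns the address DNS gives for
    it, and raises OSError when the name does not resolve. A name
    missing from DNS is reported with dns_ip set to None.
    """
    drift = {}
    for name, hosts_ip in entries.items():
        try:
            dns_ip = resolve(name)
        except OSError:
            dns_ip = None
        if dns_ip != hosts_ip:
            drift[name] = (hosts_ip, dns_ip)
    return drift
```

Run against every node's copy of the file, a non-empty result from 
find_drift is exactly the out-of-sync situation described above, and 
with thousands of nodes you would have to run it everywhere, all the 
time.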

I would add I can't think of a time in the last 16 years where internal 
DNS at any University I have worked at has stopped working for even one 
millisecond. If DNS is that flaky at your institution then I suggest 
sacking the people responsible for its maintenance as being incompetent 
twits. It is just such a vanishingly remote possibility that it's not 
worth bothering about. Frankly an aircraft falling out of the sky and 
squishing your data centre seems more likely to me.

Finally, in a world of IPv6, anything other than DNS is utter madness 
IMHO.


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


