[gpfsug-discuss] GPFS heartbeat network specifications and resilience

Ashish Thandavan ashish.thandavan at cs.ox.ac.uk
Thu Jul 21 11:26:02 BST 2016


Dear all,

Could anyone point me at the specifications required for the GPFS 
heartbeat network? Are there any figures for latency, jitter, etc. 
that one should be aware of?

I also have a related question about resilience. Our three GPFS NSD 
servers utilize a single network port on each server and communicate 
heartbeat traffic over a private VLAN. We are looking at improving the 
resilience of this setup by adding an additional network link on each 
server (going to a different member of a pair of stacked switches than 
the existing one) and running the heartbeat network over bonded 
interfaces on the three servers. Are there any recommendations as to 
which network bonding type to use?

Based on the name alone, Mode 1 (active-backup) appears to be the ideal 
choice, and I believe the switches do not need any special 
configuration. However, it has been suggested that Mode 4 (802.3ad, 
i.e. LACP) bonding might be the way to go; this aggregates the two 
ports and does require the relevant switch ports to be configured to 
support it. Is there a recommended bonding mode?

If anyone here currently uses bonded interfaces for their GPFS heartbeat 
traffic, may I ask what type of bond you have configured? Have you had 
any problems with the setup? And more importantly, has it been of use in 
keeping the cluster up and running when one network link goes down?
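
As a point of reference for the link-failure scenario, here is a minimal 
sketch of how the bonding mode and per-link state could be checked on a 
server. It assumes the Linux bonding driver, which exposes a status file 
at /proc/net/bonding/<bond>. The bond name (bond0), the interface names 
(eth0, eth1), the helper name parse_bond_status and the SAMPLE text are 
all hypothetical and for illustration only; they are not output from our 
cluster.

```python
import os

# Illustrative sample of what /proc/net/bonding/bond0 might contain for an
# active-backup bond with one failed link (assumed, not real cluster output).
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""


def parse_bond_status(text):
    """Extract the bonding mode, active slave and per-slave MII status."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # slave section currently being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only record MII status lines that belong to a slave section;
            # the bond-level MII status appears before any slave is named.
            status["slaves"][current] = value
    return status


if __name__ == "__main__":
    path = "/proc/net/bonding/bond0"  # hypothetical bond name
    if os.path.exists(path):
        with open(path) as fh:
            text = fh.read()
    else:
        text = SAMPLE  # fall back to the illustrative sample
    print(parse_bond_status(text))
```

A check along these lines could be run on each NSD server after pulling 
one cable, to confirm that the bond has failed over to the surviving 
link before relying on it for heartbeat traffic.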

Thank you,

Regards,
Ash



-- 
-------------------------
Ashish Thandavan

UNIX Support Computing Officer
Department of Computer Science
University of Oxford
Wolfson Building
Parks Road
Oxford OX1 3QD

Phone: 01865 610733
Email: ashish.thandavan at cs.ox.ac.uk



