[gpfsug-discuss] RDMA question

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Wed Jan 17 15:37:48 GMT 2024


On 17/01/2024 15:21, Ward Poelmans wrote:
> On 17/01/2024 16:11, Ryan Novosielski wrote:
>> We have at various points run into nodes not using RDMA, just because of a
>> minor misconfiguration, and suddenly hundreds of megabytes a second of
>> storage traffic were going over a network designed for
>> administration.
> 
> You can use verbsRdmaFailBackTCPIfNotAvailable=no for that. If RDMA is 
> not working on a node configured for it, GPFS will refuse to start.
> 
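
For anyone following along, a minimal sketch of how that setting might be
applied and checked. It assumes the parameter is accepted by mmchconfig
as written above; the node class name "rdmaNodes" is hypothetical, and
whether a daemon restart is needed for it to take effect is something to
confirm against the documentation for your release.

```shell
# Assumption: verbsRdmaFailBackTCPIfNotAvailable is set via mmchconfig
# like other verbs* options. "rdmaNodes" is a hypothetical node class.
mmchconfig verbsRdmaFailBackTCPIfNotAvailable=no -N rdmaNodes

# Show the configured value (output format varies by release).
mmlsconfig verbsRdmaFailBackTCPIfNotAvailable
```

With this set, a node in that class whose RDMA is broken should refuse to
start GPFS rather than quietly carrying storage traffic over TCP.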

Interesting. Note that we run GPFS exclusively over Ethernet, and the idea 
was still to run it over Ethernet but with RDMA.

We took the decision a long time ago now to make use of the fact that we 
have fancy pants Ethernet switches and put the admin traffic over the 
same physical Ethernet link but on a separate VLAN which we then 
prioritise with QoS.

Consequently, if something were to go wrong with the RDMA and it fell 
back to TCP it would still be going over the same physical link :-)


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG



