[gpfsug-discuss] Niggles in the 4.2.0 Install

Jonathan Buzzard jonathan at buzzard.me.uk
Fri Mar 11 15:46:39 GMT 2016


On Fri, 2016-03-11 at 13:19 +0000, Sean Killen wrote:
> Hi all,
> 
> So I have finally got my SpectrumScale system installed (well half of
> it).  But it wasn't without some niggles.
> 
> We have purchased DELL MD3860i disk trays with dual controllers (each
> with 2x 10Gbit NICs); to Linux this appears as 4 paths. I spent quite
> a while getting a nice multipath setup in place with 'friendly' names
> set
> 

Oh dear. I guess it might work with 10Gb Ethernet, but based on my
personal experience iSCSI is spectacularly unsuited to GPFS. Either
your NSD servers can overwhelm the storage arrays or the storage arrays
can overwhelm the NSD servers, and performance falls through the floor.
That is unless you have Data Center Ethernet, at which point you might
as well have gone Fibre Channel in the first place. Unless you are
going to have a large physical separation between the storage and the
NSD servers, 12Gb SAS is a cheaper option, and you can still have four
NSD servers hooked up to each MD3-based storage array.

I have in the past implemented GPFS on Dell MD3200i's. I did
eventually get it working reliably, but it was so suboptimal, with so
many compromises, that as soon as the MD3600f came out we purchased
those to replace the MD3200i's.

Let's say you have three storage arrays with two paths to each
controller and four NSD servers. Basically what happens is that an NSD
server issues a bunch of requests for blocks to the storage arrays.

Then all 12 paths start answering over your two connections to the NSD
server. At that point the Ethernet adaptors on your NSD servers are
overwhelmed, 802.3x PAUSE frames start being issued, and that just
results in head-of-line blocking; performance falls through the floor.
You need Data Center Ethernet to handle this properly, which is
probably why FCoE never took off: you can't just use the Ethernet
switches and adaptors you already have. Both FC and SAS handle this
sort of congestion gracefully, unlike ordinary Ethernet.
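
To put rough numbers on that fan-in (back of an envelope only: I am
assuming 10Gbps links everywhere, every path replying flat out and no
protocol overhead), a quick sketch along these lines shows why the two
NICs on the NSD server are the choke point:

    # Back-of-the-envelope fan-in for the scenario above. Assumptions,
    # not measurements: every path runs at 10Gbit/s and all twelve
    # paths reply to the same NSD server at full rate.
    paths_replying = 3 * 2 * 2   # 3 arrays x 2 controllers x 2 paths each
    link_gbps = 10               # per-path line rate
    nsd_ports = 2                # connections into the NSD server

    offered = paths_replying * link_gbps   # traffic aimed at the NSD server
    capacity = nsd_ports * link_gbps       # what its NICs can absorb

    print("offered load : {} Gbit/s".format(offered))
    print("NSD capacity : {} Gbit/s".format(capacity))
    print("oversubscription ~ {:.0f}:1".format(float(offered) / capacity))
    # ~6:1, and the switch has nowhere to put the excess, so it issues
    # PAUSE frames, which stall whole ports (head-of-line blocking)
    # rather than just the offending flows.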

Now the caveat to all this is that it is much easier to overwhelm a
1Gbps link than a 10Gbps link. However, with the combination of SSDs
and larger caches I can envisage a 10Gbps link being overwhelmed, and
you would then see the same performance issues that I saw. Basically
the only way out is a one-to-one correspondence between the ports on
the NSD servers and those on the storage controllers.
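
And to illustrate that last point (the port counts here are made up,
nothing I have actually measured), the same arithmetic shows the ratio
is identical at 1Gbps and 10Gbps, the faster link simply takes longer
to hurt, and it only drops to 1:1 when the NSD ports match the
controller ports answering them:

    # Hypothetical helper, nothing more: fan-in ratio seen by one NSD
    # server for a given mix of paths, NSD ports and link speed.
    def fan_in(paths, nsd_ports, link_gbps):
        # Traffic offered to the NSD server versus what its ports take.
        return (paths * link_gbps) / float(nsd_ports * link_gbps)

    print(fan_in(12, 2, 1))     # MD3200i era, 1Gbps links: 6.0
    print(fan_in(12, 2, 10))    # 10Gbps links: still 6.0, just bites later
    print(fan_in(12, 12, 10))   # one NSD port per answering port: 1.0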

JAB.

-- 
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.




