[gpfsug-discuss] Achieving high parallelism with AFM using NFS?

Sun Nov 13 14:18:38 GMT 2016

Hi all.

After some help from IBM, we’ve concluded (and been told) that AFM over the NSD protocol is effectively unusable when the round-trip time (RTT) exceeds roughly 50ms. We’ve now proven that ourselves, so it is time to stop treating the NSD protocol as a viable option in those conditions (unless IBM considers it worthy of an RFE and can fix it!).

The problem we face now is one of parallelism: filling a 10GbE/40GbE/100GbE pipe efficiently when using NFS as the transport provider for AFM.
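
As a rough illustration of why parallelism matters on this kind of link, here is a minimal sketch of the bandwidth-delay arithmetic. The 10 Gbit/s rate and 50ms RTT are the figures quoted in this post; the 4 MiB per-stream window is purely a hypothetical value chosen for the example, not something measured on these hosts or taken from AFM or NFS defaults.

```python
import math

def bdp_bytes(link_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return link_bits_per_s * rtt_s / 8

def stream_throughput_bits_per_s(window_bytes: float, rtt_s: float) -> float:
    """Upper bound for one window-limited stream: one window per round trip."""
    return window_bytes * 8 / rtt_s

def streams_needed(link_bits_per_s: float, window_bytes: float, rtt_s: float) -> int:
    """Number of such streams required to reach the link rate."""
    per_stream = stream_throughput_bits_per_s(window_bytes, rtt_s)
    return math.ceil(link_bits_per_s / per_stream)

LINK_BITS_PER_S = 10e9            # 10GbE, from the post
RTT_S = 0.050                     # ~50ms RTT, from the post
WINDOW_BYTES = 4 * 1024 * 1024    # ASSUMPTION: hypothetical 4 MiB window per stream

print(f"BDP: {bdp_bytes(LINK_BITS_PER_S, RTT_S) / 1e6:.1f} MB in flight")
print(f"Per-stream ceiling: "
      f"{stream_throughput_bits_per_s(WINDOW_BYTES, RTT_S) / 1e6:.0f} Mbit/s")
print(f"Streams needed: {streams_needed(LINK_BITS_PER_S, WINDOW_BYTES, RTT_S)}")
```

Under those assumptions about 62.5 MB has to be in flight to fill the pipe, so any single window-limited stream falls well short, and the real question becomes how many concurrent streams the transport will actually open.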

On my test cluster, at the “cache” side, I’ve got three nodes designated as gateways (mc-6, mc-7 and mc-8):

[root@mc-5 ~]# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         sdx-gpfs.xxxxxxxxxxxxxxxx
  GPFS cluster id:           12880500218013865782
  GPFS UID domain:           sdx-gpfs.xxxxxxxxxxxxxxxx
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name           IP address           Admin node name  Designation
------------------------------------------------------------------------------------------------
   1   mc-5.xxxxxxxxxxxxxxxx.net  ip.addresses.hidden  mc-5.hidden.net  quorum-manager
   2   mc-6.xxxxxxxxxxxxxxxx.net  ip.addresses.hidden  mc-6.hidden.net  quorum-manager-gateway
   3   mc-7.xxxxxxxxxxxxxxxx.net  ip.addresses.hidden  mc-7.hidden.net  quorum-manager-gateway
   4   mc-8.xxxxxxxxxxxxxxxx.net  ip.addresses.hidden  mc-8.hidden.net  quorum-manager-gateway

The bit I really don’t get is:

1. Why no traffic ever seems to go through mc-6 or mc-8 back to my “home” directly, and

2. Why my AFM-cache fileset is only ever listed as associated with one gateway (mc-7).

I can see traffic flowing through mc-6 sometimes…but when it does, it all seems to channel back through mc-7 THEN back to the AFM-home. Am I missing something?

This is where I see one of the gateways listed (but never the others):

[root@mc-5 ~]# mmafmctl afmcachefs getstate
Fileset Name    Fileset Target                                Cache State          Gateway Node    Queue Length   Queue numExec
------------    --------------                                -------------        ------------    ------------   -------------
afm-home        nfs://omnipath2/gpfs-flash/afm-home           Active               mc-7            0              746636
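
For anyone wanting to watch this over time, here is a minimal sketch that parses the getstate table above and groups filesets by the gateway they are reported on. It assumes the six-column layout shown here and that no field value contains spaces; the function names are made up for this example and are not part of any GPFS tooling.

```python
from collections import defaultdict

# Sample text copied from the getstate output in this post.
SAMPLE = """\
Fileset Name    Fileset Target                                Cache State          Gateway Node    Queue Length   Queue numExec
------------    --------------                                -------------        ------------    ------------   -------------
afm-home        nfs://omnipath2/gpfs-flash/afm-home           Active               mc-7            0              746636
"""

def parse_getstate(text):
    """Return one dict per data row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # ASSUMPTION: data rows have exactly six fields and a URL-style target.
        if len(fields) != 6 or "://" not in fields[1]:
            continue
        name, target, state, gateway, queue_len, num_exec = fields
        rows.append({
            "fileset": name,
            "target": target,
            "state": state,
            "gateway": gateway,
            "queue_length": int(queue_len),
            "queue_num_exec": int(num_exec),
        })
    return rows

def filesets_by_gateway(rows):
    """Map each gateway node to the list of filesets reported on it."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["gateway"]].append(row["fileset"])
    return dict(grouped)

for gateway, filesets in filesets_by_gateway(parse_getstate(SAMPLE)).items():
    print(gateway, filesets)
```

Run against the output above, this shows a single entry, mc-7, which matches what I describe: mc-6 and mc-8 never appear as the gateway for the fileset.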

I was told I needed to set up “explicit maps” back to my home cluster to achieve parallelism:

[root@mc-5 ~]# mmafmconfig show
Map name:             omnipath1
Export server map:    address.is.hidden.100/mc-6.ip.address.hidden

Map name:             omnipath2
Export server map:    address.is.hidden.101/mc-7.ip.address.hidden
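
One cross-check that can be done on the two outputs in this post is to see which map name the fileset target refers to and which gateways that map lists. The sketch below does only that comparison on the text as shown. It assumes each "Export server map" entry is a home-server/gateway pair (comma-separated if there were several) and that the host part of the nfs:// target is a map name; it does not claim to describe how AFM itself selects gateways.

```python
from urllib.parse import urlparse

# Sample text copied from the mmafmconfig output in this post.
MAPS_TEXT = """\
Map name:             omnipath1
Export server map:    address.is.hidden.100/mc-6.ip.address.hidden

Map name:             omnipath2
Export server map:    address.is.hidden.101/mc-7.ip.address.hidden
"""

TARGET = "nfs://omnipath2/gpfs-flash/afm-home"  # from the getstate output

def parse_maps(text):
    """Return {map_name: [(home_server, gateway), ...]}."""
    maps = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Map name":
            current = value
            maps[current] = []
        elif key == "Export server map" and current is not None:
            # ASSUMPTION: pairs look like home/gateway, comma-separated.
            for pair in value.split(","):
                home, _, gateway = pair.strip().partition("/")
                maps[current].append((home, gateway))
    return maps

def gateways_for_target(target, maps):
    """Gateways listed in the map named by the target's host part, if any."""
    name = urlparse(target).hostname
    return [gateway for _, gateway in maps.get(name, [])]

def unreferenced_maps(target, maps):
    """Map names that the target does not refer to."""
    name = urlparse(target).hostname
    return sorted(set(maps) - {name})

maps = parse_maps(MAPS_TEXT)
print("gateways for target:", gateways_for_target(TARGET, maps))
print("maps not referenced:", unreferenced_maps(TARGET, maps))
```

On the text above this reports that the target names omnipath2, whose map lists only the mc-7 gateway, and that omnipath1 (the map containing mc-6) is not referenced by the target. Whether that is the actual reason for the traffic pattern is exactly what I am hoping someone can confirm.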

But – I have never seen any traffic come back from mc-6 to omnipath1.

What am I missing, and how do I actually achieve significant enough parallelism over an NFS transport to fill my 10GbE pipe?

The best I’ve seen is maybe a couple of gigabits per second from the mc-7 host writing back to the omnipath2 host. That was while trying my level best to put as many files as possible onto the afm-cache at this side, hoping that enough threads would pick up enough different files to start transferring them over AFM simultaneously. What I’d really like is for large files (or small ones, up to the thresholds set) to be broken into parallel chunks and ALL transferred as fast as possible, utilising as much of the 10GbE as they can.

Maybe I am missing fundamental principles in the way AFM works?

Thanks.

-jc

PS: NB: The link is easily capable of sustaining close to 10Gbit/sec. We’ve tested it all the way up to about 9.67Gbit/sec transferring data between these sets of hosts using other protocols such as FDT and Globus GridFTP, et al.