[gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances
Scott Fadden
sfadden at us.ibm.com
Wed Nov 9 18:08:42 GMT 2016
Jake,
If AFM is using NFS it is all about NFS tuning. The copy from one side to
the other is basically just a client writing to an NFS mount. There are a
few things you can look at:
1. NFS transfer size (make it 1 MiB, which I think is the maximum).
2. TCP tuning for a large window size. This is discussed under "Tuning active
file management home communications" in the docs. On that page you will also
find discussion of increasing gateway threads and other similar settings that
may help (a rough sketch of both follows this list).
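As a rough sketch of both, something like the following on the cache-side
gateway node; the filesystem and fileset names are placeholders, the thread
count is only a starting point, and the AFM internal mount may or may not
show up in the standard NFS client tools:

# Check the rsize/wsize the NFS mount actually negotiated
# (1048576 bytes = 1 MiB is typically the maximum):
nfsstat -m

# Hypothetical example of raising per-fileset flush (gateway) threads;
# "fs0" and "cacheFileset" are placeholder names:
mmchfileset fs0 cacheFileset -p afmNumFlushThreads=16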
We can discuss further as I understand we will be meeting at SC16.
Scott Fadden
Spectrum Scale - Technical Marketing
Phone: (503) 880-5833
sfadden at us.ibm.com
http://www.ibm.com/systems/storage/spectrum/scale
From: Jake Carroll <jake.carroll at uq.edu.au>
To: "gpfsug-discuss at spectrumscale.org"
<gpfsug-discuss at spectrumscale.org>
Date: 11/09/2016 09:39 AM
Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO
over _really_ long distances
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hi.
I’ve got a GPFS-to-GPFS AFM cache/home (IW) relationship set up over a
really long distance: about 180ms of latency between the two clusters and
around 13,000km of optical path. Fortunately for me, I’ve actually got
near theoretical maximum IO over the NICs between the clusters and I’m
iperf’ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit, MTU 9000
all the way through.
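(For reference, the link tests were along these lines; the host name, stream
count and window size here are illustrative rather than the exact invocation:)

# Illustrative only: long-RTT throughput test with parallel streams and a large window
iperf3 -c home-gw.example.net -P 4 -w 256M -t 60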
Anyway – I’m finding my AFM traffic to be dragging its feet and I don’t
really understand why that might be. I’ve verified the link and the
transport’s capability, as I said above, with iperf and CERN’s FDT, to near
10Gbit/sec.
I also verified the clusters on both sides in terms of disk IO and they
both seem easily capable in IOZone and IOR tests of multiple GB/sec of
throughput.
So – my questions:
1. Are there very specific tunings AFM needs for high latency/long
distance IO?
2. Are there very specific NIC/TCP-stack tunings (beyond the type of
thing we already have in place) that benefit AFM over really long
distances and high latency?
3. We are seeing, on the “cache” side, really lazy/sticky “ls -als” in
the home mount. It sometimes takes 20 to 30 seconds before the command
line will report back with a long listing of files. Any ideas why it’d
take that long to get a response from “home”?
We’ve got our TCP stack set up fairly aggressively on all hosts that
participate in these two clusters.
# Turn off adaptive receive interrupt coalescing on the 10GbE NIC
ethtool -C enp2s0f0 adaptive-rx off
# Deeper transmit queue
ifconfig enp2s0f0 txqueuelen 10000
# Allow socket buffers up to 512 MiB
sysctl -w net.core.rmem_max=536870912
sysctl -w net.core.wmem_max=536870912
# Let TCP autotune receive/send windows up to 256 MiB
sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456"
sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456"
# Bigger backlog for packets queued off the NIC before processing
sysctl -w net.core.netdev_max_backlog=250000
# H-TCP congestion control, intended for high bandwidth-delay paths
sysctl -w net.ipv4.tcp_congestion_control=htcp
# Enable path MTU probing (MTU 9000 end to end)
sysctl -w net.ipv4.tcp_mtu_probing=1
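For context, the bandwidth-delay product on this path works out to roughly:

10 Gbit/s x 0.180 s = 1.8 Gbit, or about 225 MB in flight

so the 256 MiB tcp_rmem/tcp_wmem maximums above are only just above what a
single stream needs to keep the pipe full at this latency.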
I modified a couple of small things on the AFM “cache” side to see if it’d
make a difference such as:
mmchconfig afmNumWriteThreads=4
mmchconfig afmNumReadThreads=4
But no difference so far.
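If it helps, this is the sort of check I can run on the cache side to see
whether requests are actually queuing at the gateway (the filesystem and
fileset names are placeholders):

# Show AFM fileset state, gateway node and queue length
mmafmctl fs0 getstate -j cacheFileset
# Confirm the changed AFM settings are actually in effect
mmlsconfig | grep -i afm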
Thoughts would be appreciated. I’ve done this before over much shorter
distances (30Km) and I’ve flattened a 10GbE wire without really
tuning…anything. Is the combination of a large number of in-flight packets
and the long time-to-acknowledgement going to hurt here? I really thought
AFM might be well designed for exactly this kind of work at long distance
*and* high throughput – so I must be missing something!
-jc