[gpfsug-discuss] Multipath configurations

Orlando Richards orlando.richards at ed.ac.uk
Thu Sep 19 15:53:12 BST 2013

On 16/09/13 16:25, Orlando Richards wrote:
> Hi folks,
> We're building a new storage service and are planning on using
> multipathd rather than LSI's rdac to handle the multipathing.
> It's all working well, but I'm looking at settling on the final
> parameters for the multipath.conf. In particular, the values for:
>   * rr_min_io (1?)
>   * failback (I think "manual" or "followover"?)
>   * no_path_retry (guessing here - fail?)
>   * dev_loss_tmo (guessing here - 15?)
>   * fast_io_fail_tmo (guessing here - 10?)
> Does anyone have a working multipath.conf for LSI based storage systems
> (or others, for that matter), and/or have experience and wisdom to share
> on the above settings (and any others I may have missed?). Any war
> stories about dm-multipath to share?

Hi all,

Thanks for the feedback on all this. From that, and more digging and 
testing, we've settled on the following multipath.conf stanzas:

		path_grouping_policy group_by_prio
		prio	rdac
		path_checker	rdac
		path_selector	"round-robin 0"
		hardware_handler	"1 rdac"
		features	"2 pg_init_retries 50"
		# All "standard" up to here

		# Prevent ping-ponging of controllers, but
		# allow for automatic failback
		failback	followover
		# Massively accelerate the failure detection time
		# (default settings give ~30-90 seconds, this gives ~5s)
		fast_io_fail_tmo 5
		# Keep the /dev device entries in situ for 90 seconds,
		# in case of rapid recovery of paths
		dev_loss_tmo	90
		# Don't queue traffic down a failed path
		no_path_retry	fail
		# balance much more aggressively across the active paths
		rr_min_io	1

The primary goal was to have rapid and reliable failover in a cluster 
environment (without ping-ponging). The defaults from multipathd gave a 
30-90 second pause in I/O every time a path went away - we've managed to 
get it down to ~5s with the above settings.
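
One way to confirm that the fast_io_fail_tmo and dev_loss_tmo values
have actually been applied is to read them back from sysfs. The sketch
below assumes Fibre Channel attached storage, where the kernel exposes
both values per remote port under /sys/class/fc_remote_ports. The
function name show_fc_timeouts is made up for illustration.

```shell
#!/bin/sh
# Print the per-remote-port FC timeouts that multipathd is expected to set.
# Assumption: Fibre Channel HBAs, which expose these files under
# /sys/class/fc_remote_ports; other transports will print nothing.
show_fc_timeouts() {
    base="${1:-/sys/class/fc_remote_ports}"
    for rport in "$base"/rport-*; do
        # Skip the unexpanded glob when no remote ports exist
        [ -d "$rport" ] || continue
        printf '%s fast_io_fail_tmo=%s dev_loss_tmo=%s\n' \
            "$(basename "$rport")" \
            "$(cat "$rport/fast_io_fail_tmo" 2>/dev/null)" \
            "$(cat "$rport/dev_loss_tmo" 2>/dev/null)"
    done
}

show_fc_timeouts
```

With the settings above, each remote port would be expected to report
5 and 90 respectively once multipathd has picked up the configuration.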

Note that we've not tried this "in production" yet, but it has held up 
fine under heavy benchmark load.

Along the way we discovered an odd GPFS "feature". If some nodes in the
cluster use RDAC (and thus have /dev/sdXX devices) and some use
multipathd (and thus use /dev/dm-XX devices), then a node can either
fail to find attached NSD devices (for example, an RDAC host where the
NSDs were initially created on a multipath host) or try to talk to them
down the wrong device (for instance, /dev/sdXX rather than /dev/dm-XX).
We only set up this mixed environment to compare rdac vs dm-multipath,
and don't expect to put it into production, but it's the kind of thing
which could crop up in a system migrating from RDAC to dm-multipath, or
vice versa. It seems that on creation, the NSD is tagged somewhere as
either "dmm" (dm-multipath) or "generic" (rdac), and servers using one
type can't see the other.

We're testing a workaround for the "dm-multipath server accessing via
/dev/sdXX" case just now - create the following (executable, root-owned)
script as /var/mmfs/etc/nsddevices on the dm-multipath hosts:

# this script ensures that we are not using the raw /dev/sd* devices
# for GPFS but use the multipath /dev/dm-* devices instead
for dev in $( cat /proc/partitions | grep dm- | awk '{print $4}' )
do
     echo "$dev" generic
done

# skip the GPFS device discovery
exit 0

except change that simple "$dev generic" echo to one which says "$dev
dmm" or "$dev generic" depending on whether the device was created with
dm-multipath or rdac attached hosts. The reverse would likely also work
to get the rdac host to pick up the dm-multipath created NSDs (echo
$dev dmm, for the /dev/sdXX devices).
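
The substitution described above could be sketched by making the type
tag a parameter. This is an untested illustration, not the script we
are running: list_dm_devices and both of its arguments are hypothetical
names, and the tag passed in is one of the two values mentioned earlier
("dmm" or "generic").

```shell
#!/bin/sh
# Sketch: emit "<device> <type>" lines for every dm-* entry.
# list_dm_devices is a hypothetical helper, not part of GPFS.
#   $1 - partitions file to scan (defaults to /proc/partitions)
#   $2 - device type tag to emit (defaults to "generic")
list_dm_devices() {
    partitions="${1:-/proc/partitions}"
    devtype="${2:-generic}"
    # Column 4 of /proc/partitions holds the device name
    awk -v t="$devtype" '$4 ~ /^dm-/ { print $4, t }' "$partitions"
}
```

In nsddevices itself, a call such as "list_dm_devices /proc/partitions
generic" would still be followed by "exit 0", as in the script above,
so that GPFS skips its own device discovery.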

Thankfully, we have no plans to mix the environment - but for future 
reference it could be important (if ever migrating existing systems from 
rdac to dm-multipath, for instance).
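
Before any such migration it may help to check which device, and which
device type, each node is using for every NSD. "mmlsnsd -X" reports
this. The filter below is only a sketch: flag_raw_sd_nsds is a made-up
name, and the column order it relies on (disk name, volume ID, device,
devtype, node) is an assumption that should be checked against the
output on your GPFS release.

```shell
#!/bin/sh
# Flag NSDs that a node reaches through a raw /dev/sd* device.
# Reads "mmlsnsd -X"-style text on stdin. Assumed column order:
# disk name, NSD volume ID, device, devtype, node name.
flag_raw_sd_nsds() {
    awk '$3 ~ /^\/dev\/sd/ {
        printf "%s on %s uses %s (%s)\n", $1, $5, $3, $4
    }'
}

# Usage on a GPFS node:
#   mmlsnsd -X | flag_raw_sd_nsds
```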

    Dr Orlando Richards
   Information Services
IT Infrastructure Division
        Unix Section
     Tel: 0131 650 4994
   skype: orlando.richards

The University of Edinburgh is a charitable body, registered in 
Scotland, with registration number SC005336.
