[gpfsug-discuss] Problems with remote mount via routed IB

Zachary Mance zmance at ucar.edu
Tue Mar 13 19:38:48 GMT 2018


Hi Jan,

I am NOT using the pre-populated cache that Mellanox refers to in its
documentation. After chatting with support, I don't believe that's
necessary anymore (I didn't get a straight answer out of them).

For the subnet prefix, make sure to use one from the range
0xfec0000000000000-0xfec000000000001f.
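
In case it helps: the prefix in question is the subnet prefix configured in
the subnet manager of each IB fabric. A rough sketch of what that looks like
with opensm (assuming opensm is your subnet manager; the config file path
varies by distribution, and each subnet needs its own distinct value from
the range above):

    # /etc/opensm/opensm.conf on the subnet manager of IB subnet 1
    subnet_prefix 0xfec0000000000000

    # /etc/opensm/opensm.conf on the subnet manager of IB subnet 2
    subnet_prefix 0xfec0000000000001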

---------------------------------------------------------------------------------------------------------------
Zach Mance  zmance at ucar.edu  (303) 497-1883

HPC Data Infrastructure Group / CISL / NCAR
---------------------------------------------------------------------------------------------------------------

On Tue, Mar 13, 2018 at 9:24 AM, Jan Erik Sundermann <jan.sundermann at kit.edu> wrote:

> Hello Zachary
>
> We are currently changing our setup to have IP over IB on all machines so
> that we can enable verbsRdmaCm.
>
> According to Mellanox (https://community.mellanox.com/docs/DOC-2384)
> ibacm requires pre-populated caches to be distributed to all end hosts with
> the mapping of IP to the routable GIDs (of both IB subnets). Was this also
> required in your successful deployment?
>
> Best
> Jan Erik
>
>
>
> On 03/12/2018 11:10 PM, Zachary Mance wrote:
>
>> Since I am testing out remote mounting with EDR IB routers, I'll add to
>> the discussion.
>>
>> In my lab environment I was seeing the same RDMA connections being
>> established and then disconnected shortly after. The remote filesystem
>> would eventually mount on the clients, but it took quite a while
>> (~2 mins). Even after mounting, accessing files or any metadata operations
>> would take a while to execute, but eventually they completed.
>>
>> After enabling verbsRdmaCm, everything mounted just fine and in a timely
>> manner. Spectrum Scale was using the librdmacm.so library.
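>>
>> (If you want to confirm which library the daemon picked up, one generic
>> way, not GPFS-specific and assuming a single mmfsd process, is to look at
>> its loaded shared objects:
>>
>>     grep librdmacm /proc/$(pidof mmfsd)/maps
>>
>> A matching line means librdmacm.so is mapped into the daemon.)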
>>
>> I would first double-check that both clusters are able to talk to each
>> other on their IPoIB addresses, then make sure you enable verbsRdmaCm on
>> both clusters.
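>>
>> Roughly, those two checks look like this (192.168.12.5 is just the client
>> IP from the logs below; substitute your own IPoIB addresses and run the
>> ping in both directions):
>>
>>     # from a cluster 1 node, verify IPoIB reachability of a cluster 2 node
>>     ping -c 3 192.168.12.5
>>
>>     # on both clusters, see whether RDMA CM is already enabled
>>     mmlsconfig verbsRdmaCm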
>>
>>
>> ---------------------------------------------------------------------------------------------------------------
>> Zach Mance  zmance at ucar.edu  (303) 497-1883
>> HPC Data Infrastructure Group / CISL / NCAR
>> ---------------------------------------------------------------------------------------------------------------
>>
>> On Thu, Mar 1, 2018 at 1:41 AM, John Hearns <john.hearns at asml.com> wrote:
>>
>>     In reply to Stuart,
>>     our setup is entirely Infiniband. We boot and install over IB, and
>>     rely heavily on IP over Infiniband.
>>
>>     As for users being 'confused' due to multiple IPs, I would
>>     appreciate some more depth on that one.
>>     Sure, all batch systems are sensitive to hostnames (as I know to my
>>     cost!), but once you get that straightened out, why should users care?
>>     I am not being aggressive, just keen to find out more.
>>
>>
>>
>>     -----Original Message-----
>>     From: gpfsug-discuss-bounces at spectrumscale.org
>>     [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of
>>     Stuart Barkley
>>     Sent: Wednesday, February 28, 2018 6:50 PM
>>     To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>>     Subject: Re: [gpfsug-discuss] Problems with remote mount via routed IB
>>
>>     The problem with CM is that it seems to require configuring IP over
>>     Infiniband.
>>
>>     I'm rather strongly opposed to IP over IB.  We did run IPoIB years
>>     ago, but pulled it out of our environment as adding unneeded
>>     complexity.  It requires provisioning IP addresses across the
>>     Infiniband infrastructure and possibly adding routers to other
>>     portions of the IP infrastructure.  It was also confusing some users
>>     due to multiple IPs on the compute infrastructure.
>>
>>     We have recently been in discussions with a vendor about their
>>     support for GPFS over IB and they kept directing us to use CM
>>     (which still didn't work).  CM wasn't necessary once we found out
>>     about the actual problem (we needed the undocumented
>>     verbsRdmaUseGidIndexZero configuration option among other things due
>>     to their use of SR-IOV based virtual IB interfaces).
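>>
>>     (For reference, that option is set like any other GPFS configuration
>>     parameter; the yes value below is an assumption on my part since the
>>     option is undocumented, and as usual it only takes effect after GPFS
>>     is restarted on the affected nodes:
>>
>>         mmchconfig verbsRdmaUseGidIndexZero=yes
>>
>>     Check with support before relying on it.)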
>>
>>     We don't use routed Infiniband and it might be that CM and IPoIB are
>>     required for IB routing, but I doubt it.  It sounds like the OP is
>>     keeping IB and IP infrastructure separate.
>>
>>     Stuart Barkley
>>
>>     On Mon, 26 Feb 2018 at 14:16 -0000, Aaron Knister wrote:
>>
>>      > Date: Mon, 26 Feb 2018 14:16:34
>>      > From: Aaron Knister <aaron.s.knister at nasa.gov>
>>      > Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>>      > To: gpfsug-discuss at spectrumscale.org
>>      > Subject: Re: [gpfsug-discuss] Problems with remote mount via routed IB
>>      >
>>      > Hi Jan Erik,
>>      >
>>      > It was my understanding that the IB hardware router required RDMA
>>      > CM to work. By default GPFS doesn't use the RDMA Connection Manager
>>      > but it can be enabled (e.g. verbsRdmaCm=enable). I think this
>>      > requires a restart on clients/servers (in both clusters) to take
>>      > effect. Maybe someone else on the list can comment in more detail --
>>      > I've been told folks have successfully deployed IB routers with GPFS.
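>>      >
>>      > A rough sketch of that, assuming you can tolerate restarting GPFS on
>>      > all nodes at once (otherwise use -N to do it in batches):
>>      >
>>      >     # run in each cluster
>>      >     mmchconfig verbsRdmaCm=enable
>>      >     mmshutdown -a    # stop GPFS on all nodes
>>      >     mmstartup -a     # start it again so the new setting takes effect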
>>      >
>>      > -Aaron
>>      >
>>      > On 2/26/18 11:38 AM, Sundermann, Jan Erik (SCC) wrote:
>>      > >
>>      > > Dear all
>>      > >
>>      > > we are currently trying to remote mount a file system in a routed
>>      > > Infiniband test setup and face problems with dropped RDMA
>>      > > connections. The setup is the
>>      > > following:
>>      > >
>>      > > - Spectrum Scale Cluster 1 is set up on four servers which are
>>      > > connected to the same Infiniband network. Additionally they are
>>      > > connected to a fast Ethernet network providing IP communication
>>      > > in the network 192.168.11.0/24.
>>      > >
>>      > > - Spectrum Scale Cluster 2 is set up on four additional servers
>>      > > which are connected to a second Infiniband network. These servers
>>      > > have IPs on their IB interfaces in the network 192.168.12.0/24.
>>      > >
>>      > > - IP is routed between 192.168.11.0/24 and 192.168.12.0/24 on a
>>      > > dedicated machine.
>>      > >
>>      > > - We have a dedicated IB hardware router connected to both IB
>>      > > subnets.
>>      > >
>>      > >
>>      > > We tested that the routing, both IP and IB, is working between
>>      > > the two clusters without problems, and that RDMA is working fine
>>      > > for internal communication inside both cluster 1 and cluster 2.
>>      > >
>>      > > When trying to remote mount a file system from cluster 1 in
>>      > > cluster 2, RDMA communication is not working as expected. Instead
>>      > > we see error messages on the remote host (cluster 2):
>>      > >
>>      > >
>>      > > 2018-02-23_13:48:47.037+0100: [I] VERBS RDMA connecting to 192.168.11.4 (iccn004-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 index 2
>>      > > 2018-02-23_13:48:49.890+0100: [I] VERBS RDMA connected to 192.168.11.4 (iccn004-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 2
>>      > > 2018-02-23_13:48:53.138+0100: [E] VERBS RDMA closed connection to 192.168.11.1 (iccn001-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 error 733 index 3
>>      > > 2018-02-23_13:48:53.854+0100: [I] VERBS RDMA connecting to 192.168.11.1 (iccn001-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 index 3
>>      > > 2018-02-23_13:48:54.954+0100: [E] VERBS RDMA closed connection to 192.168.11.3 (iccn003-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 error 733 index 1
>>      > > 2018-02-23_13:48:55.601+0100: [I] VERBS RDMA connected to 192.168.11.1 (iccn001-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > > 2018-02-23_13:48:57.775+0100: [I] VERBS RDMA connecting to 192.168.11.3 (iccn003-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 index 1
>>      > > 2018-02-23_13:48:59.557+0100: [I] VERBS RDMA connected to 192.168.11.3 (iccn003-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 1
>>      > > 2018-02-23_13:48:59.876+0100: [E] VERBS RDMA closed connection to 192.168.11.2 (iccn002-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 error 733 index 0
>>      > > 2018-02-23_13:49:02.020+0100: [I] VERBS RDMA connecting to 192.168.11.2 (iccn002-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 index 0
>>      > > 2018-02-23_13:49:03.477+0100: [I] VERBS RDMA connected to 192.168.11.2 (iccn002-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 0
>>      > > 2018-02-23_13:49:05.119+0100: [E] VERBS RDMA closed connection to 192.168.11.4 (iccn004-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 error 733 index 2
>>      > > 2018-02-23_13:49:06.191+0100: [I] VERBS RDMA connecting to 192.168.11.4 (iccn004-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 index 2
>>      > > 2018-02-23_13:49:06.548+0100: [I] VERBS RDMA connected to 192.168.11.4 (iccn004-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 2
>>      > > 2018-02-23_13:49:11.578+0100: [E] VERBS RDMA closed connection to 192.168.11.1 (iccn001-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 error 733 index 3
>>      > > 2018-02-23_13:49:11.937+0100: [I] VERBS RDMA connecting to 192.168.11.1 (iccn001-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 index 3
>>      > > 2018-02-23_13:49:11.939+0100: [I] VERBS RDMA connected to 192.168.11.1 (iccn001-gpfs in gpfsstorage.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > >
>>      > >
>>      > > and in the cluster with the file system (cluster 1):
>>      > >
>>      > > 2018-02-23_13:47:36.112+0100: [E] VERBS RDMA rdma read error IBV_WC_RETRY_EXC_ERR to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 vendor_err 129
>>      > > 2018-02-23_13:47:36.112+0100: [E] VERBS RDMA closed connection to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 due to RDMA read error IBV_WC_RETRY_EXC_ERR index 3
>>      > > 2018-02-23_13:47:47.161+0100: [I] VERBS RDMA accepted and connected to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > > 2018-02-23_13:48:04.317+0100: [E] VERBS RDMA rdma read error IBV_WC_RETRY_EXC_ERR to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 vendor_err 129
>>      > > 2018-02-23_13:48:04.317+0100: [E] VERBS RDMA closed connection to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 due to RDMA read error IBV_WC_RETRY_EXC_ERR index 3
>>      > > 2018-02-23_13:48:11.560+0100: [I] VERBS RDMA accepted and connected to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > > 2018-02-23_13:48:32.523+0100: [E] VERBS RDMA rdma read error IBV_WC_RETRY_EXC_ERR to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 vendor_err 129
>>      > > 2018-02-23_13:48:32.523+0100: [E] VERBS RDMA closed connection to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 due to RDMA read error IBV_WC_RETRY_EXC_ERR index 3
>>      > > 2018-02-23_13:48:35.398+0100: [I] VERBS RDMA accepted and connected to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > > 2018-02-23_13:48:53.135+0100: [E] VERBS RDMA rdma read error IBV_WC_RETRY_EXC_ERR to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 vendor_err 129
>>      > > 2018-02-23_13:48:53.135+0100: [E] VERBS RDMA closed connection to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 due to RDMA read error IBV_WC_RETRY_EXC_ERR index 3
>>      > > 2018-02-23_13:48:55.600+0100: [I] VERBS RDMA accepted and connected to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > > 2018-02-23_13:49:11.577+0100: [E] VERBS RDMA rdma read error IBV_WC_RETRY_EXC_ERR to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 vendor_err 129
>>      > > 2018-02-23_13:49:11.577+0100: [E] VERBS RDMA closed connection to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 due to RDMA read error IBV_WC_RETRY_EXC_ERR index 3
>>      > > 2018-02-23_13:49:11.939+0100: [I] VERBS RDMA accepted and connected to 192.168.12.5 (iccn005-ib in gpfsremoteclients.localdomain) on mlx4_0 port 1 fabnum 0 sl 0 index 3
>>      > >
>>      > >
>>      > >
>>      > > Any advice on how to configure the setup in a way that would
>>      > > allow the remote mount via routed IB would be much appreciated.
>>      > >
>>      > >
>>      > > Thank you and best regards
>>      > > Jan Erik
>>      > >
>>      > >
>>      > >
>>      > >
>>      > > _______________________________________________
>>      > > gpfsug-discuss mailing list
>>      > > gpfsug-discuss at spectrumscale.org
>>      > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>      > >
>>     >
>>     > --
>>     > Aaron Knister
>>     > NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight
>>     > Center
>>     > (301) 286-2776
>>     > _______________________________________________
>>     > gpfsug-discuss mailing list
>>     > gpfsug-discuss at spectrumscale.org
>>     > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>     >
>>
>>     --
>>     I've never been lost; I was once bewildered for three days, but
>>     never lost!
>>                                              --  Daniel Boone
>>     _______________________________________________
>>     gpfsug-discuss mailing list
>>     gpfsug-discuss at spectrumscale.org
>>     http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>     _______________________________________________
>>     gpfsug-discuss mailing list
>>     gpfsug-discuss at spectrumscale.org
>>     http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>
>>
>>
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at spectrumscale.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>
>>
> --
>
> Karlsruhe Institute of Technology (KIT)
> Steinbuch Centre for Computing (SCC)
>
> Jan Erik Sundermann
>
> Hermann-von-Helmholtz-Platz 1, Building 449, Room 226
> D-76344 Eggenstein-Leopoldshafen
>
> Tel: +49 721 608 26191
> Email: jan.sundermann at kit.edu
> www.scc.kit.edu
>
> KIT – The Research University in the Helmholtz Association
>
> Since 2010, KIT has been certified as a family-friendly university.
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>