[gpfsug-discuss] VERBS RDMA issue

Tushar Pathare tpathare at sidra.org
Sun May 21 09:40:42 BST 2017


Hello Team,

We are seeing a large number of waiters with the message "waiting for conn rdmas < conn maxrdmas" <https://www.mail-archive.com/search?l=gpfsug-discuss@spectrumscale.org&q=subject:%22Re%5C%3A+%5C%5Bgpfsug%5C-discuss%5C%5D+waiting+for+conn+rdmas+%3C+conn+maxrdmas%22&o=newest>.

Are there any recommended settings to resolve this issue?
Our RDMA configuration for 140 nodes (32 cores each) is as follows:


VERBS RDMA Configuration:
  Status                              : started
  Start time                          : Thu
  Stats reset time                    : Thu
  Dump time                           : Sun
  mmfs verbsRdma                      : enable
  mmfs verbsRdmaCm                    : disable
  mmfs verbsPorts                     : mlx4_0/1 mlx4_0/2
  mmfs verbsRdmasPerNode              : 3200
  mmfs verbsRdmasPerNode (max)        : 3200
  mmfs verbsRdmasPerNodeOptimize      : yes
  mmfs verbsRdmasPerConnection        : 16
  mmfs verbsRdmasPerConnection (max)  : 16
  mmfs verbsRdmaMinBytes              : 16384
  mmfs verbsRdmaRoCEToS               : -1
  mmfs verbsRdmaQpRtrMinRnrTimer      : 18
  mmfs verbsRdmaQpRtrPathMtu          : 2048
  mmfs verbsRdmaQpRtrSl               : 0
  mmfs verbsRdmaQpRtrSlDynamic        : no
  mmfs verbsRdmaQpRtrSlDynamicTimeout : 10
  mmfs verbsRdmaQpRtsRnrRetry         : 6
  mmfs verbsRdmaQpRtsRetryCnt         : 6
  mmfs verbsRdmaQpRtsTimeout          : 18
  mmfs verbsRdmaMaxSendBytes          : 16777216
  mmfs verbsRdmaMaxSendSge            : 27
  mmfs verbsRdmaSend                  : yes
  mmfs verbsRdmaSerializeRecv         : no
  mmfs verbsRdmaSerializeSend         : no
  mmfs verbsRdmaUseMultiCqThreads     : yes
  mmfs verbsSendBufferMemoryMB        : 1024
  mmfs verbsLibName                   : libibverbs.so
  mmfs verbsRdmaCmLibName             : librdmacm.so
  mmfs verbsRdmaMaxReconnectInterval  : 60
  mmfs verbsRdmaMaxReconnectRetries   : -1
  mmfs verbsRdmaReconnectAction       : disable
  mmfs verbsRdmaReconnectThreads      : 32
  mmfs verbsHungRdmaTimeout           : 90
  ibv_fork_support                    : true
  Max connections                     : 196608
  Max RDMA size                       : 16777216
  Target number of vsend buffs        : 16384
  Initial vsend buffs per conn        : 59
  nQPs                                : 140
  nCQs                                : 282
  nCMIDs                              : 0
  nDtoThreads                         : 2
  nextIndex                           : 141
  Number of Devices opened            : 1
    Device                            : mlx4_0
      vendor_id                       : 713
      Device vendor_part_id           : 4099
      Device mem register chunk       : 8589934592 (0x200000000)
      Device max_sge                  : 32
      Adjusted max_sge                : 0
      Adjusted max_sge vsend          : 30
      Device max_qp_wr                : 16351
      Device max_qp_rd_atom           : 16
      Open Connect Ports              : 1
        verbsConnectPorts[0]          : mlx4_0/1/0
          lid                         : 129
          state                       : IBV_PORT_ACTIVE
          path_mtu                    : 2048
          interface ID                : 0xe41d2d030073b9d1
          sendChannel.ib_channel      : 0x7FA6CB816200
          sendChannel.dtoThreadP      : 0x7FA6CB821870
          sendChannel.dtoThreadId     : 12540
          sendChannel.nFreeCq         : 1
          recvChannel.ib_channel      : 0x7FA6CB81D590
          recvChannel.dtoThreadP      : 0x7FA6CB822BA0
          recvChannel.dtoThreadId     : 12541
          recvChannel.nFreeCq         : 1
          ibv_cq                      : 0x7FA2724C81F8
          ibv_cq.cqP                  : 0x0
          ibv_cq.nEvents              : 0
          ibv_cq.contextP             : 0x0
          ibv_cq.ib_channel           : 0x0
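
For reference, the two limits in the dump that seem most relevant to this waiter can be compared with a small script. This is only a rough sketch: the helper names parse_verbs_dump and rdma_headroom are made up for illustration, the sample text is copied from three lines of the dump above, and treating nQPs times verbsRdmasPerConnection as the worst-case per-node demand is an assumption, not documented GPFS behaviour.

```python
# Sample lines copied from the dump above (not the full output).
DUMP = """\
  mmfs verbsRdmasPerNode              : 3200
  mmfs verbsRdmasPerConnection        : 16
  nQPs                                : 140
"""


def parse_verbs_dump(text):
    """Parse 'key : value' lines into a dict, dropping the 'mmfs ' prefix."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # headings and blank lines carry no setting
        key = key.strip()
        if key.startswith("mmfs "):
            key = key[len("mmfs "):]
        settings[key] = value.strip()
    return settings


def rdma_headroom(settings):
    """Compare the per-connection limit times nQPs with the per-node limit.

    Assumption: every queue pair could use its full per-connection
    allowance at the same time. Real traffic is unlikely to do that.
    """
    per_conn = int(settings["verbsRdmasPerConnection"])
    per_node = int(settings["verbsRdmasPerNode"])
    n_qps = int(settings["nQPs"])
    worst_case = per_conn * n_qps
    return {
        "per_conn": per_conn,
        "per_node": per_node,
        "worst_case": worst_case,
        "fits": worst_case <= per_node,
    }


if __name__ == "__main__":
    print(rdma_headroom(parse_verbs_dump(DUMP)))
```

With the values above, 140 connections at 16 RDMAs each come to 2240, which is below the per-node limit of 3200. If that assumption holds, the per-connection limit of 16 (already at its reported maximum) would be the one the waiters are hitting, rather than the per-node limit.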

Thanks


Tushar B Pathare MBA IT,BE IT
Bigdata & GPFS
Software Development & Databases
Scientific Computing
Bioinformatics Division
Research

"What ever the mind of man can conceive and believe, drill can query"

Sidra Medical and Research Centre
Sidra OPC Building
Sidra Medical & Research Center
PO Box 26999
Al Luqta Street
Education City North Campus
Qatar Foundation, Doha, Qatar
Office 4003 3333 ext 37443 | M +974 74793547
tpathare at sidra.org | www.sidra.org
