<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Title" content="">
<meta name="Keywords" content="">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Arial;
panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
{font-family:Mangal;
panose-1:2 4 5 3 5 2 3 3 2 2;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Helvetica Neue";
panose-1:2 0 5 3 0 0 0 2 0 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.null, li.null, div.null
{mso-style-name:null;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman";}
span.null1
{mso-style-name:null1;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:Calibri;
color:windowtext;}
span.msoIns
{mso-style-type:export-only;
mso-style-name:"";
text-decoration:underline;
color:teal;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:595.0pt 842.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Hello Aaron,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Yes, we recently saw an issue with:
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">VERBS RDMA rdma send error IBV_WC_RETRY_EXC_ERR to 111.11.11.11 (sidra.nnode_group2.gpfs) on mlx5_0 port 2 fabnum 0 vendor_err 129 <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">And <o:p>
</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:Arial;color:#AC7F00">Tushar B Pathare MBA IT,BE IT<o:p></o:p></span></b></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Bigdata & GPFS<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Software Development & Databases<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Scientific Computing</span><span style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Bioinformatics Division</span><span style="font-size:11.0pt;color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Research</span><span style="font-size:11.0pt;color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:black"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:#AFABAB">"What ever the mind of man can conceive and believe, drill can query"<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:black"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:Arial;color:#AC7F00">Sidra Medical and Research Centre</span></b><span style="font-size:11.0pt;color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:Arial;color:#AC7F00">Sidra OPC Building</span></b><span style="font-size:11.0pt;color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Sidra Medical & Research Center<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">PO Box 26999<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Al Luqta Street<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Education City North Campus<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Qatar Foundation, Doha, Qatar<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:Arial;color:gray">Office 4003 3333 ext 37443 | M +974 74793547</span><span style="font-size:11.0pt;color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="FR" style="font-size:10.0pt;font-family:Arial;color:gray"><a href="mailto:tpathare@sidra.org"><span style="color:purple">tpathare@sidra.org</span></a> | </span><span lang="FR" style="font-size:10.0pt;font-family:Arial;color:blue"><a href="http://www.sidra.org/"><span style="color:purple">www.sidra.org</span></a></span><span style="font-size:11.0pt;color:black"><o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From: </span>
</b><span style="font-family:Calibri;color:black">&lt;gpfsug-discuss-bounces@spectrumscale.org&gt; on behalf of "Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]" &lt;aaron.s.knister@nasa.gov&gt;<br>
<b>Reply-To: </b>gpfsug main discussion list &lt;gpfsug-discuss@spectrumscale.org&gt;<br>
<b>Date: </b>Sunday, May 21, 2017 at 11:59 AM<br>
<b>To: </b>gpfsug main discussion list &lt;gpfsug-discuss@spectrumscale.org&gt;<br>
<b>Subject: </b>Re: [gpfsug-discuss] VERBS RDMA issue<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Hi Tushar, <o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">For me the issue was an underlying performance bottleneck (some CPU frequency scaling problems causing cores to throttle back when it wasn't appropriate). <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">I noticed you have <span style="font-size:13.0pt;font-family:'Helvetica Neue';color:#494949;background:white">verbsRdmaSend set to yes. I've seen suggestions in the past to turn this off under certain conditions, although I don't remember
what those were. Hopefully others can chime in and qualify that. </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:13.0pt;font-family:'Helvetica Neue';color:#494949;background:white"><br>
<br>
</span><o:p></o:p></p>
</div>
<p class="MsoNormal"><span style="font-family:'Helvetica Neue';color:#494949;background:white">Are you seeing any RDMA errors in your logs? (e.g. grep IBV_ out of the mmfs.log). </span><o:p></o:p></p>
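<p class="MsoNormal"><span style="font-family:'Helvetica Neue';color:#494949;background:white">As a rough sketch of that log check, assuming Python is handy: the helper name count_ibv_errors is made up for illustration, and it tallies every IBV_ token it finds (so port states such as IBV_PORT_ACTIVE would be counted too), not only errors. It takes any iterable of log lines; the location of mmfs.log on your nodes is something you would need to fill in yourself.</span><o:p></o:p></p>
<pre>
```python
import re
from collections import Counter

# Matches any token starting with IBV_, e.g. IBV_WC_RETRY_EXC_ERR.
# This mirrors a plain "grep IBV_" and is an illustrative choice, not a GPFS API.
IBV_TOKEN = re.compile(r"IBV_[A-Z0-9_]+")


def count_ibv_errors(lines):
    """Return a Counter of IBV_ tokens seen across the given log lines.

    Hypothetical helper for illustration; 'lines' can be an open file
    object or a list of strings.
    """
    counts = Counter()
    for line in lines:
        counts.update(IBV_TOKEN.findall(line))
    return counts
```
</pre>
<p class="MsoNormal"><span style="font-family:'Helvetica Neue';color:#494949;background:white">Passing an open log file to count_ibv_errors and printing most_common() on the result should show which status dominates, which may help tell a retry-exceeded pattern apart from occasional noise.</span><o:p></o:p></p>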
<div>
<p class="MsoNormal"><span style="font-size:13.0pt;font-family:'Helvetica Neue';color:#494949;background:white"><br>
<br>
</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:13.0pt;font-family:'Helvetica Neue';color:#494949;background:white">-Aaron</span><o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
<br>
<br>
<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal">On May 21, 2017 at 04:41:00 EDT, Tushar Pathare &lt;tpathare@sidra.org&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid windowtext 1.0pt;padding:0cm 0cm 0cm 8.0pt;margin-left:0cm;margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<div>
<div>
<p class="null"><span class="null1">Hello Team,</span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
<p class="null"><span class="null1">We are seeing a lot of waiter messages related to
<b><a href="https://www.mail-archive.com/search?l=gpfsug-discuss@spectrumscale.org&q=subject:%22Re%5C%3A+%5C%5Bgpfsug%5C-discuss%5C%5D+waiting+for+conn+rdmas+%3C+conn+maxrdmas%22&o=newest">waiting for conn rdmas &lt; conn maxrdmas</a></b></span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
<p class="null"><span class="null1">Are there any recommended settings to resolve this issue?</span><o:p></o:p></p>
<p class="null"><span class="null1">Our RDMA config for 140 nodes (32 cores each) is as follows:</span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
<p class="null"><span class="null1">VERBS RDMA Configuration:</span><o:p></o:p></p>
<p class="null"><span class="null1"> Status : started</span><o:p></o:p></p>
<p class="null"><span class="null1"> Start time : Thu </span>
<o:p></o:p></p>
<p class="null"><span class="null1"> Stats reset time : Thu </span>
<o:p></o:p></p>
<p class="null"><span class="null1"> Dump time : Sun</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdma : enable</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaCm : disable</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsPorts : mlx4_0/1 mlx4_0/2</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmasPerNode : 3200</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmasPerNode (max) : 3200</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmasPerNodeOptimize : yes</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmasPerConnection : 16</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmasPerConnection (max) : 16</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaMinBytes : 16384</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaRoCEToS : -1</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtrMinRnrTimer : 18</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtrPathMtu : 2048</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtrSl : 0</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtrSlDynamic : no</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtrSlDynamicTimeout : 10</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtsRnrRetry : 6</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtsRetryCnt : 6</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaQpRtsTimeout : 18</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaMaxSendBytes : 16777216</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaMaxSendSge : 27</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaSend : yes</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaSerializeRecv : no</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaSerializeSend : no</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaUseMultiCqThreads : yes</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsSendBufferMemoryMB : 1024</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsLibName : libibverbs.so</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaCmLibName : librdmacm.so</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaMaxReconnectInterval : 60</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaMaxReconnectRetries : -1</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaReconnectAction : disable</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsRdmaReconnectThreads : 32</span><o:p></o:p></p>
<p class="null"><span class="null1"> mmfs verbsHungRdmaTimeout : 90</span><o:p></o:p></p>
<p class="null"><span class="null1"> ibv_fork_support : true</span><o:p></o:p></p>
<p class="null"><span class="null1"> Max connections : 196608</span><o:p></o:p></p>
<p class="null"><span class="null1"> Max RDMA size : 16777216</span><o:p></o:p></p>
<p class="null"><span class="null1"> Target number of vsend buffs : 16384</span><o:p></o:p></p>
<p class="null"><span class="null1"> Initial vsend buffs per conn : 59</span><o:p></o:p></p>
<p class="null"><span class="null1"> nQPs : 140</span><o:p></o:p></p>
<p class="null"><span class="null1"> nCQs : 282</span><o:p></o:p></p>
<p class="null"><span class="null1"> nCMIDs : 0</span><o:p></o:p></p>
<p class="null"><span class="null1"> nDtoThreads : 2</span><o:p></o:p></p>
<p class="null"><span class="null1"> nextIndex : 141</span><o:p></o:p></p>
<p class="null"><span class="null1"> Number of Devices opened : 1</span><o:p></o:p></p>
<p class="null"><span class="null1"> Device : mlx4_0</span><o:p></o:p></p>
<p class="null"><span class="null1"> vendor_id : 713</span><o:p></o:p></p>
<p class="null"><span class="null1"> Device vendor_part_id : 4099</span><o:p></o:p></p>
<p class="null"><span class="null1"> Device mem register chunk : 8589934592 (0x200000000)</span><o:p></o:p></p>
<p class="null"><span class="null1"> Device max_sge : 32</span><o:p></o:p></p>
<p class="null"><span class="null1"> Adjusted max_sge : 0</span><o:p></o:p></p>
<p class="null"><span class="null1"> Adjusted max_sge vsend : 30</span><o:p></o:p></p>
<p class="null"><span class="null1"> Device max_qp_wr : 16351</span><o:p></o:p></p>
<p class="null"><span class="null1"> Device max_qp_rd_atom : 16</span><o:p></o:p></p>
<p class="null"><span class="null1"> Open Connect Ports : 1</span><o:p></o:p></p>
<p class="null"><span class="null1"> verbsConnectPorts[0] : mlx4_0/1/0</span><o:p></o:p></p>
<p class="null"><span class="null1"> lid : 129</span><o:p></o:p></p>
<p class="null"><span class="null1"> state : IBV_PORT_ACTIVE</span><o:p></o:p></p>
<p class="null"><span class="null1"> path_mtu : 2048</span><o:p></o:p></p>
<p class="null"><span class="null1"> interface ID : 0xe41d2d030073b9d1</span><o:p></o:p></p>
<p class="null"><span class="null1"> sendChannel.ib_channel : 0x7FA6CB816200</span><o:p></o:p></p>
<p class="null"><span class="null1"> sendChannel.dtoThreadP : 0x7FA6CB821870</span><o:p></o:p></p>
<p class="null"><span class="null1"> sendChannel.dtoThreadId : 12540</span><o:p></o:p></p>
<p class="null"><span class="null1"> sendChannel.nFreeCq : 1</span><o:p></o:p></p>
<p class="null"><span class="null1"> recvChannel.ib_channel : 0x7FA6CB81D590</span><o:p></o:p></p>
<p class="null"><span class="null1"> recvChannel.dtoThreadP : 0x7FA6CB822BA0</span><o:p></o:p></p>
<p class="null"><span class="null1"> recvChannel.dtoThreadId : 12541</span><o:p></o:p></p>
<p class="null"><span class="null1"> recvChannel.nFreeCq : 1</span><o:p></o:p></p>
<p class="null"><span class="null1"> ibv_cq : 0x7FA2724C81F8</span><o:p></o:p></p>
<p class="null"><span class="null1"> ibv_cq.cqP : 0x0</span><o:p></o:p></p>
<p class="null"><span class="null1"> ibv_cq.nEvents : 0</span><o:p></o:p></p>
<p class="null"><span class="null1"> ibv_cq.contextP : 0x0</span><o:p></o:p></p>
<p class="null"><span class="null1"> ibv_cq.ib_channel : 0x0</span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
<p class="null"><span class="null1">Thanks</span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
<p class="null"><span class="null1"> </span><o:p></o:p></p>
</div>
<p class="MsoNormal">Disclaimer: This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient, any reading, printing, storage, disclosure, copying or
any other action taken in respect of this e-mail is prohibited and may be unlawful. If you are not the intended recipient, please notify the sender immediately by using the reply function and then permanently delete what you have received. Any views or opinions
expressed are solely those of the author and do not necessarily represent those of Sidra Medical and Research Center.
<o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</body>
</html>