<div><div><div>Fri, 5 Jun 2020 at 15:53, Giovanni Bracco <<a href="mailto:giovanni.bracco@enea.it" target="_blank">giovanni.bracco@enea.it</a>> wrote:<br></div></div><div><div><div class="gmail_quote"></div></div></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)">answer in the text<br>

<br>

On 05/06/20 14:58, Jan-Frode Myklebust wrote:<br>

> <br>

> Could maybe be interesting to drop the NSD servers, and let all nodes <br>

> access the storage via SRP?<br>

<br>

No, we cannot: the production fabric is a mix of a QDR-based <br>

cluster and an OPA-based cluster, and the NSD nodes serve both.<br>

</blockquote><div dir="auto"><br></div></div></div><div><div><div dir="auto">You could potentially still use SRP from the QDR nodes and go via NSD only for your Omni-Path nodes. Going via NSD seems like a somewhat pointless indirection.</div></div><div><div><div class="gmail_quote"></div></div></div></div><div><div dir="auto"><br></div><div dir="auto"><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"><br>

> <br>

> Maybe turn off readahead, since it can cause performance degradation <br>

> when GPFS reads 1 MB blocks scattered on the NSDs, so that read-ahead <br>

> always reads too much. This might be the cause of the slow read seen — <br>

> maybe you’ll also overflow it if reading from both NSD-servers at the <br>

> same time?<br>

<br>

I have switched the readahead off and this produced a small (~10%) <br>

increase in performance when reading from an NSD server, but no change <br>

in the bad behaviour for the GPFS clients</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"><br>

> <br>

> <br>

> Plus.. it’s always nice to give a bit more pagepool to the clients than <br>

> the default.. I would prefer to start with 4 GB.<br>

<br>

we'll do that too and let you know!</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;padding-left:1ex;border-left-color:rgb(204,204,204)"></blockquote><div dir="auto"><br></div></div><div><div dir="auto">Could you show your mmlsconfig output? You should probably set maxMBpS to indicate the throughput a client can sustain (it affects GPFS readahead/writebehind). I would typically also increase workerThreads on your NSD servers.</div></div><div><div><div><div class="gmail_quote"><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto">A 1 MB block size is a poor fit for your 9+p+q RAID with 256 KB strip size. A full stripe there holds 9 × 256 KB = 2304 KB of data, so writing one GPFS block covers less than half a RAID stripe, which means some data has to be read back to calculate the new parities. I would prefer a 4 MB block size, and maybe also a change to 8+p+q so that one GPFS block is a multiple of a full 2 MB stripe.</div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto">   -jf</div></div></div>
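<div dir="auto"><br></div><div dir="auto">The stripe arithmetic above can be checked with a minimal Python sketch. It assumes that a full stripe holds (data disks × strip size) of data with parity excluded, and that each GPFS block starts on a stripe boundary. The function name stripe_report and the list of layouts are purely illustrative and not part of any GPFS tooling.</div>

```python
# Sketch: how many full RAID stripes does one GPFS block cover?
# Assumption: a full stripe holds data_disks * strip_kb of data (parity
# excluded) and every block starts on a stripe boundary.

def stripe_report(block_kb, data_disks, strip_kb):
    """Return (full_stripe_kb, full_stripes, leftover_kb) for one block."""
    full_stripe_kb = data_disks * strip_kb
    full_stripes, leftover_kb = divmod(block_kb, full_stripe_kb)
    return full_stripe_kb, full_stripes, leftover_kb

# (label, GPFS block size in KB, number of data disks); strip size is 256 KB.
layouts = [
    ("1 MB block on 9+p+q", 1024, 9),
    ("4 MB block on 9+p+q", 4096, 9),
    ("4 MB block on 8+p+q", 4096, 8),
]

for label, block_kb, data_disks in layouts:
    stripe_kb, full, leftover = stripe_report(block_kb, data_disks, 256)
    # A block is stripe-aligned when it covers whole stripes with no remainder.
    aligned = leftover == 0 and full != 0
    print(f"{label}: stripe={stripe_kb} KB, full stripes={full}, "
          f"leftover={leftover} KB, aligned={aligned}")
```

<div dir="auto">Under these assumptions, only the 4 MB block on 8+p+q comes out as a whole number of full stripes (two stripes of 2048 KB). On 9+p+q both the 1 MB and the 4 MB block leave a partial stripe, which is the case where parity has to be recalculated from data read back.</div>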

</div>

</div>