Could maybe be interesting to drop the NSD servers, and let all nodes access the storage via SRP?

Maybe turn off readahead, since it can cause performance degradation when GPFS reads 1 MB blocks scattered on the NSDs, so that read-ahead always reads too much. This might be the cause of the slow read seen; maybe you'll also overflow it if reading from both NSD servers at the same time?

Plus... it's always nice to give a bit more pagepool to the clients than the default. I would prefer to start with 4 GB.
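
Something along these lines, assuming the readahead in question is the Linux block-device one on the NSD servers, and that a node class exists for the clients (the device path and node class name are only placeholders):

  # on each NSD server: disable block-device readahead on the multipath device
  blockdev --setra 0 /dev/mapper/mpatha

  # on the clients: 4 GB pagepool, applied immediately and persistently
  mmchconfig pagepool=4G -i -N gpfsclients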

 -jf

On Fri, 5 Jun 2020 at 14:22, Giovanni Bracco <giovanni.bracco@enea.it> wrote:

In our lab we have received two storage servers, Supermicro
SSG-6049P-E1CR24L, 24 HD each (9 TB SAS3), with an Avago 3108 RAID
controller (2 GB cache). Before putting them into production for other
purposes, we have set up a small GPFS test cluster to verify whether they
can be used as storage (our GPFS production cluster has licenses based on
the NSD server sockets, so it would be interesting to expand the storage
size just by adding storage servers to an InfiniBand-based SAN, without
changing the number of NSD servers).

The test cluster consists of:

1) two NSD servers (IBM x3550 M2), each with a dual-port QDR Truescale IB HCA
2) a Mellanox FDR switch used as a SAN switch
3) a Truescale QDR switch as the GPFS cluster switch
4) two GPFS clients (Supermicro AMD nodes), one QDR port each

All the nodes run CentOS 7.7.

On each storage server a RAID 6 volume of 11 disks, 80 TB, has been
configured, and it is exported via InfiniBand as an iSCSI target, so that
both volumes appear as devices accessed by the srp_daemon on the NSD
servers, where multipath (not really necessary in this case) has been
configured for these two LIO-ORG devices.

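For reference, the SRP-attached LUNs and the multipath layout can be checked on the NSD servers with standard tools, for example:

  lsscsi          # the two exported volumes show up with vendor LIO-ORG
  multipath -ll   # one multipath device per exported RAID 6 volume
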
GPFS version 5.0.4-0 has been installed and RDMA has been properly
configured.

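For context, the GPFS side of that RDMA setup boils down to something like the following (the verbs port name is only an example for the Truescale HCAs):

  mmchconfig verbsRdma=enable
  mmchconfig verbsPorts="qib0/1"
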
Two NSD disks have been created and a GPFS file system has been configured.

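A minimal sketch of how that looks (the server names and device paths below are placeholders; the NSD names, block size and mount point match the mmlsfs output further down):

  # nsd.stanza
  %nsd: nsd=nsdfs4lun2 device=/dev/mapper/mpatha servers=nsdsrv1,nsdsrv2 usage=dataAndMetadata
  %nsd: nsd=nsdfs5lun2 device=/dev/mapper/mpathb servers=nsdsrv2,nsdsrv1 usage=dataAndMetadata

  mmcrnsd -F nsd.stanza
  mmcrfs vsd_gexp2 -F nsd.stanza -B 1M -T /gexp2
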
Very simple tests have been performed using lmdd serial write/read.

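Roughly of this form (the test path and file size are indicative; lmdd is the dd-like tool from lmbench):

  lmdd of=/gexp2/testfile bs=1m count=102400 sync=1          # 100 GB serial write
  lmdd if=/gexp2/testfile of=internal bs=1m count=102400     # 100 GB serial read
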
1) storage-server local performance: before configuring the RAID 6 volume
as an NSD disk, a local xfs file system was created on it, and lmdd
write/read performance for a 100 GB file was verified to be about 1 GB/s.

2) once the GPFS cluster had been created, write/read tests were performed
directly from one of the NSD servers at a time:

write performance 2 GB/s, read performance 1 GB/s, for a 100 GB file

Checking with iostat, it was observed that the I/O in this case involved
only the NSD server where the test was performed, so when writing, double
the base performance was obtained, while reading gave the same performance
as on a local file system; this seems correct. Values are stable when the
test is repeated.

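Something like the following is enough to see which NSD server actually touches the disks during a test (the dm device names are placeholders for the two multipath devices):

  iostat -xm 2 dm-0 dm-1
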
3) when the same test is performed from the GPFS clients, the lmdd results
for a 100 GB file are:

write - 900 MB/s and stable; not too bad, but half of what is seen from
the NSD servers.

read - 30 MB/s to 300 MB/s: very low and unstable values

No tuning of any kind has been done in the configuration of the systems
involved; only default values are used.

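For completeness, the values actually in effect on the clients can be listed with, for example:

  mmlsconfig pagepool
  mmlsconfig maxMBpS
  mmdiag --config
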
Any suggestion to explain the very bad read performance from a GPFS client?

Giovanni

Here are the virtual drive configuration on the storage server and the
GPFS file system configuration:

Virtual drive
==============

Virtual Drive: 2 (Target Id: 2)
Name                 :
RAID Level           : Primary-6, Secondary-0, RAID Level Qualifier-3
Size                 : 81.856 TB
Sector Size          : 512
Is VD emulated       : Yes
Parity Size          : 18.190 TB
State                : Optimal
Strip Size           : 256 KB
Number Of Drives     : 11
Span Depth           : 1
Default Cache Policy : WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy : WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy    : Disabled

GPFS file system from mmlsfs
============================

mmlsfs vsd_gexp2
flag                         value                      description
---------------------------- -------------------------- -----------------------------------
 -f                          8192                       Minimum fragment (subblock) size in bytes
 -i                          4096                       Inode size in bytes
 -I                          32768                      Indirect block size in bytes
 -m                          1                          Default number of metadata replicas
 -M                          2                          Maximum number of metadata replicas
 -r                          1                          Default number of data replicas
 -R                          2                          Maximum number of data replicas
 -j                          cluster                    Block allocation type
 -D                          nfs4                       File locking semantics in effect
 -k                          all                        ACL semantics in effect
 -n                          512                        Estimated number of nodes that will mount file system
 -B                          1048576                    Block size
 -Q                          user;group;fileset         Quotas accounting enabled
                             user;group;fileset         Quotas enforced
                             none                       Default quotas enabled
 --perfileset-quota          No                         Per-fileset quota enforcement
 --filesetdf                 No                         Fileset df enabled?
 -V                          22.00 (5.0.4.0)            File system version
 --create-time               Fri Apr 3 19:26:27 2020    File system creation time
 -z                          No                         Is DMAPI enabled?
 -L                          33554432                   Logfile size
 -E                          Yes                        Exact mtime mount option
 -S                          relatime                   Suppress atime mount option
 -K                          whenpossible               Strict replica allocation option
 --fastea                    Yes                        Fast external attributes enabled?
 --encryption                No                         Encryption enabled?
 --inode-limit               134217728                  Maximum number of inodes
 --log-replicas              0                          Number of log replicas
 --is4KAligned               Yes                        is4KAligned?
 --rapid-repair              Yes                        rapidRepair enabled?
 --write-cache-threshold     0                          HAWC Threshold (max 65536)
 --subblocks-per-full-block  128                        Number of subblocks per full block
 -P                          system                     Disk storage pools in file system
 --file-audit-log            No                         File Audit Logging enabled?
 --maintenance-mode          No                         Maintenance Mode enabled?
 -d                          nsdfs4lun2;nsdfs5lun2      Disks in file system
 -A                          yes                        Automatic mount option
 -o                          none                       Additional mount options
 -T                          /gexp2                     Default mount point
 --mount-priority            0                          Mount priority

--
Giovanni Bracco
phone +39 351 8804788
E-mail giovanni.bracco@enea.it
WWW http://www.afs.enea.it/bracco

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss