[gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. Performance issues, write.

Uwe Falke uwe.falke at kit.edu
Wed Sep 4 13:39:21 BST 2024


Hi,

the writes look strange:

you seem to use a block size of 8 MiB on your file system.

The reads are entirely full-block reads (16 Ki sectors of 0.5 KiB each => 8 MiB),
but fewer than half of the writes comprise full blocks.

The service time for the write I/Os seems to correlate pretty well with 
I/O size, i.e. the writes are rate-bound, not IOPS-bound (which again could be 
rooted in the data link to your NSD server):

8 MiB transfers take about 6.0...6.5 ms, which corresponds to approx. 1.3 GiB/s; 
a bunch of 10800-sector transfers take about 4.0 ms on average, which works out 
to about 1.3 GiB/s as well.
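
(A minimal Python sketch of that back-of-the-envelope arithmetic, assuming the
512-byte sectors and the two example transfers from the iohist output quoted
below; purely illustrative.)

# GPFS iohist reports transfer sizes (nSec) in 512-byte sectors.
SECTOR_BYTES = 512

def rate_gib_per_s(n_sectors: int, service_ms: float) -> float:
    """Effective rate of a single I/O in GiB/s."""
    return (n_sectors * SECTOR_BYTES) / (service_ms / 1000.0) / 2**30

print(rate_gib_per_s(16384, 6.25))   # full 8 MiB write in ~6.25 ms -> ~1.25 GiB/s
print(rate_gib_per_s(10800, 4.0))    # 10800-sector write in ~4 ms  -> ~1.29 GiB/s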


As for the read data: none of the observed service times comes close to the 
1 ms you reported below, confirming that those fio latency numbers have nothing 
to do with your real data traffic.

The read service times go down to about 5 ms, which is rather low, in particular 
compared to the somewhat higher service times for writes. This could mean that 
the read I/Os with service times of 6 ms and below are served from the storage 
system's cache (NSD servers do buffer, but do not cache). The read I/Os with 
substantially higher service times would then be the ones really served from 
disk, which again would mean your storage is the culprit. But then the question 
arises why your other machines behave well - although, given the caching issues 
indicated above, are you sure the measurements on your other machines are 
trustworthy?


The iohist snippet for reads comprises 74 I/Os in about 0.854 s, which 
corresponds to roughly 690 MiB/s - far from the roughly tenfold value you 
reported. So while there is reason to assume that some I/Os are served from the 
storage cache, much more of the data you measured seems to come from the 
client's cache.
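
(If you want to cross-check such aggregate numbers, below is a rough,
hypothetical Python sketch that parses the data lines of an "mmdiag --iohist"
snippet saved to a text file and reports the read count, the full-block
fraction and the effective rate over the covered time span. It assumes the
column layout shown in your snippets; it is just an illustration, not a
supported tool.)

import sys
from datetime import datetime

SECTOR_BYTES = 512
FULL_BLOCK_SECTORS = 16384           # 8 MiB block = 16 Ki sectors of 0.5 KiB

start_times, sectors = [], []
for line in open(sys.argv[1]):
    parts = line.split()
    # data lines look like: start_time RW buf_type disk:sector nSec time_ms ...
    if len(parts) < 6 or parts[1] != "R" or parts[2] != "data":
        continue                      # skip headers, writes, wrapped continuations
    start_times.append(datetime.strptime(parts[0], "%H:%M:%S.%f"))
    sectors.append(int(parts[4]))

span_s = (max(start_times) - min(start_times)).total_seconds()
mib = sum(sectors) * SECTOR_BYTES / 2**20
full = sum(1 for n in sectors if n == FULL_BLOCK_SECTORS)
print(f"{len(sectors)} data reads ({full} full-block), {mib:.0f} MiB over a "
      f"{span_s:.3f} s span => {mib / span_s:.0f} MiB/s")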


Before that has been clarified, I think you do not need to play with 
the BIOS and stuff :-)


Uwe



On 04.09.24 13:28, Henrik Cednert wrote:
> Adding a snippet from iohist when reading as well.
>
> >mmdiag --iohist
>
> === mmdiag: iohist ===
>
> I/O history:
>
>  I/O start time RW    Buf type disk:sectorNum     nSec  time ms  Type 
>      Device/NSD ID        NSD node
> --------------- -- ----------- ----------------- -----  -------  ---- 
> ------------------ ---------------
> 13:24:59.180277  R        data   19:1491763200   16384   32,433   cli 
> C0A82DD5:63877BDC   192.168.45.213
> 13:24:59.212710  R        data    8:805453824    16384    5,987   cli 
> C0A82DD5:63877BD0   192.168.45.214
> 13:24:59.220698  R        data    8:805453824    16384    6,012   cli 
> C0A82DD5:63877BD0   192.168.45.214
> 13:24:59.226710  R        data   16:805453824    16384    9,129   cli 
> C0A82DD5:63877BD9   192.168.45.214
> 13:24:59.226710  R        data   12:1491763200   16384   18,526   cli 
> C0A82DD5:63877BD4   192.168.45.214
> 13:24:59.226710  R        data   20:2178072576   16384   23,520   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:24:59.251229  R        data   16:805453824    16384    5,497   cli 
> C0A82DD5:63877BD9   192.168.45.214
> 13:24:59.257730  R        data   24:805453824    16384    5,990   cli 
> C0A82DD5:63877BE1   192.168.45.214
> 13:24:59.257730  R        data    5:1720532992   16384   33,458   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:24:59.257730  R        data   28:1720532992   16384   33,458   cli 
> C0A82DD5:63877BE5   192.168.45.214
> 13:24:59.292189  R        data   24:805453824    16384    4,992   cli 
> C0A82DD5:63877BE1   192.168.45.214
> 13:24:59.300176  R        data   24:805453824    16384    4,991   cli 
> C0A82DD5:63877BE1   192.168.45.214
> 13:24:59.306169  R        data    9:805453824    16384    7,987   cli 
> C0A82DD5:63877BD1   192.168.45.213
> 13:24:59.306169  R        data   13:1720532992   16384   25,471   cli 
> C0A82DD5:63877BD5   192.168.45.213
> 13:24:59.306169  R        data   21:1720532992   16384   28,466   cli 
> C0A82DD5:63877BDE   192.168.45.213
> 13:24:59.335637  R        data    9:805453824    16384    4,994   cli 
> C0A82DD5:63877BD1   192.168.45.213
> 13:24:59.341629  R        data   17:805453824    16384    3,993   cli 
> C0A82DD5:63877BDA   192.168.45.213
> 13:24:59.341629  R        data    6:1720532992   16384   27,471   cli 
> C0A82DD5:63877BCE   192.168.45.214
> 13:24:59.341629  R        data   14:1720532992   16384   44,953   cli 
> C0A82DD5:63877BD6   192.168.45.214
> 13:24:59.387584  R        data   17:805453824    16384    5,990   cli 
> C0A82DD5:63877BDA   192.168.45.213
> 13:24:59.394572  R        data   17:805453824    16384    7,993   cli 
> C0A82DD5:63877BDA   192.168.45.213
> 13:24:59.402565  R        data   25:805453824    16384    4,991   cli 
> C0A82DD5:63877BE2   192.168.45.213
> 13:24:59.402565  R        data   22:1720532992   16384   26,309   cli 
> C0A82DD5:63877BDF   192.168.45.214
> 13:24:59.402565  R        data    7:1720532992   16384  142,856   cli 
> C0A82DD5:63877BCF   192.168.45.213
> 13:24:59.546419  R        data   25:805453824    16384    7,992   cli 
> C0A82DD5:63877BE2   192.168.45.213
> 13:24:59.556408  R        data   25:805453824    16384    7,987   cli 
> C0A82DD5:63877BE2   192.168.45.213
> 13:24:59.564395  R        data   10:805453824    16384    5,505   cli 
> C0A82DD5:63877BD2   192.168.45.214
> 13:24:59.564395  R        data   23:1720532992   16384   28,053   cli 
> C0A82DD5:63877BE0   192.168.45.213
> 13:24:59.564395  R        data   15:1720532992   16384   33,044   cli 
> C0A82DD5:63877BD8   192.168.45.213
> 13:24:59.598437  R        data   10:805453824    16384    5,504   cli 
> C0A82DD5:63877BD2   192.168.45.214
> 13:24:59.604939  R        data   18:805453824    16384    4,993   cli 
> C0A82DD5:63877BDB   192.168.45.214
> 13:24:59.604939  R        data    8:1720532992   16384   36,015   cli 
> C0A82DD5:63877BD0   192.168.45.214
> 13:24:59.609932  R        data   16:1720532992   16384   42,010   cli 
> C0A82DD5:63877BD9   192.168.45.214
> 13:24:59.652940  R        data   18:805453824    16384    7,989   cli 
> C0A82DD5:63877BDB   192.168.45.214
> 13:24:59.662434  R        data   18:805453824    16384    6,994   cli 
> C0A82DD5:63877BDB   192.168.45.214
> 13:24:59.669428  R        data   26:805453824    16384    8,986   cli 
> C0A82DD5:63877BE3   192.168.45.214
> 13:24:59.669428  R        data   24:1720532992   16384   20,308   cli 
> C0A82DD5:63877BE1   192.168.45.214
> 13:24:59.669428  R        data    9:1720532992   16384   25,812   cli 
> C0A82DD5:63877BD1   192.168.45.213
> 13:24:59.696239  R        data   26:805453824    16384    6,989   cli 
> C0A82DD5:63877BE3   192.168.45.214
> 13:24:59.703228  R        data   11:805453824    16384    4,992   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:24:59.703228  R        data   25:1720532992   16384   17,976   cli 
> C0A82DD5:63877BE2   192.168.45.213
> 13:24:59.703228  R        data   17:1720532992   16384   22,481   cli 
> C0A82DD5:63877BDA   192.168.45.213
> 13:24:59.725709  R        data   11:805453824    16384    4,992   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:24:59.731699  R        data   11:805453824    16384    8,986   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:24:59.740685  R        data   19:805453824    16384    4,997   cli 
> C0A82DD5:63877BDC   192.168.45.213
> 13:24:59.740685  R        data   18:1720532992   16384   19,486   cli 
> C0A82DD5:63877BDB   192.168.45.214
> 13:24:59.740685  R        data   10:1720532992   16384   27,474   cli 
> C0A82DD5:63877BD2   192.168.45.214
> 13:24:59.769157  R        data   19:805453824    16384    4,997   cli 
> C0A82DD5:63877BDC   192.168.45.213
> 13:24:59.774154  R        data   27:805453824    16384    4,992   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:24:59.774154  R        data   11:1720532992   16384   22,476   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:24:59.774154  R        data   26:1720532992   16384   29,464   cli 
> C0A82DD5:63877BE3   192.168.45.214
> 13:24:59.803618  R        data   27:805453824    16384    4,997   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:24:59.809614  R        data   27:805453824    16384    7,987   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:24:59.817601  R        data   12:805453824    16384    8,500   cli 
> C0A82DD5:63877BD4   192.168.45.214
> 13:24:59.817601  R        data   27:1720532992   16384    9,499   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:24:59.817601  R        data   19:1720532992   16384   74,075   cli 
> C0A82DD5:63877BDC   192.168.45.213
> 13:24:59.892674  R        data   12:805453824    16384    4,997   cli 
> C0A82DD5:63877BD4   192.168.45.214
> 13:24:59.897671  R        data   20:1491763200   16384    4,992   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:24:59.898670  R        data   12:1720532992   16384   24,479   cli 
> C0A82DD5:63877BD4   192.168.45.214
> 13:24:59.898670  R        data   20:2406842368   16384   26,476   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:24:59.925146  R        data   20:1491763200   16384    8,990   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:24:59.935430  R        data   20:1491763200   16384    4,691   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:24:59.940121  R        data   28:1034223616   16384    4,312   cli 
> C0A82DD5:63877BE5   192.168.45.214
> 13:24:59.940121  R        data   28:1949302784   16384   26,664   cli 
> C0A82DD5:63877BE5   192.168.45.214
> 13:24:59.940121  R        data    5:1949302784   16384   34,831   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:24:59.975955  R        data   28:1034223616   16384    5,583   cli 
> C0A82DD5:63877BE5   192.168.45.214
> 13:24:59.981539  R        data    5:1034223616   16384    5,168   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:24:59.981539  R        data   21:1949302784   16384   25,392   cli 
> C0A82DD5:63877BDE   192.168.45.213
> 13:24:59.981539  R        data   13:1949302784   16384   34,412   cli 
> C0A82DD5:63877BD5   192.168.45.213
> 13:25:00.016950  R        data    5:1034223616   16384    7,991   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:25:00.026938  R        data    5:1034223616   16384    7,750   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:25:00.034688  R        data   13:1034223616   16384    5,498   cli 
> C0A82DD5:63877BD5   192.168.45.213
> 13:25:00.034688  R        data    6:1949302784   16384   27,661   cli 
> C0A82DD5:63877BCE   192.168.45.214
>
>
> -- 
>
> Henrik Cednert / +46 704 71 89 54 / CTO / OnePost (formerly Filmlance Post)
>
> ☝️ *OnePost*, formerly Filmlance's post-production, is now an 
> independent part of the Banijay Group.
> New name, same team – business as usual at OnePost.
>
>
>
> ------------------------------------------------------------------------
> *From:* gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf 
> of Henrik Cednert <henrik.cednert at onepost.se>
> *Sent:* Wednesday, 4 September 2024 13:24
> *To:* gpfsug-discuss at gpfsug.org <gpfsug-discuss at gpfsug.org>
> *Subject:* Re: [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. 
> Performance issues, write.
> Hello
>
> My theory is that it's Windows 11 related, or the combination of Windows 11 
> and this hardware. I guess the only way to know for sure is to boot it 
> into *nix, install GPFS there and test. Which I guess isn't the 
> worst of ideas at this stage.
>
> --iohist spits out a pretty hefty chunk of data. Below is a snippet 
> from when I did a write test:
>
>
>
> mmdiag --iohist
>
> === mmdiag: iohist ===
>
> I/O history:
>
>  I/O start time RW    Buf type disk:sectorNum     nSec  time ms  Type 
>      Device/NSD ID        NSD node
> --------------- -- ----------- ----------------- -----  -------  ---- 
> ------------------ ---------------
> 13:17:04.088621  W        data   10:2178073088   10800  3,994   cli 
> C0A82DD5:63877BD2   192.168.45.214
> 13:17:04.092614  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.092614  W       inode    2:3040332814       1  0,999   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.093613  W        data   10:2178083888    5072  2,995   cli 
> C0A82DD5:63877BD2   192.168.45.214
> 13:17:04.094611  W        data   18:2178072576   16384  6,502   cli 
> C0A82DD5:63877BDB   192.168.45.214
> 13:17:04.101113  W     logData    2:1869321537       2  0,998   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.101113  W     logData    3:1877542209       2  0,998   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.103109  W        data   18:2178078304   10656  3,994   cli 
> C0A82DD5:63877BDB   192.168.45.214
> 13:17:04.103109  W        data   26:2178072576   16384  6,529   cli 
> C0A82DD5:63877BE3   192.168.45.214
> 13:17:04.109638  W     logData    2:1869321538       2  0,994   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.109638  W     logData    3:1877542210       2  0,994   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.111634  W        data   26:2178072720   10800  3,992   cli 
> C0A82DD5:63877BE3   192.168.45.214
> 13:17:04.115626  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.115626  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.116626  W        data   26:2178083520    5440  2,995   cli 
> C0A82DD5:63877BE3   192.168.45.214
> 13:17:04.117629  W        data   11:2178072576   16384  4,987   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:17:04.122616  W     logData    2:1869321539       2  0,999   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.122616  W     logData    3:1877542211       2  0,999   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.124613  W        data   11:2178077936   10800  3,999   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:17:04.128612  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.128612  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.129609  W        data   11:2178088736     224  0,000   cli 
> C0A82DD5:63877BD3   192.168.45.213
> 13:17:04.129609  W        data   19:2178072576   16384  6,495   cli 
> C0A82DD5:63877BDC   192.168.45.213
> 13:17:04.136104  W     logData    2:1869321540       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.136104  W     logData    3:1877542212       1  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.137103  W        data   19:2178083152    5808  2,999   cli 
> C0A82DD5:63877BDC   192.168.45.213
> 13:17:04.138100  W        data   27:2178072576   16384  5,990   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:17:04.144091  W     logData    2:1869321540       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.144091  W     logData    3:1877542212       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.146088  W        data   27:2178077568   10800  4,005   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:17:04.150092  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.150092  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.151092  W        data   27:2178088368     592  0,000   cli 
> C0A82DD5:63877BE4   192.168.45.213
> 13:17:04.151092  W        data   12:2178072576   16384  4,995   cli 
> C0A82DD5:63877BD4   192.168.45.214
> 13:17:04.156086  W     logData    2:1869321541       2  0,996   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.156086  W     logData    3:1877542213       2  0,996   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.157082  W        data   12:2178082784    6176  2,994   cli 
> C0A82DD5:63877BD4   192.168.45.214
> 13:17:04.158083  W        data   20:2178072576   16384  7,498   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:17:04.165581  W     logData    2:1869321542       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.165581  W     logData    3:1877542214       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.167578  W        data   20:2178077200   10800  2,994   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:17:04.170572  W       inode    4:3060742158       1  0,996   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.171568  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.171568  W        data   20:2178088000     960  1,001   cli 
> C0A82DD5:63877BDD   192.168.45.214
> 13:17:04.172569  W        data   28:2864381952   16384  5,988   cli 
> C0A82DD5:63877BE5   192.168.45.214
> 13:17:04.178557  W     logData    2:1869321543       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.178557  W     logData    3:1877542215       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.179560  W        data   28:2864391792    6544  2,995   cli 
> C0A82DD5:63877BE5   192.168.45.214
> 13:17:04.179560  W        data    5:2406842368   16384  5,987   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:17:04.185548  W     logData    2:1869321544       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.185548  W     logData    3:1877542216       1  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.186549  W        data    5:2406846624   10800  4,993   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:17:04.191542  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.191542  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.192540  W        data    5:2406857424    1328  0,995   cli 
> C0A82DD5:63877BCD   192.168.45.213
> 13:17:04.193535  W        data   13:2406842368   16384  6,019   cli 
> C0A82DD5:63877BD5   192.168.45.213
> 13:17:04.199554  W     logData    2:1869321544       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.199554  W     logData    3:1877542216       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.200551  W        data   13:2406851840    6912  2,997   cli 
> C0A82DD5:63877BD5   192.168.45.213
> 13:17:04.201554  W        data   21:2406842368   16384  5,912   cli 
> C0A82DD5:63877BDE   192.168.45.213
> 13:17:04.207466  W     logData    2:1869321545       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.207466  W     logData    3:1877542217       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.208465  W        data   21:2406846256   10800  3,990   cli 
> C0A82DD5:63877BDE   192.168.45.213
> 13:17:04.212456  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.212456  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.213456  W        data   21:2406857056    1696  1,998   cli 
> C0A82DD5:63877BDE   192.168.45.213
> 13:17:04.214457  W        data    6:2406842368   16384  5,015   cli 
> C0A82DD5:63877BCE   192.168.45.214
> 13:17:04.219472  W     logData    2:1869321546       2  1,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.219472  W     logData    3:1877542218       2  1,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.221474  W        data    6:2406851472    7280  1,994   cli 
> C0A82DD5:63877BCE   192.168.45.214
> 13:17:04.221474  W        data   14:2406842368   16384  7,502   cli 
> C0A82DD5:63877BD6   192.168.45.214
> 13:17:04.228976  W     logData    2:1869321547       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.228976  W     logData    3:1877542219       1  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.229971  W        data   14:2406845888   10800  3,994   cli 
> C0A82DD5:63877BD6   192.168.45.214
> 13:17:04.233965  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
> 13:17:04.233965  W       inode    2:3040332814       1  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.234963  W        data   14:2406856688    2064  1,999   cli 
> C0A82DD5:63877BD6   192.168.45.214
> 13:17:04.235965  W        data   22:2406842368   16384  4,993   cli 
> C0A82DD5:63877BDF   192.168.45.214
> 13:17:04.240958  W     logData    2:1869321547       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.240958  W     logData    3:1877542219       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.241974  W        data   22:2406851104    7648  2,531   cli 
> C0A82DD5:63877BDF   192.168.45.214
> 13:17:04.242977  W        data    7:2406842368   16384  5,526   cli 
> C0A82DD5:63877BCF   192.168.45.213
> 13:17:04.248503  W     logData    2:1869321548       2  0,000   cli 
> C0A82DD5:63877BCA   192.168.45.214
> 13:17:04.248503  W     logData    3:1877542220       2  0,000   cli 
> C0A82DD5:63877BCB   192.168.45.213
> 13:17:04.249503  W        data    7:2406845520   10800  3,994   cli 
> C0A82DD5:63877BCF   192.168.45.213
> 13:17:04.253497  W       inode    4:3060742158       1  0,000   cli 
> C0A82DD5:63877BCC   192.168.45.214
>
>
> -- 
>
> Henrik Cednert / +46 704 71 89 54 / CTO / OnePost (formerly Filmlance Post)
>
> ☝️ *OnePost*, formerly Filmlance's post-production, is now an 
> independent part of the Banijay Group.
> New name, same team – business as usual at OnePost.
>
>
>
> ------------------------------------------------------------------------
> *From:* gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf 
> of Uwe Falke <uwe.falke at kit.edu>
> *Sent:* Wednesday, 4 September 2024 12:59
> *To:* gpfsug-discuss at gpfsug.org <gpfsug-discuss at gpfsug.org>
> *Subject:* Re: [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. 
> Performance issues, write.
>
> Hi, given you see read latencies of 1 ms, you do not get the data from 
> disk but from some cache (on whatever level). From spinning disks, you 
> can never expect such read latencies (mind that GPFS block reading, 
> even if sequential from the application's PoV, typically translates to 
> random I/O at the physical disk level).
>
> So, I do not know what the latencies on your other measurements 
> (bypassing GPFS) were  but the numbers below do not represent 
> sustained high-scale throughputs, apparently.
>
>
> It is nevertheless strange that your write rates are much below reads 
> (and write latencies are that high) -- from my experience with 
> different systems, when hammering GPFS with usual storage backends 
> with both read and write requests, the writes tend to prevail.
>
>
>
>
> Your waiters indicate that the problem is above GPFS:
>
> GPFS is able to serve all I/O threads within a few ms, and there is 
> not a long list of pending IOs.
>
>
> Sorry, the iohistory option is
>
> mmdiag --iohist
>
>
> But to me it looks like GPFS is not the culprit here.
>
>
> Uwe
>
> On 04.09.24 11:08, Henrik Cednert wrote:
>> Hi Uwe
>>
>> Thanks.
>>
>> Worth noting is that we have Win10 LTSC and Win 2019 clients, and have had a 
>> single-CPU Win 11 22H2 client (as a test), that all perform as expected. 
>> Those machines are older though and connected with 10-40GbE; those clients 
>> max out their NICs in both read and write.
>>
>> Let me know if i missed something important here. Thanks again.
>>
>> The setup is:
>>
>> Client:
>>
>>   * Supermicro Workstation
>>   * Intel(R) Xeon(R) Gold 6418H   2.10 GHz  (2 processors)
>>   * Mellanox ConnectX-6 Dx connected with 100GbE over dedicated vlan
>>     via mellanox sn2100.
>>   * Windows 11 Pro for Workstations, 22H2
>>
>>
>>
>> Storage setup
>>
>>   * 3 x 84-bay Seagate chassis with spinning disks.
>>   * Storage connected with redundant 12Gb SAS to 2 x storage node servers
>>   * 2 x Mellanox SN2100
>>   * The 2 storage node servers are, for this VLAN, connected with 100GbE
>>     to each switch, so in total 4 x 100GbE. And the switches are
>>     connected with 2 x 100GbE.
>>
>>
>>
>> I tested the commands you suggested. They are both new to me, so I'm not 
>> sure what the output is supposed to be; it looks like --iohistory isn't 
>> available on Windows. I ran --waiters a few times, as seen below. Not 
>> sure what the expected output is from that.
>>
>>
>> mmdiag --waiters
>>
>> === mmdiag: waiters ===
>> Waiting 0.0000 sec since 2024-09-04_10:05:14, monitored, thread 18616 
>> MsgHandler at getData: for In function sendMessage
>> Waiting 0.0000 sec since 2024-09-04_10:05:14, monitored, thread 25084 
>> WritebehindWorkerThread: on ThCond 0x31A7C360 (MsgRecordCondvar), 
>> reason 'RPC wait' for NSD I/O completion on node 192.168.45.213 <c0n0>
>>
>> C:\Users\m5-tkd01>mmdiag --waiters
>>
>> === mmdiag: waiters ===
>> Waiting 0.0009 sec since 2024-09-04_10:05:17, monitored, thread 16780 
>> FsyncHandlerThread: on ThCond 0x37FFDAB0 (MsgRecordCondvar), reason 
>> 'RPC wait' for NSD I/O completion on node 192.168.45.214 <c0n1>
>> Waiting 0.0009 sec since 2024-09-04_10:05:17, monitored, thread 30308 
>> MsgHandler at getData: for In function sendMessage
>>
>> C:\Users\m5-tkd01>mmdiag --waiters
>>
>> === mmdiag: waiters ===
>> Waiting 0.0055 sec since 2024-09-04_10:05:21, monitored, thread 16780 
>> FileBlockReadFetchHandlerThread: on ThCond 0x37A25FF0 
>> (MsgRecordCondvar), reason 'RPC wait' for NSD I/O completion on node 
>> 192.168.45.213 <c0n0>
>>
>> C:\Users\m5-tkd01>mmdiag --waiters
>>
>> === mmdiag: waiters ===
>> Waiting 0.0029 sec since 2024-09-04_10:05:23, monitored, thread 16780 
>> FileBlockReadFetchHandlerThread: on ThCond 0x38281DE0 
>> (MsgRecordCondvar), reason 'RPC wait' for NSD I/O completion on node 
>> 192.168.45.213 <c0n0>
>>
>> C:\Users\m5-tkd01>mmdiag --waiters
>>
>> === mmdiag: waiters ===
>> Waiting 0.0019 sec since 2024-09-04_10:05:25, monitored, thread 11832 
>> PrefetchWorkerThread: on ThCond 0x38278D20 (MsgRecordCondvar), reason 
>> 'RPC wait' for NSD I/O completion on node 192.168.45.214 <c0n1>
>> Waiting 0.0009 sec since 2024-09-04_10:05:25, monitored, thread 16780 
>> AcquireBRTHandlerThread: on ThCond 0x37A324E0 (MsgRecordCondvar), 
>> reason 'RPC wait' for tmMsgBRRevoke on node 192.168.45.161 <c0n11>
>> Waiting 0.0009 sec since 2024-09-04_10:05:25, monitored, thread 2576 
>> RangeRevokeWorkerThread: on ThCond 0x5419DAA0 (BrlObjCondvar), reason 
>> 'waiting because of local byte range lock conflict'
>>
>> C:\Users\m5-tkd01>
>>
>>
>>
>>
>>
>>
>> C:\Users\m5-tkd01>mmdiag --iohistory
>> Unrecognized option: --iohistory.
>> Run mmdiag --help for the option list
>>
>>
>>
>>
>> -- 
>>
>> Henrik Cednert / +46 704 71 89 54 / CTO / OnePost (formerly Filmlance Post)
>>
>> ☝️ *OnePost*, formerly Filmlance's post-production, is now an 
>> independent part of the Banijay Group.
>> New name, same team – business as usual at OnePost.
>>
>>
>>
>> ------------------------------------------------------------------------
>> *From:* gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> 
>> <mailto:gpfsug-discuss-bounces at gpfsug.org> on behalf of Uwe Falke 
>> <uwe.falke at kit.edu> <mailto:uwe.falke at kit.edu>
>> *Sent:* Tuesday, 3 September 2024 17:35
>> *To:* gpfsug-discuss at gpfsug.org <mailto:gpfsug-discuss at gpfsug.org> 
>> <gpfsug-discuss at gpfsug.org> <mailto:gpfsug-discuss at gpfsug.org>
>> *Subject:* Re: [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. 
>> Performance issues, write.
>>
>> Hi, Henrik,
>>
>>
>> while I am not using Windows I'd start investigating the usual things 
>> (see below).
>>
>>
>> But first you should describe your set-up better.
>>
>> Where are the NSDs : locally attached to the Windows box? In some NSD 
>> servers?
>>
>> If the latter -- what is the link to the NSD servers? via your GbE 
>> link? FC? IB? separate Ethernet?
>>
>> What type of storage? Spinning Disks? Flash?
>>
>>
>> How long are your I/Os waiting on the client (compare that to the 
>> waiting times on the NSD server if applicable)?
>>
>> not sure whether that is available on Windows, but
>>
>> mmdiag --waiters
>>
>> mmdiag --iohistory
>>
>> might be of use.
>>
>>
>> Somewhere in the chain from your application to the storage backend 
>> there is a delay and you should first find out where that occurs I 
>> think.
>>
>>
>> Bye
>>
>> Uwe
>>
>>
>>
>> On 03.09.24 14:10, Henrik Cednert wrote:
>>> Still no solution here regarding this.
>>>
>>> Have tested other cables.
>>> Have tested to change tcp window size, no change
>>> Played with numa in the bios, no change
>>> Played with hyperthreading in bios, no change
>>>
>>>
>>> Have anyone managed to get some speed out of windows 11 and gpfs?
>>>
>>>
>>> -- 
>>>
>>> Henrik Cednert / +46 704 71 89 54 / CTO / OnePost (formerly Filmlance Post)
>>>
>>> ☝️ *OnePost*, formerly Filmlance's post-production, is now an 
>>> independent part of the Banijay Group.
>>> New name, same team – business as usual at OnePost.
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> 
>>> <mailto:gpfsug-discuss-bounces at gpfsug.org> on behalf of Henrik 
>>> Cednert <henrik.cednert at onepost.se> <mailto:henrik.cednert at onepost.se>
>>> *Sent:* Friday, 9 August 2024 17:25
>>> *To:* gpfsug-discuss at gpfsug.org <mailto:gpfsug-discuss at gpfsug.org> 
>>> <gpfsug-discuss at gpfsug.org> <mailto:gpfsug-discuss at gpfsug.org>
>>> *Subject:* [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. 
>>> Performance issues, write.
>>>
>>> ***WARNING: THIS IS AN EXTERNAL EMAIL. Do not click any links, no matter 
>>> how legitimate they appear, without verifying.*
>>>
>>>
>>> Hello
>>>
>>> I have some issues with write performance on a windows 11 pro system 
>>> and I'm out of ideas here. Hopefully someone here have some bright 
>>> ideas and/or experience of GPFS on Windows 11?
>>>
>>> The system is a:
>>>
>>> Windows 11 Pro 22H2
>>> 2 x Intel(R) Xeon(R) Gold 6418H   2.10 GHz
>>> 512 GB RAM
>>> GPFS 5.1.9.4
>>> Mellanox ConnectX 6 Dx
>>> 100GbE connected to Mellanox Switch with 5m Mellanox DAC.
>>>
>>> Before deploying this workstation we had a single-socket system as a 
>>> test bench, where we got 60 Gb/s in both directions with iPerf and 
>>> around 6 GB/sec write and 3 GB/sec read from the system over GPFS (fio 
>>> tests, same tests as further down here).
>>>
>>> With that system I had loads of issues before getting to that point, 
>>> though. MS Defender had to be forcefully disabled via regedit, plus some 
>>> other tweaks. All those tweaks have been performed on this new 
>>> system as well, but I can't get the proper speed out of it.
>>>
>>>
>>> On this new system, with iPerf to the storage servers I get 
>>> around 50-60 Gb/s in both directions, send and receive.
>>>
>>> If I mount the storage over SMB and 100GbE via the storage gateway 
>>> servers I get around 3 GB/sec read and write with Blackmagic's Disk 
>>> Speed Test. I have not tweaked the system for Samba performance; it was 
>>> just a test to see what it would give, as part of the troubleshooting.
>>>
>>> If I run Blackmagic's Disk Speed Test against the GPFS mount I instead get 
>>> around 700 MB/sec write and 400 MB/sec read.
>>>
>>> Starting to think that the Blackmagic test might not run properly on 
>>> this machine with these CPUs though. Or it's related to the mmfsd 
>>> process maybe, how that threads or not threads...?
>>>
>>> But if we instead look at fio: I have a bat script that loops 
>>> through a bunch of fio tests, a set of tests that I have been using over 
>>> the years so that we can easily benchmark all deployed systems with the 
>>> exact same tests. The tests are named like:
>>>
>>> seqrw-<filesize>gb-<blocksize>mb-t<threads>
>>>
>>> The result when I run this is like the list below. The number in 
>>> parentheses is the latency reported by fio.
>>>
>>> Job: seqrw-40gb-1mb-t1
>>>       •     Write: 162 MB/s (6 ms)
>>>       •     Read: 1940 MB/s (1 ms)
>>>
>>> Job: seqrw-20gb-1mb-t2
>>>       •     Write: 286 MB/s (7 ms)
>>>       •     Read: 3952 MB/s (1 ms)
>>>
>>> Job: seqrw-10gb-1mb-t4
>>>       •     Write: 549 MB/s (7 ms)
>>>       •     Read: 6987 MB/s (1 ms)
>>>
>>> Job: seqrw-05gb-1mb-t8
>>>       •     Write: 989 MB/s (8 ms)
>>>       •     Read: 7721 MB/s (1 ms)
>>>
>>> Job: seqrw-40gb-2mb-t1
>>>       •     Write: 161 MB/s (12 ms)
>>>       •     Read: 2261 MB/s (0 ms)
>>>
>>> Job: seqrw-20gb-2mb-t2
>>>       •     Write: 348 MB/s (11 ms)
>>>       •     Read: 4266 MB/s (1 ms)
>>>
>>> Job: seqrw-10gb-2mb-t4
>>>       •     Write: 626 MB/s (13 ms)
>>>       •     Read: 4949 MB/s (1 ms)
>>>
>>> Job: seqrw-05gb-2mb-t8
>>>       •     Write: 1154 MB/s (14 ms)
>>>       •     Read: 7007 MB/s (2 ms)
>>>
>>> Job: seqrw-40gb-4mb-t1
>>>       •     Write: 161 MB/s (25 ms)
>>>       •     Read: 2083 MB/s (1 ms)
>>>
>>> Job: seqrw-20gb-4mb-t2
>>>       •     Write: 352 MB/s (23 ms)
>>>       •     Read: 4317 MB/s (2 ms)
>>>
>>> Job: seqrw-10gb-4mb-t4
>>>       •     Write: 696 MB/s (23 ms)
>>>       •     Read: 7358 MB/s (2 ms)
>>>
>>> Job: seqrw-05gb-4mb-t8
>>>       •     Write: 1251 MB/s (25 ms)
>>>       •     Read: 6707 MB/s (5 ms)
>>>
>>>
>>> So with fio I get a very nice read speed, but the write is 
>>> horrendous and I cannot find what causes it. I have looked at 
>>> affinity settings for the mmfsd process but not sure I fully 
>>> understand it. But no matter what I set it to, I see no difference.
>>>
>>> I have "played" with the BIOS and tried with/without hyperthreading, 
>>> NUMA and so on. And nothing affects the Blackmagic disk 
>>> speed test, at least.
>>>
>>> The current settings for this host are like below. I write "current" 
>>> because I have tested a few different settings here, but nothing 
>>> affects the write speed. maxTcpConnsPerNodeConn for sure bumped the 
>>> read speed though.
>>>
>>> nsdMaxWorkerThreads 16
>>> prefetchPct 60
>>> maxTcpConnsPerNodeConn 8
>>> maxMBpS 14000
>>>
>>>
>>> Does anyone have any suggestions or ideas on how to troubleshoot this?
>>>
>>> Thanks
>>>
>>>
>>>
>>> -- 
>>>
>>> Henrik Cednert / +46 704 71 89 54 / CTO / OnePost (formerly Filmlance Post)
>>>
>>> ☝️ *OnePost*, formerly Filmlance's post-production, is now an 
>>> independent part of the Banijay Group.
>>> New name, same team – business as usual at OnePost.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> gpfsug-discuss mailing list
>>> gpfsug-discuss at gpfsug.org
>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org  <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>> -- 
>> Karlsruhe Institute of Technology (KIT)
>> Scientific Computing Centre (SCC)
>> Scientific Data Management (SDM)
>>
>> Uwe Falke
>>
>> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
>> D-76344 Eggenstein-Leopoldshafen
>>
>> Tel: +49 721 608 28024
>> Email:uwe.falke at kit.edu  <mailto:uwe.falke at kit.edu>
>> www.scc.kit.edu  <http://www.scc.kit.edu>
>>
>> Registered office:
>> Kaiserstraße 12, 76131 Karlsruhe, Germany
>>
>> KIT – The Research University in the Helmholtz Association
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at gpfsug.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org  <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> -- 
> Karlsruhe Institute of Technology (KIT)
> Scientific Computing Centre (SCC)
> Scientific Data Management (SDM)
>
> Uwe Falke
>
> Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
> D-76344 Eggenstein-Leopoldshafen
>
> Tel: +49 721 608 28024
> Email:uwe.falke at kit.edu  <mailto:uwe.falke at kit.edu>
> www.scc.kit.edu  <http://www.scc.kit.edu>
>
> Registered office:
> Kaiserstraße 12, 76131 Karlsruhe, Germany
>
> KIT – The Research University in the Helmholtz Association
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

-- 
Karlsruhe Institute of Technology (KIT)
Scientific Computing Centre (SCC)
Scientific Data Management (SDM)

Uwe Falke

Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
D-76344 Eggenstein-Leopoldshafen

Tel: +49 721 608 28024
Email:uwe.falke at kit.edu
www.scc.kit.edu

Registered office:
Kaiserstraße 12, 76131 Karlsruhe, Germany

KIT – The Research University in the Helmholtz Association
