[gpfsug-discuss] Rewriting existing files is incredibly slow

Peter Hruška Peter.Hruska at mcomputers.cz
Tue Mar 12 12:03:46 GMT 2024


Hello,

For ease of testing and reproduction we use the fio and iozone benchmarking tools. The issue you describe looks pretty similar. However, it occurs only when using the GPFS mount. Here are some results from our testing environment:

Linux NSD server:
Writes: 3472 MiB/s
Re-writes: 3476 MiB/s

Windows GPFS:
Writes: 2487 MiB/s
Re-writes: 250 MiB/s

Windows Samba:
Writes: 997 MiB/s
Re-writes: 1269 MiB/s
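
As a quick check of what these figures imply, the write-to-rewrite ratio for each setup can be computed in a few lines of Python. This is only arithmetic on the numbers above; the dictionary layout and the function name are illustrative.

```python
# Throughput figures from the results above, in MiB/s: (writes, re-writes).
results_mib_s = {
    "Linux NSD server": (3472, 3476),
    "Windows GPFS": (2487, 250),
    "Windows Samba": (997, 1269),
}


def slowdown(write, rewrite):
    """How many times slower the rewrite is than the initial write."""
    return write / rewrite


for name, (write, rewrite) in results_mib_s.items():
    print(f"{name}: rewrite is {slowdown(write, rewrite):.2f}x slower than write")
# Linux NSD server: rewrite is 1.00x slower than write
# Windows GPFS: rewrite is 9.95x slower than write
# Windows Samba: rewrite is 0.79x slower than write
```

A value below 1 means the rewrite was faster than the initial write, as in the Samba case.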




--

Best regards

Mgr. Peter Hruška
IT specialist

M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (map<https://mapy.cz/s/gafufehufe>)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz<mailto:peter.hruska at mcomputers.cz>

www.mcomputers.cz<http://www.mcomputers.cz/>
www.lenovoshop.cz<http://www.lenovoshop.cz/>




On Mon, 2024-03-11 at 17:35 +0000, Jonathan Buzzard wrote:


On 11/03/2024 16:43, Peter Hruška wrote:

Hello,

We've encountered yet another performance flaw. We have a GPFS
filesystem mounted using GPFS binaries on Windows. When we try to
rewrite a file on the GPFS filesystem, the rewrite speed is much slower
than writing to a new file. The ratio we measured is about
3.5 times. Task Manager shows an excessive
amount of reading from the network when rewriting. This is even visible
on the NSD server as I/O activity. However, when rewriting on a Linux
client there are no reads while rewriting. Has anyone encountered such
problems?
To replicate the issue, it is possible to run fio twice with the same
configuration to achieve rewriting, or to run iozone. Both tools return
similar outputs.
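
For anyone without fio or iozone to hand, the write-then-rewrite measurement can be approximated with a minimal Python sketch. This is not the fio job used above. The function names, default sizes and block size are assumptions chosen for illustration, and client-side caching may affect the numbers.

```python
import os
import time


def timed_write(path, size_bytes, block_size, truncate):
    """Write size_bytes to path in block_size chunks; return throughput in MiB/s.

    truncate=True creates or truncates the file (a fresh "write").
    truncate=False overwrites the existing file in place (a "rewrite").
    """
    block = b"\0" * block_size
    mode = "wb" if truncate else "r+b"
    start = time.perf_counter()
    with open(path, mode, buffering=0) as f:
        written = 0
        while written < size_bytes:
            n = min(block_size, size_bytes - written)
            written += f.write(block[:n])
        os.fsync(f.fileno())  # include the flush to storage in the timing
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size_bytes / (1024 * 1024) / elapsed


def write_then_rewrite(path, size_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Return (write MiB/s, rewrite MiB/s) measured on the same file."""
    first = timed_write(path, size_bytes, block_size, truncate=True)
    second = timed_write(path, size_bytes, block_size, truncate=False)
    return first, second
```

Calling write_then_rewrite() with a path on the mount under test returns the two throughput figures, which can then be compared between the Windows and Linux clients.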

Kind of yes.

What are you using to "rewrite" the file?

What we saw initially over Samba was that certain Microsoft applications
had truly abysmal performance when rewriting a file. When the same
application saved the same document to a new file, the performance was as
expected.

After much digging into the cause (it was weeks of person effort), it
was determined to be down to the application writing the file *one* byte
at a time. Basically some idiot C++ developer at Microsoft decided to
ignore the C++ library because it has "bugs" and write their own
formatted output routines.

It was not noticeable saving to a local disk, but the instant you tried
saving to a network drive the performance was truly awful. Basically the
increased latency of single character IO was the issue.
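
To illustrate why single-character IO hurts so much over a network, here is a rough Python sketch. It counts the write calls needed for a given chunk size and gives a worst-case time estimate, assuming every write call costs one network round trip (that is, the client does not coalesce writes). The function names and the round-trip figure are illustrative assumptions, not measurements from the case above.

```python
import math
import time


def write_in_chunks(path, data, chunk):
    """Write data with unbuffered writes of at most chunk bytes.

    Returns (number of write calls, elapsed seconds).
    """
    calls = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        for off in range(0, len(data), chunk):
            f.write(data[off:off + chunk])
            calls += 1
    return calls, time.perf_counter() - start


def estimated_seconds(n_bytes, chunk, round_trip_s):
    """Worst-case estimate: every write call costs one network round trip."""
    return math.ceil(n_bytes / chunk) * round_trip_s
```

For a 1 MiB file and an assumed 0.5 ms round trip, estimated_seconds(1024 * 1024, 1, 0.0005) comes to about 524 seconds, against 0.008 seconds for 64 KiB writes. On a local disk the per-call cost is small enough that the difference goes unnoticed.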

Note that the issue was not confined to Samba and GPFS, as we verified
the same abysmal performance with a Windows 2008 R2 server running
File and Print Sharing on NTFS on bare metal hardware. It was also not
confined to just Windows; the same awful performance happened on Macs
too. In fact that is where it first came to light.

Might be the cause of your problem, might not.


JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

