[gpfsug-discuss] Writing to the same file from multiple clients gives slow performance compared to beegfs.

Bryan Banister bbanister at jumptrading.com
Wed Nov 13 22:05:08 GMT 2019


Hello,

This is simply due to the fact that your “test” is doing parallel appending writes to the same file.  With GPFS, only one node at a time can write to roughly the same position (aka byte range) in a file.  This is handled by a write_lock token that has to be revoked on the node that currently holds it and then granted to one of the other nodes attempting to write.  This lock-token management overhead is incurred for almost every write in this “test”, and it is not an I/O pattern you would ever really want an application to use anyway.

GPFS offers byte-range locking to allow very fast, non-blocking I/O to the same file from multiple nodes in parallel.  So if you change your “test” so that each node seeks to a different part of the single file and writes only into its own region, with no overlap between the byte ranges written by different nodes, then GPFS will grant each node its own byte-range write_lock token.  Since no node will attempt to write to the same part of the file as another node (which is what you’d want in most cases anyway), there will be no write_lock token revocation and no blocking of your I/O to the single file.  A rough sketch of that pattern is below.
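
As a minimal illustration (not from the original thread), something like the following could be run on each client, assuming each node is assigned a unique NODE_INDEX so the regions never overlap:

# Hypothetical sketch: each node writes only into its own 100 MiB region of the
# shared file, so GPFS can hand each node its own byte-range write token and
# never has to revoke it.
NODE_INDEX=0                           # unique per client node, e.g. 0..3
REGION_MB=100                          # MiB reserved for this node
OFFSET_MB=$((NODE_INDEX * REGION_MB))  # start of this node's region, in MiB

dd if=/dev/zero of=/gpfs_T0/test/benchmark1.txt bs=1M count=$REGION_MB seek=$OFFSET_MB conv=notrunc

Run concurrently from all four nodes, each node only touches its own byte range, so no token revocation should occur.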

Hope that helps,
-Bryan

From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Andi Christiansen
Sent: Wednesday, November 13, 2019 3:34 PM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] Writing to the same file from multiple clients gives slow performance compared to beegfs.

Hi all,



I am in the process of replacing a BeeGFS cluster with a Spectrum Scale cluster, and some of our initial tests have shown poor performance when writing to the same file from multiple client nodes.



If we set up a single client to write into a file, the write completes in less than 8 seconds on flash and about the same on NL-SAS storage. But if we get 3 more nodes to do the exact same write, the cluster seems to slow down and completes all writes in about 1.5 minutes.



We are running 5.0.4-0 on 4 Lenovo SR630 servers, with a V7000 control enclosure with flash drives and a 92F drawer with NL-SAS drives behind the nodes, attached through SAN.



Is there something I am missing since the writes are so slow compared to beegfs?



BeeGFS completes the write from all clients within 9 seconds when 4 clients perform the same write to the same file.



GPFS takes 1.5 min to do the same.





Tests run:



time (for i in `seq 5000`; do echo "This is $(hostname) writing line number" $i >> "/gpfs_T0/test/benchmark1.txt"; done)   # run from 4 GPFS client nodes at the same time



Result for scale:

real    1m43.995s

user    0m0.821s

sys     0m3.545s



Result for beegfs:

real    0m6.507s

user    0m0.651s

sys     0m2.534s



If we run the writes from each client node to separate files, performance is way better with GPFS than with BeeGFS, but not when the writes go to the same file in parallel. A per-node-file variant of the test is shown below.
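
For reference, a per-node-file variant of the same test would look something like the following (the per-hostname file name is just illustrative):

time (for i in `seq 5000`; do echo "This is $(hostname) writing line number" $i >> "/gpfs_T0/test/benchmark_$(hostname).txt"; done)   # each node appends to its own file, so no write_lock token is shared between nodes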



If anyone has an idea I would be glad to hear it 😊





Best Regards

Andi Christiansen


