[gpfsug-discuss] Moving/copying files from one file system to another
Tomer Perry
TOMP at il.ibm.com
Wed Oct 30 18:31:01 GMT 2013
Hi,
the 'cluster' layout will also only last for some time. As soon as one starts
deleting and creating files, it will gradually shift toward being scattered.
So, while it's good for benchmarks on a "clean and new" system, cluster is
somewhat meaningless (and was hidden for some time because of that).
As for copying from the old system to the new one, AFM can be
considered as well (it will make the transition easier). You can prefetch
(using mmafmctl) and even work in LU (local-update) mode in order to get the
data from the old FS to the new one without pushing changes back.
hth,
Tomer.
From: Alex Chekholko <chekh at stanford.edu>
To: gpfsug-discuss at gpfsug.org,
Date: 10/30/2013 07:57 PM
Subject: Re: [gpfsug-discuss] Moving/copying files from one file
system to another
Sent by: gpfsug-discuss-bounces at gpfsug.org
On 10/30/13, 5:47 AM, Jonathan Buzzard wrote:
> On Mon, 2013-10-28 at 11:31 -0400, Richard Lefebvre wrote:
>
> [SNIP]
>
>> Also, another question: under what conditions is scatter allocation
>> better than cluster allocation? We currently have a cluster of 650 nodes
>> all accessing the same 230TB gpfs file system.
>>
>
> Scatter allocation is better in almost all circumstances. Basically by
> scattering the files to all corners you don't get hotspots where just a
> small subset of the disks are being hammered by lots of accesses to a
> handful of files, while the rest of the disks sit idle.
>
If you do benchmarks with only a few threads, you will see higher
performance with 'cluster' allocation. So if your workload is only a
few clients accessing the FS in a mostly streaming way, you'd see better
performance from 'cluster'.
With 650 nodes, even if each client is doing streaming reads, at the
filesystem level that would all be interleaved and thus be random reads.
But it's tough to do a big enough benchmark to show the difference in
performance.
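To illustrate the interleaving point, here is a rough Python sketch. It is a
toy model, not GPFS code: it assumes each client streams through its own
contiguous run of block addresses and that requests from different clients
arrive in random order. The function name and parameters are made up for
this example. It reports what fraction of back-to-back requests in the
merged stream land on adjacent blocks.

```python
import random


def sequential_fraction(num_clients, blocks_per_client=200, seed=0):
    """Fraction of consecutive requests in the merged stream that hit
    adjacent block addresses (toy model; all names are hypothetical)."""
    rng = random.Random(seed)
    # Assumption: each client reads its own contiguous region, front to back.
    next_block = [c * blocks_per_client for c in range(num_clients)]
    remaining = [blocks_per_client] * num_clients
    active = list(range(num_clients))
    merged = []
    while active:
        # Assumption: the next request comes from a random active client.
        c = rng.choice(active)
        merged.append(next_block[c])
        next_block[c] += 1
        remaining[c] -= 1
        if remaining[c] == 0:
            active.remove(c)
    adjacent = sum(1 for a, b in zip(merged, merged[1:]) if b == a + 1)
    return adjacent / (len(merged) - 1)


if __name__ == "__main__":
    for n in (1, 4, 650):
        print(n, "clients:", round(sequential_fraction(n), 4))
```

With a single client every request follows the previous one, so the fraction
is 1.0. With hundreds of clients the merged stream has almost no adjacent
pairs, even though each client on its own is purely sequential.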
I had a tough time convincing people to use 'scatter' instead of
'cluster' even though I think the documentation is clear about the
difference, and even gives you the sizing parameters (more than 8
disks or 8 NSDs? Use 'scatter').
We use 'scatter' now.
Regards
--
chekh at stanford.edu
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss