[gpfsug-discuss] Moving/copying files from one file system to another

Tomer Perry TOMP at il.ibm.com
Wed Oct 30 18:31:01 GMT 2013


Hi,

The cluster layout will also only be there for some time: as soon as one starts 
deleting and creating files, it will gradually shift toward being scattered.
So, while it's good for benchmarks on a "clean and new" system, cluster is 
somewhat meaningless in the long run (and was hidden for some time because of that).

As for copying from the old system to the new one, AFM can be 
considered as well (it will make the transition easier). You can prefetch 
(using mmafmctl) and even work in LU (local-update) mode in order to get the 
data from the old FS to the new one without pushing changes back.
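
A prefetch is typically driven by a list of files to pull over. As a rough sketch only, the Python below walks a directory tree and writes one file path per line. The function name write_prefetch_list and both paths are hypothetical, and I am assuming your release's mmafmctl prefetch accepts such a list file; check the mmafmctl man page for the exact option and for which side's paths it expects.

```python
import os


def write_prefetch_list(source_root, list_path):
    """Write one file path per line under source_root; return the file count.

    Hypothetical helper: the resulting file is meant to be handed to an AFM
    prefetch, assuming the installed release accepts a list file of paths.
    Keep list_path outside source_root so the list does not include itself.
    """
    count = 0
    with open(list_path, "w", encoding="utf-8") as out:
        for dirpath, _dirnames, filenames in os.walk(source_root):
            for name in sorted(filenames):
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count
```

Splitting a large tree into several lists (for example one per top-level directory) keeps each prefetch run small enough to monitor and retry.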

hth,
Tomer.




From:   Alex Chekholko <chekh at stanford.edu>
To:     gpfsug-discuss at gpfsug.org, 
Date:   10/30/2013 07:57 PM
Subject:        Re: [gpfsug-discuss] Moving/copying files from one file system to another
Sent by:        gpfsug-discuss-bounces at gpfsug.org



On 10/30/13, 5:47 AM, Jonathan Buzzard wrote:
> On Mon, 2013-10-28 at 11:31 -0400, Richard Lefebvre wrote:
>
> [SNIP]
>
>> Also, another question: under what conditions is scatter allocation better
>> than cluster allocation? We currently have a cluster of 650 nodes all
>> accessing the same 230TB gpfs file system.
>>
>
> Scatter allocation is better in almost all circumstances. Basically by
> scattering the files to all corners you don't get hotspots where just a
> small subset of the disks are being hammered by lots of accesses to a
> handful of files, while the rest of the disks sit idle.
>

If you do benchmarks with only a few threads, you will see higher 
performance with 'cluster' allocation.  So if your workload is only a 
few clients accessing the FS in a mostly streaming way, you'd see better 
performance from 'cluster'.

With 650 nodes, even if each client is doing streaming reads, at the 
filesystem level that would all be interleaved and thus be random reads. 
But it's tough to do a big enough benchmark to show the difference in 
performance.
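
To make the interleaving argument concrete, here is a toy model in Python. It is not a GPFS benchmark. It assumes each client reads its own file, that every file is laid out contiguously in a single flat block address space, and that requests arrive strictly round-robin. The function names and the 100-block file size are made up for illustration.

```python
def interleave_streams(num_clients, blocks_per_file):
    """Round-robin merge of per-client sequential reads into one request stream.

    Client c reads a hypothetical file stored contiguously at block
    addresses [c * blocks_per_file, (c + 1) * blocks_per_file).
    """
    stream = []
    for offset in range(blocks_per_file):
        for client in range(num_clients):
            stream.append(client * blocks_per_file + offset)
    return stream


def sequential_fraction(stream):
    """Fraction of requests that land directly after the previous request."""
    if len(stream) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(stream, stream[1:]) if cur == prev + 1)
    return hits / (len(stream) - 1)


if __name__ == "__main__":
    # One streaming client: every request follows the previous one.
    print(sequential_fraction(interleave_streams(1, 100)))    # 1.0
    # 650 streaming clients: no request follows the previous one.
    print(sequential_fraction(interleave_streams(650, 100)))  # 0.0
```

Real clients do not arrive in perfect lockstep and read-ahead recovers some locality, so the true figure sits somewhere in between, but the trend is the point: many sequential readers look like random I/O to the disks.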

I had a tough time convincing people to use 'scatter' instead of 
'cluster' even though I think the documentation is clear about the 
difference, and even gives you the sizing parameters (more than 8 
disks or 8 NSDs? Use 'scatter').

We use 'scatter' now.

Regards
-- 
chekh at stanford.edu

