[gpfsug-discuss] migrating data from GPFS3.5 to ESS appliance (GPFS4.1)

Sven Oehme oehmes at gmail.com
Fri Jan 29 22:36:31 GMT 2016


This won't really work if you use ACLs, special GPFS extended attributes,
quotas, filesets, etc. Unfortunately, the answer is that you need a
combination of things. There is work going on to make some of this simpler
(e.g. for ACLs), but it's a longer road to get there, so until then you
need to think about multiple aspects.

1. You need to get the data across, and there are various ways to do this.

a) AFM is the simplest of all. Because it understands the GPFS internals,
it takes care of ACLs, extended attributes and the like; it also operates
in parallel, can prefetch data, etc., so it's an efficient way to do this.
But, as already pointed out, it doesn't transfer quotas or filesets.

b) You can use rsync or any other pipe-based copy program. The downside is
that they are typically single-threaded and work file by file, which is
very metadata-intensive on both the source and the target side and causes
a lot of I/Os on both sides.

c) You can use the policy engine to create a list of files to transfer,
which at least addresses the single-threaded scan part, then partition the
data and run multiple instances of cp or rsync in parallel. This still
doesn't fix the ACL/EA issues, but the data gets there faster.
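
As a rough illustration of option c), the sketch below splits a file list
into chunks and runs one rsync per chunk in parallel. It is not a tested
migration tool. It assumes the policy engine output has already been
reduced to one path per line, relative to the source root; the function
names, chunk file names and worker count are all hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def partition(paths, n):
    """Split paths into up to n round-robin chunks (empty chunks are dropped)."""
    chunks = [paths[i::n] for i in range(n)]
    return [c for c in chunks if c]


def rsync_command(list_file, src_root, dst_root):
    """Build one rsync call that copies only the files named in list_file."""
    return ["rsync", "-a", f"--files-from={list_file}",
            f"{src_root}/", f"{dst_root}/"]


def run_parallel(file_list, src_root, dst_root, workers, workdir):
    """Write one chunk file per worker and run the rsync calls concurrently.

    Assumption: file_list holds one path per line, relative to src_root.
    Returns the exit code of each rsync instance.
    """
    paths = [p for p in Path(file_list).read_text().splitlines() if p]
    commands = []
    for i, chunk in enumerate(partition(paths, workers)):
        chunk_file = Path(workdir) / f"chunk.{i}"
        chunk_file.write_text("\n".join(chunk) + "\n")
        commands.append(rsync_command(chunk_file, src_root, dst_root))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))
```

A call such as run_parallel("files.txt", "/export/gpfs1/foo",
"/export/gpfs2/foo", 8, "/tmp") would start up to eight rsync instances
(files.txt and /tmp are placeholders). Round-robin chunking is only one
choice; balancing chunks by total size may spread the load better when
file sizes vary a lot.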

2. You need to get the ACL/EA information over too. There are several
command-line options to dump the data and restore it, but they suffer from
much the same problem as the data transfers, which is why AFM is the best
way of doing this if you rely on ACL/EA information.

3. Transfer the quota/fileset information. There are several ways to do
this, but all of them require some level of scripting.
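
One possible shape for the fileset part of that scripting is sketched
below: given fileset names and junction paths collected from the old
filesystem, it prints the mmcrfileset and mmlinkfileset commands to
recreate them on the new one. The device name, the input list and the
function name are hypothetical, and collecting the input from the source
cluster is left out.

```python
import shlex


def fileset_commands(device, filesets):
    """Return command lines that create and link each (name, junction) pair.

    Assumption: `filesets` was gathered by hand or by a separate script
    from the source filesystem; nothing here queries GPFS.
    """
    commands = []
    for name, junction in filesets:
        commands.append(shlex.join(["mmcrfileset", device, name]))
        commands.append(shlex.join(["mmlinkfileset", device, name, "-J", junction]))
    return commands


if __name__ == "__main__":
    # Hypothetical device and fileset; print the commands for review
    # rather than executing them.
    for line in fileset_commands("ess1", [("projects", "/gpfs/ess1/projects")]):
        print(line)
```

Printing the commands instead of running them leaves room to review them
first. The filesets presumably need to exist before the data is copied so
that files land inside them. Quota limits would need a similar step, and
the quota command syntax should be checked against the documentation for
the release in use.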

If you have TSM/HSM, you could also transfer the data using SOBAR; it's
described in the Advanced Administration Guide.


On Fri, Jan 29, 2016 at 11:35 AM, Hughes, Doug <
Douglas.Hughes at deshawresearch.com> wrote:

> I have found that a tar pipe is much faster than rsync for this sort of
> thing. The fastest of these is ‘star’ (schily tar). On average it is about
> 2x-5x faster than rsync for doing this. After one pass with this, you can
> use rsync for a subsequent or final sync pass.
> e.g.
> $ cd /export/gpfs1/foo
> $ star -c H=xtar | (cd /export/gpfs2/foo; star -xp)
> This also will not preserve filesets and quotas, though. You should be
> able to automate that with a little bit of awk, perl, or whatnot.
> *From:* gpfsug-discuss-bounces at spectrumscale.org [mailto:
> gpfsug-discuss-bounces at spectrumscale.org] *On Behalf Of *Damir Krstic
> *Sent:* Friday, January 29, 2016 2:32 PM
> *To:* gpfsug main discussion list
> *Subject:* [gpfsug-discuss] migrating data from GPFS3.5 to ESS appliance
> (GPFS4.1)
> We have recently purchased an ESS appliance from IBM (GL6) with 1.5PB of
> storage. We are in the planning stages of implementation. We would like to
> migrate data from our existing GPFS installation (around 300TB) to the new
> solution.
> We were planning on adding the ESS to our existing GPFS cluster, adding its
> disks, and then deleting our old disks so that the data is migrated that
> way. However, the existing block size on our projects filesystem is 1M, and
> to extract as much performance as possible from the ESS we would like its
> filesystem created with a larger block size. Besides rsync, do you have any
> suggestions for how to do this without downtime and in the fastest way
> possible?
> I have looked at AFM but it does not seem to migrate quotas and filesets
> so that may not be an optimal solution.
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss