[gpfsug-discuss] Using AFM to migrate files. (Peter Childs)

Peter Childs p.childs at qmul.ac.uk
Thu Oct 20 20:07:44 BST 2016


Yes, most of the filesets are based on research groups, projects or departments, with the exception of scratch and home, hence the idea to use a different method for these filesets.

There are approximately 230 million files and 300TB in total; the largest fileset holds 52TB and 63 million files.

Peter Childs
Research Storage
ITS Research and Teaching Support
Queen Mary, University of London


---- Bill Pappas wrote ----


I have some ideas to suggest based on my experience. First, I have some questions:


How many files are you migrating?

Will you be creating multiple filesets on the target system based on business or project needs? For example, fileset A is for "department A" and fileset B is for "large-scale project A"?


Thanks.


Bill Pappas

901-619-0585

bpappas at dstonline.com





http://www.prweb.com/releases/2016/06/prweb13504050.htm


________________________________
From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> on behalf of gpfsug-discuss-request at spectrumscale.org <gpfsug-discuss-request at spectrumscale.org>
Sent: Wednesday, October 19, 2016 3:12 PM
To: gpfsug-discuss at spectrumscale.org
Subject: gpfsug-discuss Digest, Vol 57, Issue 49

Send gpfsug-discuss mailing list submissions to
        gpfsug-discuss at spectrumscale.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://gpfsug.org/mailman/listinfo/gpfsug-discuss
or, via email, send a message with subject or body 'help' to
        gpfsug-discuss-request at spectrumscale.org

You can reach the person managing the list at
        gpfsug-discuss-owner at spectrumscale.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of gpfsug-discuss digest..."


Today's Topics:

   1. Using AFM to migrate files. (Peter Childs)
   2. subnets (Brian Marshall)
   3. Re: subnets (Simon Thompson (Research Computing - IT Services))
   4. Re: subnets (Uwe Falke)
   5. Will there be any more GPFS 4.2.0-x releases?
      (Buterbaugh, Kevin L)


----------------------------------------------------------------------

Message: 1
Date: Wed, 19 Oct 2016 14:12:41 +0000
From: Peter Childs <p.childs at qmul.ac.uk>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] Using AFM to migrate files.
Message-ID:
        <HE1PR0701MB2554710DD534587615543AE5A4D20 at HE1PR0701MB2554.eurprd07.prod.outlook.com>

Content-Type: text/plain; charset="iso-8859-1"


We are planning to use AFM to migrate our old GPFS file store to a new GPFS file store. This will give us the advantages of Spectrum Scale (GPFS) 4.2, such as larger block and inode sizes. I would like to get some feedback on my plans before I start.

The old file store was running GPFS 3.5 with 512-byte inodes and a 1MB block size. We have now upgraded it to 4.1 and are working towards 4.2, with 300TB of files (385TB maximum capacity). This is so we can use both the old and new storage via multi-cluster.
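For anyone following along, a minimal sketch of the multi-cluster remote mount as seen from the new (accessing) cluster might look like the following. The cluster, file system, node and key-file names are just examples, and the matching mmauth add/grant steps on the old (owning) cluster are omitted:

  # Sketch only -- names and paths are placeholders, not our real config.
  # Run on the new (accessing) cluster after exchanging mmauth keys.

  # Define the old cluster and the contact nodes used to reach it:
  mmremotecluster add oldcluster.example \
      -n oldnsd1,oldnsd2 -k /var/mmfs/ssl/oldcluster_id_rsa.pub

  # Define the old file system and where to mount it locally:
  mmremotefs add oldfs -f gpfs_old -C oldcluster.example -T /gpfs/oldfs

  # Mount it on all nodes so AFM (over the GPFS/NSD protocol) can use it as home:
  mmmount oldfs -a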

We are moving to a new GPFS cluster so we can eventually use the new protocol nodes, and also make the new storage machines the cluster managers, as this should be faster and more future-proof.

The new hardware has 1PB of space and is running GPFS 4.2.

We have multiple filesets, and would like to maintain our namespace as far as possible.

My plan is to:

1. Create a read-only (RO) AFM cache fileset on the new storage.
2a. Move the old fileset aside and replace it with a symlink to the new location.
2b. Convert the RO AFM cache to local-updates (LU) mode, pointing at the new parking area for the old files.
2c. Move user access to the new location in the cache.
3. Flush everything into the cache and disconnect.
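For what it's worth, here is a rough sketch of what steps 1 and 2b might look like for one fileset using AFM's GPFS (NSD) protocol. The file system, fileset and path names are placeholders, and whether the mode change really needs the fileset unlinked (and exactly which conversions are allowed) should be checked against the AFM documentation for the release in use:

  # Sketch only -- file system, fileset and path names are examples.

  # 1. Create a read-only AFM cache fileset on the new file system,
  #    with home reached over the multi-cluster mount of the old one:
  mmcrfileset gpfs_new grp_a --inode-space new \
      -p afmMode=ro,afmTarget=gpfs:///gpfs/oldfs/grp_a
  mmlinkfileset gpfs_new grp_a -J /gpfs/newfs/grp_a

  # 2b. Later, convert the cache from read-only to local-updates
  #     (typically done with the fileset unlinked).  Re-pointing the
  #     target at the parking area is the open question discussed
  #     below and is not shown here.
  mmunlinkfileset gpfs_new grp_a
  mmchfileset gpfs_new grp_a -p afmMode=local-updates
  mmlinkfileset gpfs_new grp_a -J /gpfs/newfs/grp_a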

I've read the docs, including the ones on migration, but it's not clear whether it's safe to move the home of a cache and update the target. It looks like it should be possible, and my tests say it works.

An alternative plan is to use an independent-writer (IW) AFM cache to move the home directories, which are pointed to by LDAP. That way we can move users one at a time and only have to drain the HPC cluster at the end to disconnect the cache. I assume that migrating users over an independent-writer cache is safe so long as the users don't use both sides of the cache (i.e. the cache and its home/target) at once.

I'm also interested in any recipes people have for GPFS policies to preseed and flush the cache.
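In case it helps the discussion, here is roughly the sort of thing I had in mind. Names and paths are placeholders, and the policy list-file format and mmafmctl options should be checked against the release in use:

  # Sketch only -- names and paths are examples.

  # prefetch.pol: a trivial policy that lists every file.
  #   RULE EXTERNAL LIST 'tofetch' EXEC ''
  #   RULE 'all' LIST 'tofetch'

  # Scan the old (home) fileset, defer execution so only the list is
  # written, then strip the policy attributes and rewrite the home
  # prefix to the cache path:
  mmapplypolicy /gpfs/oldfs/grp_a -P prefetch.pol -f /tmp/grp_a -I defer
  awk -F ' -- ' '{print $2}' /tmp/grp_a.list.tofetch \
      | sed 's|^/gpfs/oldfs|/gpfs/newfs|' > /tmp/grp_a.paths

  # Pre-seed the cache from home:
  mmafmctl gpfs_new prefetch -j grp_a --list-file /tmp/grp_a.paths

  # Before disconnecting, check that nothing is still queued to home:
  mmafmctl gpfs_new flushPending -j grp_a
  mmafmctl gpfs_new getstate -j grp_a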

We plan to do all the migration using AFM over GPFS; we're not currently using NFS and have no plans to start. I believe using GPFS is the faster method to perform the migration.

Any suggestions and experience of doing similar migration jobs would be helpful.

Peter Childs
Research Storage
ITS Research and Teaching Support
Queen Mary, University of London



------------------------------

Message: 2
Date: Wed, 19 Oct 2016 13:46:02 -0400
From: Brian Marshall <mimarsh2 at vt.edu>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] subnets
Message-ID:
        <CAD0XtKRDTXe9Y5qQB5-qVRdo_RTbv9WctoJKf+CB97kNkmss0g at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

All,

We are setting up communication between 2 clusters using Ethernet and
IPoFabric.

The daemon interface is running on Ethernet, so all admin traffic will use
it.

We are still working on getting the subnets setting correct.

Question:

Does GPFS have a way to query how it is connecting to a given cluster/node?
I.e., once we have subnets set up, how can we tell GPFS is actually using
them?  Currently we just do a large transfer and check tcpdump for any
packets flowing on the high-speed/data/non-admin subnet.


Thank you,
Brian Marshall

------------------------------

Message: 3
Date: Wed, 19 Oct 2016 18:10:38 +0000
From: "Simon Thompson (Research Computing - IT Services)"
        <S.J.Thompson at bham.ac.uk>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] subnets
Message-ID:
        <CF45EE16DEF2FE4B9AA7FF2B6EE26545F584168E at EX13.adf.bham.ac.uk>
Content-Type: text/plain; charset="us-ascii"


mmdiag --network

Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Brian Marshall [mimarsh2 at vt.edu]
Sent: 19 October 2016 18:46
To: gpfsug main discussion list
Subject: [gpfsug-discuss] subnets

All,

We are setting up communication between 2 clusters using Ethernet and IPoFabric.

The daemon interface is running on Ethernet, so all admin traffic will use it.

We are still working on getting the subnets setting correct.

Question:

Does GPFS have a way to query how it is connecting to a given cluster/node?  I.e., once we have subnets set up, how can we tell GPFS is actually using them?  Currently we just do a large transfer and check tcpdump for any packets flowing on the high-speed/data/non-admin subnet.


Thank you,
Brian Marshall


------------------------------

Message: 4
Date: Wed, 19 Oct 2016 20:15:52 +0200
From: "Uwe Falke" <UWEFALKE at de.ibm.com>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] subnets
Message-ID:
        <OF96AC7F85.6594A994-ONC1258051.0064379D-C1258051.0064547E at notes.na.collabserv.com>

Content-Type: text/plain; charset="ISO-8859-1"

Hi Brian,
you might use

mmfsadm saferdump tscomm

to check on which route peer cluster members are reached.
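For example, a quick way to pull these checks together (the output formats differ a bit between releases, so treat this only as a pointer):

  # Confirm the subnets value the daemon actually has:
  mmlsconfig subnets

  # List the established connections and the peer IP addresses in use;
  # the IPoFabric/data addresses should appear here once subnets is active:
  mmdiag --network

  # Lower-level dump of the TCP connections per destination:
  mmfsadm saferdump tscomm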


Mit freundlichen Grüßen / Kind regards


Dr. Uwe Falke

IT Specialist
High Performance Computing Services / Integrated Technology Services /
Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung:
Frank Hammer, Thorsten Moehring
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 17122




From:   Brian Marshall <mimarsh2 at vt.edu>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   10/19/2016 07:46 PM
Subject:        [gpfsug-discuss] subnets
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



All,

We are setting up communication between 2 clusters using Ethernet and
IPoFabric.

The daemon interface is running on Ethernet, so all admin traffic will use
it.

We are still working on getting the subnets setting correct.

Question:

Does GPFS have a way to query how it is connecting to a given
cluster/node?  I.e., once we have subnets set up, how can we tell GPFS is
actually using them?  Currently we just do a large transfer and check
tcpdump for any packets flowing on the high-speed/data/non-admin subnet.


Thank you,
Brian Marshall
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss






------------------------------

Message: 5
Date: Wed, 19 Oct 2016 20:11:57 +0000
From: "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: [gpfsug-discuss] Will there be any more GPFS 4.2.0-x
        releases?
Message-ID: <142FECE0-E157-42D9-BC10-4C48E78FA065 at vanderbilt.edu>
Content-Type: text/plain; charset="utf-8"

Hi All,

We're currently running GPFS 4.2.0-4 with an efix installed and now we need a 2nd efix.  I'm not a big fan of adding efix to efix and would prefer to go to a new PTF that contains both efixes.

So... is there going to be a GPFS 4.2.0-5 (it's been a longer than normal interval since PTF 4 came out), or do we need to go to GPFS 4.2.1-x?  If the latter, any major changes to watch out for?  Thanks...

Kevin

--
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633




------------------------------

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


End of gpfsug-discuss Digest, Vol 57, Issue 49
**********************************************