[gpfsug-discuss] gpfs snapshots

mark.bergman at uphs.upenn.edu
Tue Sep 13 22:23:57 BST 2016


In the message dated: Tue, 13 Sep 2016 13:51:16 -0700,
The pithy ruminations from Yuri L Volobuev on 
<Re: [gpfsug-discuss] gpfs snapshots> were:
=> 
=> Hi Jez,
=> 
=> It sounds to me like the functionality that you're _really_ looking for is
=> an ability to do automated snapshot management, similar to what's

Yep.

=> available on other storage systems.  For example, "create a new snapshot of
=> filesets X, Y, Z every 30 min, keep the last 16 snapshots".  I've seen many

Or, as we do: take a snapshot every 15 minutes; keep the 4 most recent;
beyond those, retain only 4 snapshots created within the last 6 hours,
only 4 created between 6:01 and 24:00 (hh:mm) ago, and only 2 created
between 24:01 and 48:00 ago; expire the rest, and so on.

=> examples of sysadmins rolling their own snapshot management system along
=> those lines, and an ability to add an expiration string as a snapshot

I'd be glad to distribute our local example of this exercise.

=> "comment" appears to be merely an aid in keeping such DIY snapshot
=> management scripts a bit simpler -- not by much though.  The end user would
=> still be on the hook for some heavy lifting, in particular figuring out a
=> way to run an equivalent of a cluster-aware cron with acceptable fault
=> tolerance semantics.  That is, if a snapshot creation is scheduled, only
=> one node in the cluster should attempt to create the snapshot, but if that
=> node fails, another node needs to step in (as opposed to skipping the
=> scheduled snapshot creation).  This is doable outside of GPFS, of course,
=> but is not trivial.  Architecturally, the right place to implement a

Ah, that part really is trivial. In our case, the snapshot program takes
the filesystem name as an argument, and we simply rely on GPFS's own
fault detection/failover. The job runs (via cron) on every GPFS server
node, but only creates the snapshot on the node that is the active
manager for the specified filesystem:

##############################################################################
        # Check whether the node running this script is the GPFS manager
        # node for the specified filesystem ($filesys is set earlier).
        manager=`/usr/lpp/mmfs/bin/mmlsmgr "$filesys" | grep -w "^$filesys" | awk '{print $2}'`
        ip addr list | grep -qw "$manager"
        if [ $? -ne 0 ] ; then
                # This node is not the manager... exit
                exit 0
        fi

        # else ... continue and create the snapshot
##############################################################################
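The creation step that follows the manager check might look something like this. The "autosnap-" name prefix and timestamp format are assumptions for the sketch; mmcrsnapshot is the standard GPFS snapshot-creation command.

```shell
# Sketch of the snapshot-creation step (assumption: snapshot names embed
# the creation time, e.g. "autosnap-20160913-2230", so a later expiration
# pass can parse the timestamp and delete old ones with mmdelsnapshot).
create_snapshot() {
    fs="$1"
    snapname="autosnap-$(date +%Y%m%d-%H%M)"
    /usr/lpp/mmfs/bin/mmcrsnapshot "$fs" "$snapname"
}
```

On a real cluster the cron job would call create_snapshot "$filesys" only after the manager check succeeds.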

=> 
=> yuri
=> 
=> 

-- 
Mark Bergman                                           voice: 215-746-4061       
mark.bergman at uphs.upenn.edu                              fax: 215-614-0266
http://www.cbica.upenn.edu/
IT Technical Director, Center for Biomedical Image Computing and Analytics
Department of Radiology                         University of Pennsylvania
          PGP Key: http://www.cbica.upenn.edu/sbia/bergman 


