[gpfsug-discuss] GPFS(snapshot, backup) vs. GPFS(backup scripts) vs. TSM(backup)

Jaime Pinto pinto at scinet.utoronto.ca
Wed Mar 16 20:20:21 GMT 2016


OK, that is good to know.
I'll give it a try with snapshot then. We already have 3.5 almost
everywhere, and are planning a 4.2 upgrade (reading the posts with
interest).
Thanks
Jaime

Quoting Yuri L Volobuev <volobuev at us.ibm.com>:

>
>> Under both 3.2 and 3.3 mmbackup would always lock up our cluster when
>> using snapshot. I never understood the behavior without snapshot, and
>> the lock up was intermittent in the carved-out small test cluster, so
>> I never felt confident enough to deploy over the larger 4000+ clients
>> cluster.
>
> Back then, GPFS code had a deficiency: migrating very large files didn't
> work well with snapshots (and with some mm commands).  In order to
> create a snapshot, we have to have the file system in a consistent state
> for a moment, and we get there by performing a "quiesce" operation.  This
> is done by flushing all dirty buffers to disk, stopping any new incoming
> file system operations at the gates, and waiting for all in-flight
> operations to finish.  This works well when all in-flight operations
> actually finish reasonably quickly.  That assumption was broken if an
> external utility, e.g. mmapplypolicy, used gpfs_restripe_file API on a very
> large file, e.g. to migrate the file's blocks to a different storage pool.
> The quiesce operation would need to wait for that API call to finish, as
> it's an in-flight operation, but migrating a multi-TB file could take a
> while, and during this time all new file system ops would be blocked.  This
> was solved several years ago by changing the API and its callers to do the
> migration one block range at a time, thus making each individual syscall
> short and allowing quiesce to barge in and do its thing.  All currently
> supported levels of GPFS have this fix.  I believe mmbackup was affected by
> the same GPFS deficiency and benefited from the same fix.
>
> yuri
>
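
A toy model of the effect Yuri describes, for anyone reading this in the archive. This is not GPFS code: the function name, its parameters, and the one-time-unit-per-block cost are all made up for illustration. It only shows why a quiesce request stalls for as long as the in-flight migration call takes to return, and why issuing the migration one block range at a time bounds that stall.

```python
def quiesce_wait(file_blocks, blocks_per_call, request_at, units_per_block=1):
    """Toy model (hypothetical, not a GPFS API).

    A file of `file_blocks` blocks is migrated by back-to-back calls that
    each move `blocks_per_call` blocks. A quiesce is requested at time
    `request_at`. Returns how long the quiesce has to wait for the call
    that is in flight at that moment to return.
    """
    call_len = blocks_per_call * units_per_block
    total = file_blocks * units_per_block
    if request_at >= total:
        return 0  # migration already finished, nothing in flight
    # End time of the call that is running when the quiesce arrives.
    call_end = min((request_at // call_len + 1) * call_len, total)
    return call_end - request_at


if __name__ == "__main__":
    # One call for the whole file: quiesce waits for the rest of the file.
    print(quiesce_wait(file_blocks=1000, blocks_per_call=1000, request_at=10))  # 990
    # One block range (16 blocks) per call: the wait is at most one range.
    print(quiesce_wait(file_blocks=1000, blocks_per_call=16, request_at=10))    # 6
```

In this model, new file system operations are blocked for the whole returned wait, so the single-call case stalls everything for 990 units while the block-range case stalls for 6.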






---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477
