[gpfsug-discuss] Tracking deleted files

Marc A Kaplan makaplan at us.ibm.com
Mon Feb 27 21:23:52 GMT 2017


Diffing file lists can be fast - IF you keep the file lists sorted by a 
unique key, e.g. the inode number.
I believe that's how mmbackup does it.  Use the classic set difference 
algorithm.

Standard diff is designed to do something else and is terribly slow on 
large file lists.
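
For illustration, here is a minimal sketch of that set difference over two 
sorted lists, yielding the entries that are in the old list but not in the 
new one (i.e. the deleted files). The record layout (inode, path) and the 
name deleted_records are assumptions made up for this example; this is not 
how mmbackup is actually implemented, and a real list would probably need a 
key that also survives inode reuse.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical record layout: (inode number, path). Not a GPFS format.
Record = Tuple[int, str]


def deleted_records(old: Iterable[Record], new: Iterable[Record]) -> Iterator[Record]:
    """Yield records whose key appears in `old` but not in `new`.

    Both inputs must already be sorted ascending by a unique key
    (here: the inode number). Each list is read once, front to back.
    """
    new_iter = iter(new)
    current = next(new_iter, None)
    for record in old:
        key = record[0]
        # Advance the new list until its key catches up with the old key.
        while current is not None and current[0] < key:
            current = next(new_iter, None)
        # No matching key in the new list: the file is gone.
        if current is None or current[0] != key:
            yield record


if __name__ == "__main__":
    yesterday = [(1, "/fs/a"), (2, "/fs/b"), (5, "/fs/c"), (9, "/fs/d")]
    today = [(2, "/fs/b"), (3, "/fs/x"), (9, "/fs/d")]
    for inode, path in deleted_records(yesterday, today):
        print(inode, path)
```

Because both inputs are consumed as iterators, the lists can be streamed 
from files one line at a time, so memory use stays flat even with a couple 
hundred million entries.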



From:   Edward Wahl <ewahl at osc.edu>
To:     "Simon Thompson (Research Computing - IT Services)" 
<S.J.Thompson at bham.ac.uk>
Cc:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   02/27/2017 03:51 PM
Subject:        Re: [gpfsug-discuss] Tracking deleted files
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



I can think of a couple of ways to do this.  Using snapshots seems heavy,
but so does using mmbackup unless you are already running it every day.

Diff the shadow files?  Haha, that could be a _terrible_ idea if you have a
couple hundred million files. But it IS possible.


Next, I'm NOT a TSM expert, but I know a bit about it (and I probably
stayed at a Holiday Inn Express at least once in my heavy travel days):

- Query objects using '-ina=yes' and yesterday's date? Might be a touch
  slow. But it probably uses the next one as its backend:

- A db2 query inside TSM to see a similar thing.  This ought to be the
  fastest, and I'm sure with a little googling you can work this out.
  Tivoli MUST know exact dates of deletion, as it uses that and the
  retention time to know when to purge/reclaim deleted objects from its
  storage pools (retain extra version or RETEXTRA, or retain only version).

Ed

On Mon, 27 Feb 2017 13:32:42 +0000
"Simon Thompson (Research Computing - IT Services)" 
<S.J.Thompson at bham.ac.uk>
wrote:

> >It has been discussed in the past, but the way to track stuff is to
> >enable HSM and then hook into the DMAPI. That way you can see all the
> >file creates and deletes "live". 
> 
> Won't work, I already have a "real" HSM client attached to DMAPI
> (dsmrecalld).
> 
> I'm not actually wanting to back up for this use case; we already have
> mmbackup running to do those things, but it was a list of deleted files
> that I was after (I just thought it might be easy given mmbackup is
> tracking it already).
> 
> Simon
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss



-- 

Ed Wahl
Ohio Supercomputer Center
614-292-9302