[gpfsug-discuss] Question on changing mode on many files

Alec anacreo at gmail.com
Sun Dec 12 02:38:26 GMT 2021


You can manipulate the permissions via the GPFS policy engine. Essentially
you'd write a script that the policy engine calls and tell GPFS to farm out
the change at whatever scale you need: whether to run on a single node or
several, how many files per thread, how many threads per node, etc. This
can GREATLY accelerate permission changes over a large quantity of files.
However, as stated earlier, the mmfind command will do all of this for you
and it's worth the effort to get it compiled for your system.

I don't have Spectrum Scale in front of me, but for the best performance
you'll want to set up the mmfind policy engine parameters to parallelize
your workload. If mmfind has no action it will silently use the GPFS policy
engine to produce the requested output; if mmfind has an action it
will expose the policy engine calls.

It goes something like this:
mmfind -B 1 -N directattachnode1,directattachnode2 -m 24 /path/to/find
-perm +o=w ! \( -type d -perm +o=t \) -xargs chmod o-w

This will run 48 threads across 2 nodes (24 per node) and strip the other
write permission from any file it finds (excluding temp dirs, i.e.
directories with the sticky bit set) until it completes. It should go
blistering fast. As this is only a metadata operation the -B 1 might not
be necessary; you'd probably be better off with -B 100. But as I deal with
a lot of 100GB+ files, I don't want a single thread to be stuck with three
100GB+ files while another thread has none, so I usually set the batch
size (-B) to 1 and accept the higher execution count. This has an
advantage in that GPFS will break up the inodes in the most efficient way
for the chmod to happen in parallel.
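
For anyone who wants to try the selection logic before mmfind is compiled,
here is a rough single-node analogue using plain find and xargs. It is only
a sketch: the function name strip_other_write and its arguments are made up
for illustration, it assumes GNU findutils (for -r and -P), and it does not
use the policy engine, so it will be far slower than mmfind on a large
filesystem. The -perm tests are written in octal (0002 is other-write, 1000
is the sticky bit) to sidestep differences in symbolic -perm syntax.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of GPFS: single-node stand-in for the
# mmfind example, built on standard find and xargs.
#   $1 = directory to scan
#   $2 = number of parallel chmod processes (default 4)
#   $3 = files handed to each chmod invocation (default 100)
strip_other_write() {
    local root=$1 procs=${2:-4} batch=${3:-100}
    # Select anything writable by "other", skipping symlinks and
    # sticky-bit directories (temp dirs), then clear only the o+w bit.
    find "$root" ! -type l -perm -0002 \
         ! \( -type d -perm -1000 \) -print0 |
        xargs -0 -r -P "$procs" -n "$batch" chmod o-w
}
```

Calling strip_other_write /path/to/find 8 100 would run up to 8 chmod
processes at a time with 100 files each, which is loosely what -m and -B
control in the mmfind version.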

I'm not sure if this happens on Spectrum Scale, but on most filesystems if
you do a chmod 770 file you'll lose any ACLs assigned to the file, so it is
safest to adjust the permissions with a subtractive or additive operation
such as o-w or g+w.
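
To make the difference concrete, here is a small sketch on a scratch file.
It only shows the mode bits (a symbolic mode touches the one bit you name,
an absolute mode rewrites all of them); it does not exercise ACLs, and it
assumes GNU stat for the -c option. The variable names are just for
illustration.

```shell
#!/usr/bin/env bash
# Scratch file created only for this demonstration.
f=$(mktemp)

chmod 666 "$f"                     # start from rw-rw-rw-
chmod o-w "$f"                     # symbolic: clears only other-write
after_symbolic=$(stat -c %a "$f")

chmod 770 "$f"                     # absolute: replaces every mode bit
after_absolute=$(stat -c %a "$f")

echo "after o-w: $after_symbolic, after 770: $after_absolute"
rm -f "$f"
```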

If you think of the possibilities here, you could easily change that chmod
to a gzip and add a -mtime +1200, and you have a find command that will
gzip compress files more than 1200 days (a little over 3 years) old in
parallel across multiple nodes. mmfind is VERY powerful and flexible, and
well worth getting into the habit of using.
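
The same idea can be sketched on a single node with plain find and xargs,
which is handy for checking which files -mtime +1200 (last modified more
than 1200 days ago) would pick up. Again this is an illustration, not the
mmfind invocation: gzip_old_files and its arguments are hypothetical, GNU
findutils is assumed, and nothing here is spread across nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of GPFS: compress old files in parallel
# on one node.
#   $1 = directory to scan
#   $2 = minimum age in days (default 1200)
#   $3 = number of parallel gzip processes (default 4)
gzip_old_files() {
    local root=$1 days=${2:-1200} procs=${3:-4}
    # One file per gzip invocation, so a single large file never holds
    # up a whole batch; files already ending in .gz are skipped.
    find "$root" -type f -mtime +"$days" ! -name '*.gz' -print0 |
        xargs -0 -r -P "$procs" -n 1 gzip
}
```

One file per invocation mirrors the -B 1 reasoning above: with very large
files it keeps the work evenly spread over the workers.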

Alec


On Tue, Dec 7, 2021 at 7:43 AM Jonathan Buzzard <
jonathan.buzzard at strath.ac.uk> wrote:

> On 07/12/2021 14:55, Simon Thompson wrote:
> >
> > Or add:
> >    UPDATECTIME               yes
> >    SKIPACLUPDATECHECK        yes
> >
> > To your dsm.opt file to skip checking for those updates and don’t back
> > them up again.
>
> Yeah, but then a restore gives you potentially an unusable file system
> as the ownership of the files and ACL's are all wrong. Better to bite
> the bullet and back them up again IMHO.
>
> >
> > Actually I thought TSM only updated the metadata if the mode/owner
> > changed, not re-backed the file…
>
> That was my understanding, but I have seen TSM re-back up large amounts
> of data where the owner of the file changed in the past, so your mileage
> may vary.
>
> Also ACL's are stored in extended attributes which are stored with the
> files and changes will definitely cause the file to be backed up again.
>
>
> JAB.
>
> --
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG