[gpfsug-discuss] Change uidNumber and gidNumber for billions of files

Wed Jun 10 12:33:09 BST 2020

On 10/06/2020 02:15, Aaron Knister wrote:
> Lohit,
> 
> I did this while working @ NASA. I had two tools I used, one 
> affectionately known as "luke file walker" (to modify traditional unix 
> permissions) and the other known as the "milleniumfacl" (to modify posix 
> ACLs). Stupid jokes aside, there were some real technical challenges here.
> 
> I don't know if anyone from the NCCS team at NASA is on the list, but if 
> they are perhaps they'll jump in if they're willing to share the code :)
> 
> From what I recall, I used uthash and the gpfs API's to store in-memory 
> a hash of inodes and their uid/gid information. I then walked the 
> filesystem using the gpfs API's and could lookup the given inode in the 
> in-memory hash to view its ownership details. Both the inode traversal 
> and directory walk were parallelized/threaded. The way I actually 
> executed the chown was particularly security-minded. There is a race 
> condition that exists if you chown /path/to/file. All it takes is either 
> a malicious user or someone monkeying around with the filesystem while 
> it's live to accidentally chown the wrong file if a symbolic link ends 
> up in the file path.

Well I would expect this needs to be done with no user access to the 
system. Or at the very least no user access for the bits you are 
currently modifying. Otherwise you are going to end up in a complete mess.

> My work around was to use openat() and fchmod (I 
> think that was it, I played with this quite a bit to get it right) and 
> for every path to be chown'd I would walk the hierarchy, opening each 
> component with the O_NOFOLLOW flags to be sure I didn't accidentally 
> stumble across a symlink in the way.
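
For anyone wanting to picture the component-by-component walk described 
above, here is a minimal sketch in C. It is not Aaron's code. The helper 
name chown_nofollow, the use of fchownat with AT_SYMLINK_NOFOLLOW for the 
final step, and the rejection of ".." components are all my assumptions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * chown_nofollow (hypothetical helper): change the owner of relpath,
 * resolved relative to root_fd, without traversing any symbolic link.
 * Returns 0 on success, -1 with errno set on failure.
 */
int chown_nofollow(int root_fd, const char *relpath, uid_t uid, gid_t gid)
{
    assert(relpath != NULL);

    char buf[PATH_MAX];
    if (relpath[0] == '/' || relpath[0] == '\0') { errno = EINVAL; return -1; }
    if (strlen(relpath) >= sizeof buf) { errno = ENAMETOOLONG; return -1; }
    strcpy(buf, relpath);

    /* Work on a duplicate so the caller's descriptor is never closed. */
    int dir_fd = dup(root_fd);
    if (dir_fd < 0)
        return -1;

    int rc = -1;
    char *save = NULL;
    char *name = strtok_r(buf, "/", &save);
    while (name != NULL) {
        /* Assumption: refuse ".." so the walk cannot climb out of root_fd. */
        if (strcmp(name, "..") == 0) { errno = EINVAL; break; }

        char *next = strtok_r(NULL, "/", &save);
        if (next == NULL) {
            /* Last component: change the entry itself, never a link target. */
            rc = fchownat(dir_fd, name, uid, gid, AT_SYMLINK_NOFOLLOW);
            break;
        }

        /* Intermediate component: must be a real directory, not a symlink. */
        int fd = openat(dir_fd, name,
                        O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
        if (fd < 0)
            break;
        close(dir_fd);
        dir_fd = fd;
        name = next;
    }

    int saved = errno;      /* keep the interesting errno across close() */
    close(dir_fd);
    errno = saved;
    return rc;
}
```

With root_fd opened on the top of the tree, a call such as 
chown_nofollow(root_fd, "proj/data/file", new_uid, new_gid) fails rather 
than following a symlink that has appeared at "proj" or "data".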

Or you could just use lchown so you change the ownership of the symbolic 
link rather than the file it is pointing to. You need to change the 
ownership of the symbolic link not the file it is linking to, that will 
be picked up elsewhere in the scan. If you don't change the ownership of 
the symbolic link you are going to be left with a bunch of links owned 
by nonexistent users. No race condition exists if you are doing it 
properly in the first place :-)

I concluded that the standard nftw library function was more suited to 
this than the GPFS inode scan. I could see no way to turn an inode into a 
path to the file, which lchown, gpfs_getacl and gpfs_putacl all require.
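
As a rough illustration of the nftw plus lchown approach, something along 
these lines would do. It is a sketch under assumptions, not a tested 
migration tool: struct id_map, remap_id and remap_tree are names I made up, 
and the linear table lookup would want replacing with a hash for a large 
number of mappings.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* One old -> new ID pair (hypothetical type; contents are site-specific). */
struct id_map { id_t old_id; id_t new_id; };

/* nftw() has no user-data argument, so the tables live in file scope. */
static const struct id_map *g_uids, *g_gids;
static size_t g_nuids, g_ngids;
static long g_changed;

/* Return the mapped ID, or the ID unchanged if it is not in the table. */
static id_t remap_id(const struct id_map *map, size_t len, id_t id)
{
    for (size_t i = 0; i < len; i++)
        if (map[i].old_id == id)
            return map[i].new_id;
    return id;
}

static int visit(const char *path, const struct stat *sb,
                 int typeflag, struct FTW *ftwbuf)
{
    (void)ftwbuf;
    if (typeflag == FTW_NS)     /* stat failed, nothing to compare against */
        return 0;

    uid_t uid = (uid_t)remap_id(g_uids, g_nuids, sb->st_uid);
    gid_t gid = (gid_t)remap_id(g_gids, g_ngids, sb->st_gid);
    if (uid == sb->st_uid && gid == sb->st_gid)
        return 0;               /* owner and group are not being remapped */

    /* lchown changes the link itself, never the file it points to. */
    if (lchown(path, uid, gid) != 0)
        perror(path);
    else
        g_changed++;
    return 0;                   /* keep walking even after a failure */
}

/* Walk root and remap ownership; returns entries changed, or -1 on error. */
long remap_tree(const char *root,
                const struct id_map *uids, size_t nuids,
                const struct id_map *gids, size_t ngids)
{
    assert(root != NULL);
    g_uids = uids; g_nuids = nuids;
    g_gids = gids; g_ngids = ngids;
    g_changed = 0;

    /* FTW_PHYS: report symlinks as themselves (lstat data) and do not
     * follow them. FTW_MOUNT: stay within the one file system. */
    if (nftw(root, visit, 64, FTW_PHYS | FTW_MOUNT) != 0)
        return -1;
    return g_changed;
}
```

Because FTW_PHYS hands the callback lstat data, a symbolic link is compared 
and changed on its own ownership, and its target is handled when the walk 
reaches it by its real path.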

I think the problem with the GPFS inode scan is that it is designed for a 
backup application. Consequently it lacks some features needed by more 
general purpose programs looking for a quick way to traverse the file 
system. Another example is that the gpfs_iattr_t structure returned 
from gpfs_stat_inode does not contain any information as to whether the 
file is a symbolic link, as a standard stat call does.

> I also implemented caching of open 
> path component file descriptors since odds are I would be 
> chowning/chgrp'ing files in the same directory. That bought me some 
> speed up.
>

More reasons to use nftw for now, no need to open any files :-)

> I opened up RFE's at one point, I believe, for gpfs API calls to do this 
> type of operation. I would ideally have liked a mechanism to do this 
> based on inode number rather than path which would help avoid issues of 
> race conditions.
>

Well lchown to the rescue, but that does require a path to the file. The 
biggest problem is the inability to get a path given an inode using the 
GPFS inode scan which is why I steered away from it.

In theory you could use gpfs_igetattrsx/gpfs_iputattrsx to modify the 
UID/GID of the file, but the attributes are returned in an opaque format, 
so it's not possible :-(

> One of the gotchas to be aware of, is quotas. My wrapper script would 
> clone quotas from the old uid to the new uid. That's easy enough. 
> However, keep in mind, if the uid is over their quota your chown 
> operation will absolutely kill your cluster. Once a user is over their 
> quota the filesystem seems to want to quiesce all of its accounting 
> information on every filesystem operation for that user. I would check 
> for adequate quota headroom for the user in question and abort if there 
> wasn't enough.

Had not thought of that one. Surely the simple solution would be to set 
the quotas on the mapped UIDs/GIDs after the change has been made. Then 
the filesystem operation would not be for the user over quota but for 
the new user?

The other alternative is to dump the quotas to file and remove them. 
Change the UIDs and GIDs, then restore the quotas on the new UIDs/GIDs.

As I said earlier, surely the end users have no access to the file system 
while the modifications are being made. If they do, all hell is going to 
break loose IMHO.

> 
> The ACL changes were much more tricky. There's no way, of which I'm 
> aware, to atomically update ACL entries. You run the risk that you could 
> clobber a user's ACL update if it occurs in the milliseconds between you 
> reading the ACL and updating it as part of the UID/GID update. 
> Thankfully we were using Posix ACLs which were easier for me to deal 
> with programmatically. I still had the security concern over symbolic 
> links appearing in paths to have their ACLs updated either maliciously 
> or organically. I was able to deal with that by modifying libacl to 
> implement ACL calls that used variants of xattr calls that took file 
> descriptors as arguments and allowed me to throw nofollow flags. That 
> code is here (
> https://github.com/aaronknister/acl/commits/nofollow). 
> I couldn't take advantage of the GPFS API's here to meet my 
> requirements, so I just walked the filesystem tree in parallel if I 
> recall correctly, retrieved every ACL and updated if necessary.
> 
> If you're using NFS4 ACLs... I don't have an easy answer for you :)

You call gpfs_getacl, walk the array of ACL entries returned changing any 
UIDs/GIDs as required and then call gpfs_putacl. You can modify both 
Posix and NFSv4 ACLs with this call. Given that both calls only take a 
path to the file, that is another reason to use nftw rather than the GPFS 
inode scan.

As I understand it, even if your file system is set to an ACL type of 
"all", any individual file/directory can only have either Posix *or* 
NFSv4 ACLs (ignoring the fact you can set your filesystem's ACL type to 
the undocumented Samba), so it can all be handled automatically.

Note if you are using nftw to walk the file system then you get a 
standard system stat structure for every file/directory and you could 
just skip symbolic links. I don't think you can set an ACL on a symbolic 
link anyway. You certainly can't set standard permissions on them.
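
Skipping the links in an ACL pass could look roughly like this. Again a 
sketch only: acl_fix_fn and acl_pass are hypothetical names, and count_only 
is a stand-in for the per-path hook. On GPFS the hook is where the 
gpfs_getacl/gpfs_putacl round trip would go, which I have left out so the 
example builds without the GPFS headers.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical per-path hook; returns 0 on success, non-zero to stop. */
typedef int (*acl_fix_fn)(const char *path);

static acl_fix_fn g_fix;        /* nftw() cannot carry user data */
static long g_skipped_links;
static long g_seen;

/* Stand-in hook that only counts the paths it is handed. */
static int count_only(const char *path)
{
    (void)path;
    g_seen++;
    return 0;
}

static int acl_visit(const char *path, const struct stat *sb,
                     int typeflag, struct FTW *ftwbuf)
{
    (void)ftwbuf;
    if (typeflag == FTW_NS)         /* stat failed, nothing to inspect */
        return 0;
    if (S_ISLNK(sb->st_mode)) {     /* with FTW_PHYS this is lstat data */
        g_skipped_links++;
        return 0;
    }
    return g_fix(path);             /* a non-zero return ends the walk */
}

/* Run fix on every non-symlink under root.
 * Returns the number of links skipped, or -1 if the walk failed. */
long acl_pass(const char *root, acl_fix_fn fix)
{
    assert(root != NULL && fix != NULL);
    g_fix = fix;
    g_skipped_links = 0;
    if (nftw(root, acl_visit, 64, FTW_PHYS) != 0)
        return -1;
    return g_skipped_links;
}
```

Calling acl_pass("/path/to/fileset", count_only) walks the tree and hands 
every file and directory, but no symbolic link, to the hook.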

It would be sensible to wrap the main loop in 
gpfs_lib_init/gpfs_lib_term in this scenario.

JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG