[gpfsug-discuss] [EXTERNAL] Re: Handling bad file names in policies?

Jonathan Buzzard jonathan.buzzard at strath.ac.uk
Mon Oct 11 11:47:49 BST 2021


On 11/10/2021 09:55, Peter Childs wrote:
> We've had this same issue with characters that are fine in Scale but
> Protect can't handle. Normally it's because some script has embedded a
> newline in the middle of a file name, and normally we end up renaming
> that file by inode number
> 
> find . -inum 9975226749 -exec mv {} badfilename \;
> 
> mostly because we can't even type the filename at the command
> prompt.
> 

You can, it just requires know-how. I will freely admit it took me a long 
time to work out how to do it. The dirty alternative that sometimes 
works is to use wildcards.
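
For anyone wondering what the know-how looks like, here is a minimal 
sketch of one way to do it, not necessarily the only one. It assumes 
bash (the $'...' ANSI-C quoting is not available in every shell) and 
works in a throwaway directory from mktemp. The names demo_dir, 
fixed_exact and fixed_wildcard are made up for illustration.

```shell
#!/usr/bin/env bash

# Scratch directory so nothing real is touched (illustrative only).
demo_dir=$(mktemp -d)

# Create a file with a newline in its name. Inside $'...' bash turns \n
# into a literal newline character.
touch -- "$demo_dir"/$'bad\nname'

# Way 1: type the name exactly, again using $'...'.
mv -- "$demo_dir"/$'bad\nname' "$demo_dir/fixed_exact"

# Way 2, the dirty one: let ? stand in for the awkward character.
# This only works cleanly when the pattern matches exactly one file.
touch -- "$demo_dir"/$'bad\nname'
mv -- "$demo_dir"/bad?name "$demo_dir/fixed_wildcard"

ls -- "$demo_dir"
```

The wildcard route is quicker to type but will happily pick up any 
other file that fits the pattern, so check with ls before using mv.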

What gets me is I have never created a single file with "problem" 
characters in the filename in over 30 years of computing. Well apart 
from deliberately trying to work out how the hell you do it, and it's 
not easy.

I think the most likely answer for newlines in file names is cut and 
paste into a file save dialogue box.

> However it's not always just newline characters; currently we've got a
> few files with unprintable characters in them, but it's normally fewer
> than 50 files every few months, so it is easy to handle manually.

Mostly I find the non-newline issues are down to "foreigners" using 
something other than UTF-8 (aka random stupid Windows code pages) to 
give files names in their native language.

You can usually work out what the filename is supposed to be once you 
know the nationality of the file owner. Again I think this happens due 
to cut and paste from text documents in non-UTF-8 encodings.

So for example, take something Cyrillic in code page 1251, copy and 
paste it into a file save dialogue box, and you end up with a filename 
containing unprintable characters.
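
As a rough illustration of working out what such a name was supposed to 
be, the sketch below takes six bytes that are not valid UTF-8 and 
reinterprets them as code page 1251 using iconv. The byte string is an 
invented example (the Russian word for "hello"), not taken from a real 
file, and it assumes an iconv that knows the CP1251 charset name, as 
glibc's does.

```shell
#!/usr/bin/env bash

# Six bytes (octal escapes) that spell a Cyrillic word in code page 1251.
# Read as UTF-8 they are invalid, which is why they look unprintable.
raw=$(printf '\317\360\350\342\345\362')

# Reinterpret the bytes as CP1251 and re-encode them as UTF-8.
decoded=$(printf '%s' "$raw" | iconv -f CP1251 -t UTF-8)

printf '%s\n' "$decoded"    # prints: Привет
```

If the guess at the code page is wrong the result is usually obvious 
nonsense, so trying the likely candidates for the owner's language is 
normally enough.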

> I normally end up looking at /data/mmbackup.unsupported which is the
> standard output from mmapplypolicy and extracting the file names from
> it and emailing the users concerned to assist them in working out
> what went wrong.
> 
> I guess you could automate the parsing of this file at the end of the
> backup process and do something interesting with it.
> 

Email the owner of the file and tell them it's not being backed up and 
won't be till they "fix" the file name so that backup software can 
process it.

If it is just a newline I would be tempted to have them automatically 
renamed sans the newline, and then send the file owner an email (per 
file) letting them know what has happened. If their inbox is spammed 
that will hopefully prompt them to stop doing it :-)
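
A minimal sketch of what that automatic rename might look like is 
below. It is an assumption-laden outline, not a tested production 
script: it assumes bash, GNU find and GNU stat, and the functions 
notify_owner and strip_newlines_in_names are hypothetical names. 
notify_owner only prints a line here; a real site would hook in 
whatever mail mechanism it already uses.

```shell
#!/usr/bin/env bash

# Hypothetical placeholder: report one rename per line instead of mailing.
notify_owner() {
    local owner=$1 old=$2 new=$3
    printf 'notify %s: renamed %q -> %q\n' "$owner" "$old" "$new"
}

# Remove newline characters from the names of everything below a directory.
strip_newlines_in_names() {
    local root=$1 path dir base new owner
    # -depth handles children before their parent directories, and
    # -print0 with read -d '' keeps names containing newlines intact.
    while IFS= read -r -d '' path; do
        dir=${path%/*}
        base=${path##*/}
        new=${base//$'\n'/}    # the name with every newline deleted
        # Skip rather than clobber an existing file or create an empty name.
        if [ -z "$new" ] || [ -e "$dir/$new" ]; then
            printf 'skipped %q\n' "$path" >&2
            continue
        fi
        owner=$(stat -c %U -- "$path")    # GNU stat: owning user name
        mv -- "$path" "$dir/$new"
        notify_owner "$owner" "$path" "$dir/$new"
    done < <(find "$root" -mindepth 1 -depth -name $'*\n*' -print0)
}
```

Usage would be along the lines of strip_newlines_in_names /some/test/tree, 
ideally on a scratch copy first. Skipping on a name collision is a 
deliberate choice: silently overwriting someone's file would be worse 
than leaving it out of the backup.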


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


