<div dir="auto">Building on this, you could use the file placement engine to assign an invalid pool name to files that have bad names, so that new files couldn't be created with bad names. However, files can still be renamed, so you would still need a policy to deal with those cases.<div dir="auto"><br></div><div dir="auto">Alec</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jul 7, 2023, 11:11 AM Wayne Sawdon <<a href="mailto:wsawdon@us.ibm.com">wsawdon@us.ibm.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="m_9193952915445869170WordSection1">
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black">The policy code uses a more or less standard Linux regexp library, so the regular expressions you use with grep should work. The catch is that the policy file is preprocessed with M4, which
makes writing regexes a bit tricky. I grabbed a comment from the code: <u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"> The policy SQL parser normally does M4 macros processing with [ ] set as the quote characters.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"> SOOOO…. We highly recommend you add an extra set of [ ] around your REGEX pattern string like this:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black">                . . .  WHERE REGEX(name, ['^[a-z]*$'])<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black"> To only match lowercase alphabetic names.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo;color:black">
<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:"Courier New";color:black"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="color:#212121">Once you’ve gotten past M4, you can either match any character that is not in a set of good characters, or match the bad characters directly:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121">REGEX(NAME, ['[^a-zA-Z0-9\_\-\.]'])   ### match when you find a character not in the good set<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121">REGEX(NAME, ['[\n\*\\]'])  ### match when you find a bad character<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121">I am not sure which is more difficult to enumerate.<u></u><u></u></span></p>
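<p class="MsoNormal"><span style="color:#212121">To compare the two approaches before putting a pattern into a policy file, here is a minimal Python sketch. It is only an illustration: Python's re module is not the regexp library the policy engine uses, so escapes inside brackets may behave differently there, and the function names and sample filenames are made up for this example.</span></p>

```python
import re

# Allowlist approach: flag any character outside the "good" set.
NOT_GOOD = re.compile(r"[^a-zA-Z0-9_.-]")

# Denylist approach: flag specific bad characters
# (newline, carriage return, backtick, asterisk, backslash).
BAD = re.compile(r"[\n\r`*\\]")


def has_not_good_char(name: str) -> bool:
    """True if the name contains a character outside the good set."""
    return NOT_GOOD.search(name) is not None


def has_bad_char(name: str) -> bool:
    """True if the name contains one of the listed bad characters."""
    return BAD.search(name) is not None


# Hypothetical sample names, just to show how the two checks differ.
samples = ["report_2023.txt", "my file.txt", "data*.csv", "line\nbreak"]
for sample in samples:
    print(repr(sample), has_not_good_char(sample), has_bad_char(sample))
```

<p class="MsoNormal"><span style="color:#212121">Note that a name with a space is caught by the allowlist but not by this denylist, which is the practical difference between the two: the allowlist errs towards flagging too much, the denylist towards flagging too little.</span></p>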
<p class="MsoNormal"><span style="color:#212121"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121">The ESCAPE clause described by Olaf is the trick we use to pass file names with bad characters through the surrounding scripts (like mmbackup, mmxcp, etc). There is code in samples/ilm that shows how to use it.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#212121"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial",sans-serif;color:#121212">-Wayne </span><span style="color:#212121"><u></u><u></u></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial",sans-serif;color:#121212"><u></u> <u></u></span></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div id="m_9193952915445869170mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">gpfsug-discuss <<a href="mailto:gpfsug-discuss-bounces@gpfsug.org" target="_blank" rel="noreferrer">gpfsug-discuss-bounces@gpfsug.org</a>> on behalf of Olaf Weiser <<a href="mailto:olaf.weiser@de.ibm.com" target="_blank" rel="noreferrer">olaf.weiser@de.ibm.com</a>><br>
<b>Date: </b>Wednesday, July 5, 2023 at 7:06 PM<br>
<b>To: </b>gpfsug main discussion list <<a href="mailto:gpfsug-discuss@gpfsug.org" target="_blank" rel="noreferrer">gpfsug-discuss@gpfsug.org</a>><br>
<b>Subject: </b>[EXTERNAL] Re: [gpfsug-discuss] Special characters in filenames<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Hello Jonathan, <u></u>
<u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">I haven't used it for a while, but I remember a customer where we masked "all" special characters with
</span><span class="m_9193952915445869170contentpasted0"><b><span style="font-size:12.0pt;font-family:"Courier New";color:#ff5454;background:white">ESCAPE</span></b></span><span style="font-size:12.0pt;color:black"><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">In fact, as far as I remember, this was an iterative process ...
</span><span style="font-size:12.0pt;font-family:"Apple Color Emoji";color:black">😉</span><span style="font-size:12.0pt;color:black">
</span><span style="font-size:12.0pt;font-family:"Apple Color Emoji";color:black">😉</span><span style="font-size:12.0pt;color:black">
<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">You're right, the docs are not really self-explanatory here.
<u></u><u></u></span></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
</div>
<div id="m_9193952915445869170Signature">
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">In my personal notes I found a slightly better example:<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<p class="m_9193952915445869170contentpasted1"><span style="font-size:12.0pt;color:black">In GPFS 3.5 we introduce an (optional) ESCAPE clause to the EXTERNAL LIST and EXTERNAL POOL rules, which allows the user-administrator to specify that path names and SHOW(strings) within the
associated file lists are encoded using an encoding based on the RFC 3986 URI percent-encoding scheme. For example:<u></u><u></u></span></p>
<p><span style="font-size:12.0pt;font-family:"Courier New";color:black">RULE 'xp' EXTERNAL POOL 'pool-name' EXEC 'script-name' ESCAPE '%'</span><span style="font-size:12.0pt;color:black"><u></u><u></u></span></p>
<p><span style="font-size:12.0pt;font-family:"Courier New";color:black">RULE 'xl' EXTERNAL LIST 'list-name' EXEC 'script-name' ESCAPE '%/+@#'</span><span style="font-size:12.0pt;color:black"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
</div>
<div>
<div>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">ESCAPE '%' specifies that all characters except the "unreserved" characters in the set a-zA-Z0-9-_.~ are encoded as %XX where XX comprises 2 hexadecimal digits. The GPFS ESCAPE clause allows
you to add to the set of "unreserved" characters.<u></u><u></u></span></p>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">For example, ESCAPE '%/+@#', specifies that none of the characters in "/+@#" are escaped, so that a path name like "/root/directory/@abc+def#ghi.jkl" will appear in a file list with no escape
sequences, whereas under ESCAPE '%', specifying a rigorous RFC3986 encoding yields "%2Froot%2Fdirectory%2F%40abc%2Bdef%23ghi.jkl".<u></u><u></u></span></p>
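<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">The two encodings described above can be reproduced with a short Python sketch. This is only an illustration of the percent-encoding scheme using Python's standard urllib.parse module as a stand-in; it is not the code GPFS uses, and how a consumer script actually reads the file list is left out.</span></p>

```python
from urllib.parse import quote, unquote

# The path name used as the example in the notes above.
path = "/root/directory/@abc+def#ghi.jkl"

# Strict encoding, comparable to ESCAPE '%': only the unreserved
# characters (letters, digits, "-", "_", ".", "~") are left as-is.
strict = quote(path, safe="")

# Relaxed encoding, comparable to ESCAPE '%/+@#': additionally leave
# "/", "+", "@" and "#" unescaped.
relaxed = quote(path, safe="/+@#")

print(strict)
print(relaxed)

# A consumer script can decode either form back to the original path.
print(unquote(strict) == path)
```

<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">The benefit of the stricter form is that characters such as newlines or spaces can never appear raw in the file list, so a script that splits the list on whitespace or line breaks stays safe.</span></p>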
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">At least for us, using ESCAPE did the trick back then.<u></u><u></u></span></p>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">Maybe it is useful for your case here as well.<u></u><u></u></span></p>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">cheers<u></u><u></u></span></p>
<p class="m_9193952915445869170contentpasted2"><span style="font-size:12.0pt;color:black">laff<u></u><u></u></span></p>
</div>
</div>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black"><u></u> <u></u></span></p>
</div>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="0" width="100%" align="center">
</div>
<div id="m_9193952915445869170divRplyFwdMsg">
<p class="MsoNormal"><b><span style="color:black">From:</span></b><span style="color:black"> gpfsug-discuss <<a href="mailto:gpfsug-discuss-bounces@gpfsug.org" target="_blank" rel="noreferrer">gpfsug-discuss-bounces@gpfsug.org</a>> on behalf of Jonathan Buzzard <<a href="mailto:jonathan.buzzard@strath.ac.uk" target="_blank" rel="noreferrer">jonathan.buzzard@strath.ac.uk</a>><br>
<b>Sent:</b> Thursday, July 6, 2023 00:20<br>
<b>To:</b> gpfsug main discussion list <<a href="mailto:gpfsug-discuss@gpfsug.org" target="_blank" rel="noreferrer">gpfsug-discuss@gpfsug.org</a>><br>
<b>Subject:</b> [EXTERNAL] [gpfsug-discuss] Special characters in filenames</span>
<u></u><u></u></p>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><br>
After another support incident that eventually transpired to be down to <br>
the user using what I will call stupid characters in their filenames (we <br>
include a section on not doing this in our mandatory training so no <br>
excuse) I have been musing on using the policy engine to periodically <br>
produce lists of files that have stupid characters in their filenames so <br>
we can proactively educate the users and get them to rename their files <br>
to something sensible :-)<br>
<br>
The issue is of course the stupid characters include all the regular <br>
expression wildcard characters in addition to \n, \r and backticks. I am <br>
coming up short on escaping them correctly in REGEX() for the policy engine.<br>
<br>
The documentation appears to be devoid of help on the subject, because <br>
of course only a fool would be including these characters in their <br>
filenames...<br>
<br>
Anyone any idea on how to do this?<br>
<br>
<br>
JAB.<br>
<br>
-- <br>
Jonathan A. Buzzard Tel: +44141-5483420<br>
HPC System Administrator, ARCHIE-WeSt.<br>
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG<br>
<br>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at <a href="http://gpfsug.org" target="_blank" rel="noreferrer">gpfsug.org</a><br>
<a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org" target="_blank" rel="noreferrer">http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org</a>
<u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div>