<div dir="auto"><div>Ed</div><div dir="auto"> Thanks for the response, I wasn't aware of those two commands. I will see if that unlocks a solution. I kind of need the test to work in a production environment, so I can't just be adding spare nodes onto the cluster and messing with file systems.</div><div dir="auto"><br></div><div dir="auto">Unfortunately the logs don't indicate when a node has returned to health, only that it's in trouble, and as we patch often we see these messages regularly.</div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto">For the second question, we would add a 2nd MEK to each file so that two independent keys from two different RKM pools would be able to unlock any file. This would give us two wholly independent paths to encrypt and decrypt a file.</div><div dir="auto"><br></div><div dir="auto">So I'm looking for a best practice example from IBM that shows this, so we don't have a dependency on a single RKM environment.</div><div dir="auto"><br></div><div dir="auto">Alec</div><div dir="auto"><br></div><div dir="auto"><br><br><div class="gmail_quote" dir="auto"><div dir="ltr" class="gmail_attr">On Wed, Aug 16, 2023, 2:02 PM Wahl, Edward &lt;<a href="mailto:ewahl@osc.edu">ewahl@osc.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="m_6725390570358789500WordSection1">
<p class="MsoNormal">> How can we verify that a key server is up and running when there are multiple key servers in an RKM pool serving a single key?<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Pretty simple. <u></u><u></u></p>
<p class="MsoNormal">- Grab a compute node/client (mark it offline if needed) and unmount all encrypted file systems.<u></u><u></u></p>
<p class="MsoNormal">- Hack the RKM.conf to point to JUST the server you want to test (and maybe a backup).<u></u><u></u></p>
<p class="MsoNormal">- Clear all keys: ‘/usr/lpp/mmfs/bin/tsctl encKeyCachePurge all’<u></u><u></u></p>
<p class="MsoNormal">- Reload the RKM.conf: ‘/usr/lpp/mmfs/bin/tsloadikm run’ (this is a great command if you need to load new certificates too)
<u></u><u></u></p>
<p class="MsoNormal">-Attempt to mount the encrypted FS, and then cat a few files.
<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">If you’ve not set up a 2<sup>nd</sup> server in your test, you will see quarantine messages in the logs for a bad KMIP server. If it works, you can clear the keys again and see how many were retrieved.
<u></u><u></u></p>
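<p class="MsoNormal">For what it's worth, the steps above could be strung together roughly like this. It is only a sketch, not anything from IBM: FS_NAME, SAMPLE_FILE and the DRY_RUN switch are placeholders invented for illustration, mmumount/mmmount are assumed to be how the file system gets unmounted and mounted on the client, and the RKM.conf edit is still done by hand beforehand. By default it is a dry run that only prints the commands.</p>
<pre>
```shell
#!/bin/sh
# Sketch of the single-server RKM test described above.
# Assumptions (not from the original steps): FS_NAME and SAMPLE_FILE are
# hypothetical placeholders, and DRY_RUN=1 (the default) only prints commands.
FS_NAME="${FS_NAME:-gpfs_enc}"
SAMPLE_FILE="${SAMPLE_FILE:-/gpfs_enc/path/to/file}"
DRY_RUN="${DRY_RUN:-1}"
BIN=/usr/lpp/mmfs/bin

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rkm_single_server_test() {
    # RKM.conf should already have been edited by hand so that it lists
    # only the key server under test (and maybe a backup).
    # 1. Unmount the encrypted file system on this client.
    run "$BIN/mmumount" "$FS_NAME" || return 1
    # 2. Purge cached keys so they have to be fetched again.
    run "$BIN/tsctl" encKeyCachePurge all || return 1
    # 3. Reload RKM.conf (and any new certificates).
    run "$BIN/tsloadikm" run || return 1
    # 4. Remount and read a file; cksum reads the whole file like cat
    #    would, without dumping the contents to the terminal.
    run "$BIN/mmmount" "$FS_NAME" || return 1
    run cksum "$SAMPLE_FILE" || return 1
}

rkm_single_server_test
```
</pre>
<p class="MsoNormal">Run it as-is to see the plan, then with DRY_RUN=0 on the test client once RKM.conf points at the server you want to check. It stops at the first step that fails, which is where you would go look at the logs.</p>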
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">>Is there any documentation or diagram officially from IBM that recommends having 2 keys from independent RKM environments for high availability as best practice that I could refer to?<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I am not an IBM-er… but I’m also not 100% sure what you are asking here. Two unrelated SKLM setups? How would you sync the keys? How would this be better than multiple replicated servers?<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Ed Wahl<u></u><u></u></p>
<p class="MsoNormal">Ohio Supercomputer Center<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> gpfsug-discuss &lt;<a href="mailto:gpfsug-discuss-bounces@gpfsug.org" target="_blank" rel="noreferrer">gpfsug-discuss-bounces@gpfsug.org</a>&gt;
<b>On Behalf Of </b>Alec<br>
<b>Sent:</b> Wednesday, August 16, 2023 3:33 PM<br>
<b>To:</b> gpfsug main discussion list &lt;<a href="mailto:gpfsug-discuss@gpfsug.org" target="_blank" rel="noreferrer">gpfsug-discuss@gpfsug.org</a>&gt;<br>
<b>Subject:</b> [gpfsug-discuss] RKM resilience questions testing and best practice<u></u><u></u></p>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Hello, we are using a remote key server with GPFS. I have two questions:<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">First question:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">How can we verify that a key server is up and running when there are multiple key servers in an RKM pool serving a single key?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">The scenario is that after maintenance, or periodically, we want to verify that all members of the pool are in service.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Second question is:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Is there any documentation or diagram officially from IBM that recommends having 2 keys from independent RKM environments for high availability as best practice that I could refer to?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Alec<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</div>
</div>
_______________________________________________<br>
gpfsug-discuss mailing list<br>
gpfsug-discuss at <a href="http://gpfsug.org" rel="noreferrer noreferrer" target="_blank">gpfsug.org</a><br>
<a href="http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org" rel="noreferrer noreferrer" target="_blank">http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org</a><br>
</blockquote></div></div></div>