[gpfsug-discuss] RKM resilience questions testing and best practice

Wahl, Edward ewahl at osc.edu
Wed Aug 16 21:56:53 BST 2023


> How can we verify that a key server is up and running when there are multiple key servers in an RKM pool serving a single key?

Pretty simple.
-Grab a compute node/client (mark it offline if needed) and unmount all encrypted file systems.
-Hack the RKM.conf to point to JUST the server you want to test (and maybe a backup).
-Clear all keys:   '/usr/lpp/mmfs/bin/tsctl encKeyCachePurge all'
-Reload the RKM.conf:  '/usr/lpp/mmfs/bin/tsloadikm run'   (this is a great command if you need to load new certificates too)
-Attempt to mount the encrypted FS, and then cat a few files.

If you have not set up a 2nd server in your test, you will see quarantine messages in the logs for a bad KMIP server. If it works, you can clear the keys again and see how many were retrieved.
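
The steps above could be wrapped in a small script along these lines. This is only a sketch, not a tested procedure: the function name test_key_server, the DRY_RUN switch, and the device and file arguments are made up for illustration, and mmumount/mmmount are the usual GPFS mount commands rather than anything named above. Editing RKM.conf is left as a manual step. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the single-key-server test described above.
# Assumption: GPFS binaries live in /usr/lpp/mmfs/bin (as in the commands above).
MMFS_BIN=/usr/lpp/mmfs/bin

# DRY_RUN is a hypothetical switch: 1 (default) prints commands, 0 executes them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# test_key_server DEVICE SAMPLE_FILE
#   DEVICE      - encrypted file system device (hypothetical example: gpfs0)
#   SAMPLE_FILE - an encrypted file to read once the file system is mounted
test_key_server() {
    fs=$1
    sample=$2

    # Unmount the encrypted file system on this node only.
    run "$MMFS_BIN/mmumount" "$fs" || return 1

    # Manual step, not automated here: edit RKM.conf so that it points
    # at just the key server under test before continuing.

    # Clear all cached keys so the next access must contact the key server.
    run "$MMFS_BIN/tsctl" encKeyCachePurge all || return 1

    # Reload RKM.conf.
    run "$MMFS_BIN/tsloadikm" run || return 1

    # Remount and read a file; cksum stands in for "cat a few files".
    run "$MMFS_BIN/mmmount" "$fs" || return 1
    run cksum "$sample"
}
```

After reviewing the printed commands, something like 'test_key_server gpfs0 /gpfs0/some/encrypted/file' with DRY_RUN=0 set in the script would run them for real on the test node.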

>Is there any documentation or diagram officially from IBM that recommends having 2 keys from independent RKM environments for high availability as best practice that I could refer to?

I am not an IBM-er…  but I’m also not 100% sure what you are asking here.   Two unrelated SKLM setups? How would you sync the keys?   How would this be better than multiple replicated servers?

Ed Wahl
Ohio Supercomputer Center

From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> On Behalf Of Alec
Sent: Wednesday, August 16, 2023 3:33 PM
To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Subject: [gpfsug-discuss] RKM resilience questions testing and best practice

Hello, we are using a remote key server with GPFS. I have two questions:

First question:
How can we verify that a key server is up and running when there are multiple key servers in an RKM pool serving a single key?

The scenario is that, after maintenance or periodically, we want to verify that all members of the pool are in service.

Second question is:
Is there any documentation or diagram officially from IBM that recommends having 2 keys from independent RKM environments for high availability as best practice that I could refer to?

Alec

