[gpfsug-discuss] RKM resilience questions testing and best practice

Jan-Frode Myklebust janfrode at tanso.net
Thu Aug 17 16:08:29 BST 2023


Your second KMIP server doesn’t need to have an active replication
relationship with the first one — it just needs to contain the same MEK. So
you could do a one-time replication/copy between them, and they would not
need to see each other again.

I don’t think having them host different keys will work: you won’t be
able to fetch the second key from the one server your client is connected
to, and so you won’t be able to encrypt with that key.

From what I’ve seen of KMIP setups with Scale, it’s a stupidly trivial
service. It’s just a server that will tell you the key when asked, plus some
access control to make sure no one else gets it. Also, MEKs never change
unless you actively change them in the file system policy, and in that case
you could just push the new key to all/both of your independent key servers
when you make the change.
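
For illustration, a hedged sketch of what such an MEK change could look like
when driven through the policy engine; the key IDs, the stanza name RKM_A,
the policy file name, and the file system name "fsname" are all made-up
placeholders, and the new MEK would of course have to exist on all/both key
servers before the rewrap runs:

    # rewrap.pol -- rewrap every FEK so it is wrapped with the new MEK
    # instead of the old one
    RULE 'rewrap' CHANGE ENCRYPTION KEYS FROM 'KEY-old-1234:RKM_A' TO 'KEY-new-5678:RKM_A'

    # apply the rewrap rule to the file system
    /usr/lpp/mmfs/bin/mmapplypolicy fsname -P rewrap.pol -I yes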


 -jf

On Wed, 16 Aug 2023 at 23:25, Alec <anacreo at gmail.com> wrote:

> Ed
>   Thanks for the response, I wasn't aware of those two commands.  I will
> see if that unlocks a solution. I kind of need the test to work in a
> production environment, so I can't just be adding spare nodes onto the
> cluster and fiddling with file systems.
>
> Unfortunately the logs don't indicate when a node has returned to health,
> only that it's in trouble; and since we patch often, we see these messages
> regularly.
>
>
> For the second question, we would add a second MEK to each file so that
> two independent keys from two different RKM pools would be able to unlock
> any file. This would give us two wholly independent paths to encrypt and
> decrypt a file.
>
> So I'm looking for a best-practice example from IBM that covers this, so we
> don't have a dependency on a single RKM environment.
>
> Alec
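
The double wrapping Alec describes maps onto the file system's encryption
policy rules, where a file's FEK can be wrapped by more than one encryption
specification, so either wrapping is enough to open the file. A minimal
sketch, assuming two RKM.conf stanzas named RKM_A and RKM_B and made-up key
UUIDs (a real policy would also have to carry over any existing placement
rules, since mmchpolicy replaces the whole policy):

    # twomek.pol -- wrap each FEK twice, once per MEK, so either RKM
    # environment can unlock the file
    RULE 'encA' ENCRYPTION 'E_A' IS
        ALGO 'DEFAULTNISTSP800131A'
        KEYS('KEY-aaaa-1111:RKM_A')
    RULE 'encB' ENCRYPTION 'E_B' IS
        ALGO 'DEFAULTNISTSP800131A'
        KEYS('KEY-bbbb-2222:RKM_B')
    RULE 'encryptAll' SET ENCRYPTION 'E_A','E_B' WHERE NAME LIKE '%'

    # install the policy (file system name is a placeholder)
    /usr/lpp/mmfs/bin/mmchpolicy fsname twomek.pol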
>
>
>
> On Wed, Aug 16, 2023, 2:02 PM Wahl, Edward <ewahl at osc.edu> wrote:
>
>> > How can we verify that a key server is up and running when there are
>> multiple key servers in an rkm pool serving a single key.
>>
>>
>>
>> Pretty simple.
>>
>> -Grab a compute node/client (and mark it offline if needed) and unmount all
>> encrypted file systems.
>>
>> -Hack the RKM.conf to point to JUST the server you want to test (and
>> maybe a backup)
>>
>> -Clear all keys:   ‘/usr/lpp/mmfs/bin/tsctl encKeyCachePurge all ‘
>>
>> -Reload the RKM.conf:  ‘/usr/lpp/mmfs/bin/tsloadikm run’   (this is a
>> great command if you need to load new Certificates too)
>>
>> -Attempt to mount the encrypted FS, and then cat a few files.
>>
>>
>>
>> If you’ve not set up a 2nd server in your test, you will see quarantine
>> messages in the logs for a bad KMIP server. If it works, you can clear the
>> keys again and see how many were retrieved.
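
Concretely, a rough sketch of that sequence on a test client; the file system
name, mount point, and file path are placeholders, and RKM.conf is assumed to
live in its usual place for a manually configured setup
(/var/mmfs/etc/RKM.conf):

    # 1. Unmount the encrypted file system, then edit RKM.conf so the stanza
    #    lists only the key server under test.
    mmumount encfs

    # 2. Purge cached MEKs and reload RKM.conf (also picks up new certificates).
    /usr/lpp/mmfs/bin/tsctl encKeyCachePurge all
    /usr/lpp/mmfs/bin/tsloadikm run

    # 3. Try to mount and read a few encrypted files.
    mmmount encfs
    cat /encfs/some/encrypted/file > /dev/null

    # 4. Watch /var/adm/ras/mmfs.log.latest for quarantine messages about an
    #    unreachable KMIP server; if the reads succeed, the server under test
    #    answered the key request.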
>>
>>
>>
>> >Is there any documentation or diagram officially from IBM that
>> recommends having 2 keys from independent RKM environments for high
>> availability as best practice that I could refer to?
>>
>>
>>
>> I am not an IBM-er… but I’m also not 100% sure what you are asking here.
>> Two unrelated SKLM setups? How would you sync the keys? How would this
>> be better than multiple replicated servers?
>>
>>
>>
>> Ed Wahl
>>
>> Ohio Supercomputer Center
>>
>>
>>
>> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> On Behalf Of Alec
>> Sent: Wednesday, August 16, 2023 3:33 PM
>> To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
>> Subject: [gpfsug-discuss] RKM resilience questions testing and best practice
>>
>>
>>
>> Hello, we are using a remote key server with GPFS, and I have two questions:
>>
>>
>>
>> First question:
>>
>> How can we verify that a key server is up and running when there are
>> multiple key servers in an RKM pool serving a single key?
>>
>>
>>
>> The scenario is that, after maintenance or periodically, we want to verify
>> that all members of the pool are in service.
>>
>>
>>
>> Second question is:
>>
>> Is there any official documentation or diagram from IBM that recommends
>> having 2 keys from independent RKM environments for high availability as a
>> best practice that I could refer to?
>>
>>
>>
>> Alec
>>
>>
>>
>>
>

