RKMS (Reliable Key Management Service)

RKMS is a highly available key management service, built on top of AWS's KMS.

Objective

While AWS's KMS is an amazing service, it does not have an SLA. As a result, if KMS goes down in the region you are using, your application goes down with it, because it can no longer encrypt or decrypt data. The idea of RKMS is to replicate your application's data keys across multiple regions, so you can fall back to another region if your main region goes down.

To get a better understanding, check out my blog post about the project.

Architecture

Overview

Before we look at how RKMS is designed, let's go over the main functionalities AWS's KMS provides:

RKMS's main endpoint is GET /key?id=<id>, which roughly does the following:

  1. Look in the key/value store for a value for id
  2. If found, the value will contain mappings from KMS regions to encrypted data key
     - Pick a region
     - Decrypt encrypted data key in the selected region and return the plaintext data key returned by KMS
     - If call to KMS fails, try other regions
  3. If not found, a new key has to be created for the given id
     - Ask one of the KMS regions to generate a data key
     - Encrypt the data key in every region
     - Save all the encrypted data keys in the store for key id
  4. Return plaintext data key
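
The steps above can be sketched roughly as follows. This is a minimal illustration, not RKMS's actual code: `FakeKMS`, `get_key`, and the plain `dict` used as the key/value store are hypothetical stand-ins, and the "encryption" is a toy region prefix rather than real cryptography. The sketch also assumes that a failure while encrypting in any region during key creation simply propagates, which the description above does not specify.

```python
import os
from typing import Dict, List


class FakeKMS:
    """Hypothetical stand-in for a per-region KMS client (not the AWS SDK)."""

    def __init__(self, region: str, healthy: bool = True) -> None:
        self.region = region
        self.healthy = healthy

    def _check(self) -> None:
        if not self.healthy:
            raise ConnectionError(f"KMS unavailable in {self.region}")

    def generate_data_key(self) -> bytes:
        self._check()
        return os.urandom(32)

    def encrypt(self, plaintext: bytes) -> bytes:
        self._check()
        # Toy "encryption": tag the key with the region. NOT real cryptography.
        return self.region.encode() + b":" + plaintext

    def decrypt(self, ciphertext: bytes) -> bytes:
        self._check()
        prefix = self.region.encode() + b":"
        if not ciphertext.startswith(prefix):
            raise ValueError("ciphertext was not produced in this region")
        return ciphertext[len(prefix):]


def get_key(key_id: str, store: Dict[str, Dict[str, bytes]],
            regions: List[FakeKMS]) -> bytes:
    record = store.get(key_id)                      # step 1: look up the id
    if record is not None:                          # step 2: found
        for kms in regions:
            if kms.region not in record:
                continue
            try:
                return kms.decrypt(record[kms.region])
            except ConnectionError:
                continue                            # try the next region
        raise RuntimeError("no KMS region could decrypt the data key")

    plaintext = None                                # step 3: not found
    for kms in regions:
        try:
            plaintext = kms.generate_data_key()
            break
        except ConnectionError:
            continue
    if plaintext is None:
        raise RuntimeError("no KMS region could generate a data key")
    # Encrypt in every region; an unavailable region raises here (assumption).
    record = {kms.region: kms.encrypt(plaintext) for kms in regions}
    store[key_id] = record                          # plain write, for brevity
    return plaintext                                # step 4
```

With two healthy regions, the first call for an id creates and stores the key; if the first region later becomes unavailable, a second call for the same id still returns the same plaintext key by decrypting in the other region.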

Notes:

High Availability and Race Conditions

One of the benefits of RKMS is that it is stateless. As a result, one can run multiple copies of the service to avoid a single point of failure. On the other hand, running multiple copies brings up concerns regarding race conditions (e.g. creating the same key at the "same" time on multiple servers).

To address this concern, RKMS is designed around a First Write Wins concept. The last step of creating a key is to save it in the key/value store. RKMS performs a conditional write here: it only saves to the store if no value exists for the given key. As a result, when the same key is being created at the "same" time, the writes to the store happen serially and only the first write wins. The second writer then simply re-reads from the store and returns the value generated by the other RKMS server.
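
The conditional write can be illustrated with a small sketch. It assumes an in-memory store guarded by a lock in place of a real key/value store; `ConditionalStore`, `put_if_absent`, and `save_or_reread` are hypothetical names chosen for this example, not part of RKMS.

```python
import threading
from typing import Dict, Optional


class ConditionalStore:
    """Hypothetical in-memory key/value store with a conditional write."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[bytes]:
        with self._lock:
            return self._data.get(key)

    def put_if_absent(self, key: str, value: bytes) -> bool:
        """Write only if no value exists for key; report whether we wrote."""
        with self._lock:  # the lock serializes concurrent writers
            if key in self._data:
                return False
            self._data[key] = value
            return True


def save_or_reread(store: ConditionalStore, key_id: str,
                   candidate: bytes) -> bytes:
    """First write wins: a losing writer returns the winner's value."""
    if store.put_if_absent(key_id, candidate):
        return candidate
    existing = store.get(key_id)
    if existing is None:
        raise RuntimeError("value disappeared after a failed conditional write")
    return existing


store = ConditionalStore()
first = save_or_reread(store, "app", b"generated-by-server-A")
second = save_or_reread(store, "app", b"generated-by-server-B")
# Both callers end up with the value written by the first server.
```

In a real deployment the atomicity would have to come from the key/value store itself, since separate RKMS servers do not share a process-level lock.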

Get Started

Contributing

Contributions to this project are very welcome! You can even contribute by simply requesting features or reporting bugs.

Things I would like to do in the future (which you can help with!) are: