Awesome

ngx-distributed-shm

This projet is memcached like server based on Hazelcast and Vertx. The goals of the project is to build an easy-to-use distributed memory storage with the nginx shared memory semantic for use with lua nginx plugin.

The semantic of the protocol is the same as the lua.shared semantic.

Status

Production Ready since 06/2018.

Use cases

This project was successfully used to store rate limiting counter across a cluster of a Nginx based API Gateway in a French banking company.

This project was successfully used to distribute OpenID Connect Replying Party (based on zmartzone/lua-resty-openidc ) web session with the library bungle/lua-resty-session in a french banking company.

Principle Schema

-------------------    -------------------    -------------------
|                 |    |                 |    |                 |
| Nginx/Openresty |    | Nginx/Openresty |    | Nginx/Openresty |
|        1        |    |        2        |    |        3        |
-------------------    -------------------    -------------------
         | (127.0.0.1:4321)     | (127.0.0.1:4321)     | (127.0.0.1:4321)
-------------------    -------------------    -------------------
|                 |    |                 |    |                 |
|       DSHM      |    |       DSHM      |    |       DSHM      |
|        1        |    |        2        |    |        3        |
-------------------    -------------------    -------------------
         |                      |                       |
         |                Data Replication              |
         |______________________|_______________________|

Dependencies

ngx-distributed-shm depends on followings libraries :

Hazelcast IMDG for implementing the distributed storage
Vertx for implementing the communication protocol

The dependencies above are automatically included in the distribution jar with maven shade plugin.

Download

You can download distribution jar directly from the GitHub release.

Alternatively, you can download the distribution jar via maven :

mvn dependency:copy -Dartifact=io.github.grrolland:ngx-distributed-shm:1.0.2:jar -DoutputDirectory=.

Installation

You need a JVM (Java 8 at least) to run the distributed storage.

Simply get the distribution jar and copy it on your filesystem.

How it works

When the storage startup, it creates an hazelcast instance and a vertx connector to communicate with it. When you start a second instance, it joins the first with the hazelcast protocol.

The protocol expose commands to interact with the distributed storage :

SET : set a value in the storage
GET : get a value from the storage
TOUCH : update the ttl of a key
DELETE : delete a key from the storage
INCR : increment the value for a key
RATE_LIMITER : consume a token in the sliding window rate limiter identified by the key

In a clustered deployment (2 or more instances), a client need to connect to only one instance to see all the storage. The goal is to provide a near storage associated with a nginx instance.

Startup

This command startup the storage on default port (4321), with default replication options (UDP multicast) :

java -jar ngx-distributed-shm.jar

Or :

java -cp ngx-distributed-shm.jar io.github.grrolland.hcshm.Main

To startup with a configuration directory ./conf (with hazelcast.xml and logback.xml) use :

java -cp ngx-distributed-shm.jar:./conf io.github.grrolland.hcshm.Main

The dist/bin directory contains startup and shutdown scripts.

Startup Options

-Dngx-distributed-shm.port=port default : 4321

Startup the storage and bind the protocol port on <port>.

This command startup the storage on the port 40000 and the 127.0.0.1 address :

  java -Dngx-distributed-shm.port=40000 -jar ngx-distributed-shm.jar

-Dngx-distributed-shm.bind_address=address default : 127.0.0.1

Startup the storage and bind the protocol on address <address>.

This command startup the storage on the 192.168.0.1 address :

  java -Dngx-distributed-shm.bind_address=192.168.0.1 -jar ngx-distributed-shm.jar

Configure the hazelcast IMDG map for replication

The hazelcast IMDG is configured with a configuration file which must be present in the classpath. The file must be named hazelcast.xml.

This is an example of this file :

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns="http://www.hazelcast.com/schema/config"
           xsi:schemaLocation="http://www.hazelcast.com/schema/config
           http://www.hazelcast.com/schema/config/hazelcast-config-5.0.xsd">

    <cluster-name>ngx-dshm</cluster-name>

    <network>
        <port auto-increment="false">5701</port>
        <join>
            <multicast enabled="false"/>
            <tcp-ip enabled="true">
                <interface>127.0.0.1</interface>
                <member-list>
                    <member>127.0.0.1</member>
                </member-list>
            </tcp-ip>
            <aws enabled="false"/>
        </join>
    </network>

    <map name="default">
        <in-memory-format>BINARY</in-memory-format>
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <time-to-live-seconds>0</time-to-live-seconds>
        <max-idle-seconds>0</max-idle-seconds>
        <eviction eviction-policy="NONE"/>
        <merge-policy>PutIfAbsentMergePolicy</merge-policy>
        <cache-deserialized-values>INDEX-ONLY</cache-deserialized-values>
    </map>

</hazelcast>

The reference documentation for this configuration is here : https://docs.hazelcast.org/docs/3.12.1/manual/html-single/index.html#tcp-ip-element

This configuration works well for a two member cluster of the distributed shared memory.

Logging

The dist/conf directory contains an example logback.xml which control logging. The example file is the following :

<?xml version="1.0" encoding="UTF-8"?>
<configuration>

    <contextName>ngx-dshm</contextName>
    <jmxConfigurator/>

    <appender name="FILE-GLOBAL"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${ngx-distributed-shm.log_dir}/ngx-dshm.log</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${ngx-distributed-shm.log_dir}/ngx-dshm-%d{yyyy-MM-dd}.log</fileNamePattern>
            <maxHistory>30</maxHistory>
        </rollingPolicy>
        <encoder>
            <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS,Europe/Paris} [%thread] %-5level %logger{36} - %msg %xEx{20}%n</pattern>
        </encoder>
    </appender>

    <appender name="ASYNC"
              class="ch.qos.logback.classic.AsyncAppender">
        <appender-ref ref="FILE-GLOBAL"/>
    </appender>

    <root level="info">
        <appender-ref ref="ASYNC"/>
    </root>

</configuration>

Starting the dshm with the -Dngx-distributed-shm.log_dir=log_dir permit to choose the logging directory.

Protocol

The protocol is a test protocol inspired by the memcached protocol.

There is two part in the protocol request : the command line part and the optional data part.

Command line part

The command line part has the following format :

COMMAND ARG1 ARG2 ARGN\r\n

The ARG1 is alway the key

Anatomy of a key : region cache support

The distributed shared memory support region partitioning. The format of the key control the region where will be located the value.

This key key1 will be located in the region region1 :

region1::key1

This key key1 will be located in the default region :

key1

This permit to control the hazelcast map where will be stored the key/value.

Data part

The data part is a bytes stream which the length is specified in the command part.

For example the second argument of the SET command set the length of the data to send :

SET key 0 4\r\n
1234

This command set the key "key" at the value "1234"

Response

The response protocol is fairly simple. When the command is executed the response have the following format :

\r\nRESPONSE response_arg\r\n

The following terms are used in the response :

DONE

The command is successfully executed. There is no argument.

ERROR

There is an error in the command. the argument is the error message.

Error Messages

malformed_request

The command request is malformed. There is an error in the request.

not_found

The key involved in the command is not found.

Commands

GET <key>

with data: no

Get the value of the key in the storage. This operation is atomic.

GET key\r\n

SET <key> <expiration> <length>

with data: yes

Set a value of length <length> for the key <key> in the storage with the expiration time in second <expiration>. This operation is atomic.

When <expiration> is 0, the key don't expire.

SET key 10 4\r\n
1234

TOUCH <key> <expiration>

with data: no

Set the expiration time in second <expiration> for the key <key>. This operation is atomic.

When <expiration> is 0, the key don't expire. When the key does not exist the command does nothing.

TOUCH key 10\r\n

DELETE <key>

with data: no

Delete the key <key> from the storage. This operation is atomic. When the key does not exist the command does nothing.

DELETE key\r\n

INCR <key> <value> <init> <initial_expiration>

with data: no

Increment the value of the key <key> with <value> if the key exists and represent an integer.

If the value is not an integer, this operation has no effect.

If the key exist, initial_expiration has no effect.

If the key don't exist or is expired, this operation create the key and init the value to <value>+<init>. In that case, if initial_expiration is not provided or 0, the key will never expire

This operation is atomic.

INCR key -1 0\r\n

INCR key -1 0 60\r\n

RATE_LIMITER <key> <capacity> <duration>

with data: no

Consumes a token in a sliding window rate limiter with the key key. The sliding window duration is configured with duration seconds. The rate limiter is created automatically

The command attempts to consume a token and return the number of remaining available tokens. If there were no more tokens available, the command returns -1, otherwise the command return the number of tokens available between 0 and capacity

note : GET command with this key return the available tokens

This operation is atomic.

Example : consumes a token in rate limiter key with capacity 1000 tokens every 10 seconds

RATE_LIMITER key 1000 10\r\n

FLUSHALL [region]

with data: no

Remove all the key from the region. The region is optional. Without region parameter, the default region is flush.

This operation is atomic.

FLUSHALL\r\n

Or :

FLUSHALL region1\r\n

QUIT

with data: no

Close the connection with the server.

QUIT\r\n

LUA libraries support

The lua libraries (lua/dshm.lua) is used to pilot the shared memory. The library should be installed in the resty directory of the openresty distribution.

This is an exemple to use it :

local dshm = require "resty.dshm"

local store = dshm:new()

store:connect("127.0.0.1", 4321)

store:set("key", 10)
local value = store:get("key")
store:incr("key")
store:delete("key")

Resty Session support

This module could be used to activate session replication with the excellent lua library Resty Session (https://github.com/bungle/lua-resty-session)

To use it, copy the lua extension in your resty/session/storage directory and use this type of configuration in your nginx.conf :

set $session_storage               dshm;
set $session_dshm_region           sessions;
set $session_dshm_connect_timeout  1000; # (in milliseconds)
set $session_dshm_send_timeout     1000; # (in milliseconds)
set $session_dshm_read_timeout     1000; # (in milliseconds)
set $session_dshm_host             127.0.0.1;
set $session_dshm_port             4321;
set $session_dshm_pool_name        sessions;
set $session_dshm_pool_timeout     1000; # (in milliseconds)
set $session_dshm_pool_size        10;
set $session_dshm_pool_backlog     10;
set $session_secret                base64_encoded_string;

The session_storage parameter control the storage module to be used.

Docker

A Docker image can be built using the provided Dockerfile:

docker build \
  --force-rm \
  --squash \
  -t 'local/docker-ngx-distributed-shm' \
  .

Run a container:

docker run --rm -it \
  -u root \
  --name docker-ngx-distributed-shm \
  'local/docker-ngx-distributed-shm'

An official docker image build is available at quay.io or directly in the GitHub registry :

docker pull quay.io/grrolland/ngx-distributed-shm

docker pull ghcr.io/grrolland/ngx-distributed-shm

Kubernetes

See kubernetes directory for sample artefacts.
Use kustomize standalone or the one embedded in kubectl to generate kubernetes artefacts for a specific release:
- Change files in kubernetes/overlays/test/ according to your needs
  - configmap.yaml contains a basic hazelcast configuration as yaml instead of xml for ease of reading
- Generate artefacts and inspect:
```
cd kubernetes
kubectl kustomize /overlays/test > kubernetes.yaml
```
- Apply: kubectl apply -f kubernetes.yaml

Management Center

Optionally, the Hazelcast cluster can be monitored via Management Center