Awesome
ngx-distributed-shm
This projet is memcached like server based on Hazelcast and Vertx. The goals of the project is to build an easy-to-use distributed memory storage with the nginx shared memory semantic for use with lua nginx plugin.
The semantic of the protocol is the same as the lua.shared semantic.
Status
Production Ready since 06/2018.
Use cases
This project was successfully used to store rate limiting counter across a cluster of a Nginx based API Gateway in a French banking company.
This project was successfully used to distribute OpenID Connect Replying Party (based on zmartzone/lua-resty-openidc ) web session with the library bungle/lua-resty-session in a french banking company.
Principle Schema
------------------- ------------------- -------------------
| | | | | |
| Nginx/Openresty | | Nginx/Openresty | | Nginx/Openresty |
| 1 | | 2 | | 3 |
------------------- ------------------- -------------------
| (127.0.0.1:4321) | (127.0.0.1:4321) | (127.0.0.1:4321)
------------------- ------------------- -------------------
| | | | | |
| DSHM | | DSHM | | DSHM |
| 1 | | 2 | | 3 |
------------------- ------------------- -------------------
| | |
| Data Replication |
|______________________|_______________________|
Dependencies
ngx-distributed-shm depends on followings libraries :
- Hazelcast IMDG for implementing the distributed storage
- Vertx for implementing the communication protocol
The dependencies above are automatically included in the distribution jar with maven shade plugin.
Download
You can download distribution jar directly from the GitHub release.
Alternatively, you can download the distribution jar via maven :
mvn dependency:copy -Dartifact=io.github.grrolland:ngx-distributed-shm:1.0.2:jar -DoutputDirectory=.
Installation
You need a JVM (Java 8 at least) to run the distributed storage.
Simply get the distribution jar and copy it on your filesystem.
How it works
When the storage startup, it creates an hazelcast instance and a vertx connector to communicate with it. When you start a second instance, it joins the first with the hazelcast protocol.
The protocol expose commands to interact with the distributed storage :
- SET : set a value in the storage
- GET : get a value from the storage
- TOUCH : update the ttl of a key
- DELETE : delete a key from the storage
- INCR : increment the value for a key
- RATE_LIMITER : consume a token in the sliding window rate limiter identified by the key
In a clustered deployment (2 or more instances), a client need to connect to only one instance to see all the storage. The goal is to provide a near storage associated with a nginx instance.
Startup
This command startup the storage on default port (4321), with default replication options (UDP multicast) :
java -jar ngx-distributed-shm.jar
Or :
java -cp ngx-distributed-shm.jar io.github.grrolland.hcshm.Main
To startup with a configuration directory ./conf (with hazelcast.xml and logback.xml) use :
java -cp ngx-distributed-shm.jar:./conf io.github.grrolland.hcshm.Main
The dist/bin directory contains startup and shutdown scripts.
Startup Options
-Dngx-distributed-shm.port=port default : 4321
Startup the storage and bind the protocol port on <port>.
This command startup the storage on the port 40000 and the 127.0.0.1 address :
java -Dngx-distributed-shm.port=40000 -jar ngx-distributed-shm.jar
-Dngx-distributed-shm.bind_address=address default : 127.0.0.1
Startup the storage and bind the protocol on address <address>.
This command startup the storage on the 192.168.0.1 address :
java -Dngx-distributed-shm.bind_address=192.168.0.1 -jar ngx-distributed-shm.jar
Configure the hazelcast IMDG map for replication
The hazelcast IMDG is configured with a configuration file which must be present in the classpath. The file must be named hazelcast.xml.
This is an example of this file :
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.hazelcast.com/schema/config"
xsi:schemaLocation="http://www.hazelcast.com/schema/config
http://www.hazelcast.com/schema/config/hazelcast-config-5.0.xsd">
<cluster-name>ngx-dshm</cluster-name>
<network>
<port auto-increment="false">5701</port>
<join>
<multicast enabled="false"/>
<tcp-ip enabled="true">
<interface>127.0.0.1</interface>
<member-list>
<member>127.0.0.1</member>
</member-list>
</tcp-ip>
<aws enabled="false"/>
</join>
</network>
<map name="default">
<in-memory-format>BINARY</in-memory-format>
<backup-count>1</backup-count>
<async-backup-count>0</async-backup-count>
<time-to-live-seconds>0</time-to-live-seconds>
<max-idle-seconds>0</max-idle-seconds>
<eviction eviction-policy="NONE"/>
<merge-policy>PutIfAbsentMergePolicy</merge-policy>
<cache-deserialized-values>INDEX-ONLY</cache-deserialized-values>
</map>
</hazelcast>
The reference documentation for this configuration is here : https://docs.hazelcast.org/docs/3.12.1/manual/html-single/index.html#tcp-ip-element
This configuration works well for a two member cluster of the distributed shared memory.
Logging
The dist/conf directory contains an example logback.xml which control logging. The example file is the following :
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<contextName>ngx-dshm</contextName>
<jmxConfigurator/>
<appender name="FILE-GLOBAL"
class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${ngx-distributed-shm.log_dir}/ngx-dshm.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<fileNamePattern>${ngx-distributed-shm.log_dir}/ngx-dshm-%d{yyyy-MM-dd}.log</fileNamePattern>
<maxHistory>30</maxHistory>
</rollingPolicy>
<encoder>
<pattern>%d{yyyy-MM-dd HH:mm:ss.SSS,Europe/Paris} [%thread] %-5level %logger{36} - %msg %xEx{20}%n</pattern>
</encoder>
</appender>
<appender name="ASYNC"
class="ch.qos.logback.classic.AsyncAppender">
<appender-ref ref="FILE-GLOBAL"/>
</appender>
<root level="info">
<appender-ref ref="ASYNC"/>
</root>
</configuration>
Starting the dshm with the -Dngx-distributed-shm.log_dir=log_dir permit to choose the logging directory.
Protocol
The protocol is a test protocol inspired by the memcached protocol.
There is two part in the protocol request : the command line part and the optional data part.
Command line part
The command line part has the following format :
COMMAND ARG1 ARG2 ARGN\r\n
The ARG1 is alway the key
Anatomy of a key : region cache support
The distributed shared memory support region partitioning. The format of the key control the region where will be located the value.
This key key1 will be located in the region region1 :
region1::key1
This key key1 will be located in the default region :
key1
This permit to control the hazelcast map where will be stored the key/value.
Data part
The data part is a bytes stream which the length is specified in the command part.
For example the second argument of the SET command set the length of the data to send :
SET key 0 4\r\n
1234
This command set the key "key" at the value "1234"
Response
The response protocol is fairly simple. When the command is executed the response have the following format :
\r\nRESPONSE response_arg\r\n
The following terms are used in the response :
DONE
The command is successfully executed. There is no argument.
ERROR
There is an error in the command. the argument is the error message.
Error Messages
malformed_request
The command request is malformed. There is an error in the request.
not_found
The key involved in the command is not found.
Commands
GET <key>
with data: no
Get the value of the key in the storage. This operation is atomic.
GET key\r\n
SET <key> <expiration> <length>
with data: yes
Set a value of length <length> for the key <key> in the storage with the expiration time in second <expiration>. This operation is atomic.
When <expiration> is 0, the key don't expire.
SET key 10 4\r\n
1234
TOUCH <key> <expiration>
with data: no
Set the expiration time in second <expiration> for the key <key>. This operation is atomic.
When <expiration> is 0, the key don't expire. When the key does not exist the command does nothing.
TOUCH key 10\r\n
DELETE <key>
with data: no
Delete the key <key> from the storage. This operation is atomic. When the key does not exist the command does nothing.
DELETE key\r\n
INCR <key> <value> <init> <initial_expiration>
with data: no
Increment the value of the key <key> with <value> if the key exists and represent an integer.
If the value is not an integer, this operation has no effect.
If the key exist, initial_expiration has no effect.
If the key don't exist or is expired, this operation create the key and init the value to <value>+<init>. In that case, if initial_expiration is not provided or 0, the key will never expire
This operation is atomic.
INCR key -1 0\r\n
INCR key -1 0 60\r\n
RATE_LIMITER <key> <capacity> <duration>
with data: no
Consumes a token in a sliding window rate limiter with the key key
. The sliding window duration is configured with duration
seconds. The
rate limiter is created automatically
The command attempts to consume a token and return the number of remaining available tokens. If there were no more tokens available, the
command returns -1, otherwise
the command return the number of tokens available between 0 and capacity
note : GET command with this key return the available tokens
This operation is atomic.
Example : consumes a token in rate limiter key
with capacity 1000 tokens every 10 seconds
RATE_LIMITER key 1000 10\r\n
FLUSHALL [region]
with data: no
Remove all the key from the region. The region is optional. Without region parameter, the default region is flush.
This operation is atomic.
FLUSHALL\r\n
Or :
FLUSHALL region1\r\n
QUIT
with data: no
Close the connection with the server.
QUIT\r\n
LUA libraries support
The lua libraries (lua/dshm.lua) is used to pilot the shared memory. The library should be installed in the resty directory of the openresty distribution.
This is an exemple to use it :
local dshm = require "resty.dshm"
local store = dshm:new()
store:connect("127.0.0.1", 4321)
store:set("key", 10)
local value = store:get("key")
store:incr("key")
store:delete("key")
Resty Session support
This module could be used to activate session replication with the excellent lua library Resty Session (https://github.com/bungle/lua-resty-session)
To use it, copy the lua extension in your resty/session/storage directory and use this type of configuration in your nginx.conf :
set $session_storage dshm;
set $session_dshm_region sessions;
set $session_dshm_connect_timeout 1000; # (in milliseconds)
set $session_dshm_send_timeout 1000; # (in milliseconds)
set $session_dshm_read_timeout 1000; # (in milliseconds)
set $session_dshm_host 127.0.0.1;
set $session_dshm_port 4321;
set $session_dshm_pool_name sessions;
set $session_dshm_pool_timeout 1000; # (in milliseconds)
set $session_dshm_pool_size 10;
set $session_dshm_pool_backlog 10;
set $session_secret base64_encoded_string;
The session_storage parameter control the storage module to be used.
Docker
-
A Docker image can be built using the provided Dockerfile:
docker build \ --force-rm \ --squash \ -t 'local/docker-ngx-distributed-shm' \ .
-
Run a container:
docker run --rm -it \ -u root \ --name docker-ngx-distributed-shm \ 'local/docker-ngx-distributed-shm'
-
An official docker image build is available at quay.io or directly in the GitHub registry :
docker pull quay.io/grrolland/ngx-distributed-shm
docker pull ghcr.io/grrolland/ngx-distributed-shm
Kubernetes
-
See kubernetes directory for sample artefacts.
-
Use kustomize standalone or the one embedded in
kubectl
to generate kubernetes artefacts for a specific release:-
Change files in
kubernetes/overlays/test/
according to your needsconfigmap.yaml
contains a basic hazelcast configuration as yaml instead of xml for ease of reading
-
Generate artefacts and inspect:
cd kubernetes kubectl kustomize /overlays/test > kubernetes.yaml
-
Apply:
kubectl apply -f kubernetes.yaml
-
Management Center
- Optionally, the Hazelcast cluster can be monitored via Management Center