Awesome
Zenko Backbeat
OVERVIEW
Backbeat is an engine with a messaging system at its heart. It's part of Zenko, Scality’s Open Source Multi-Cloud Data Controller. Learn more about Zenko at Zenko.io
Backbeat is optimized for queuing metadata updates and dispatching work to long-running tasks in the background. The core engine can be extended for many use cases, which are called extensions, as listed below.
EXTENSIONS
Asynchronous Replication
This feature replicates objects from one S3 bucket to another S3 bucket in a different geographical region. The extension uses the local Metadata journal as the source of truth and replicates object updates in a FIFO order.
DESIGN
Please refer to the Design document
QUICKSTART
This guide assumes the following:
- Using MacOS
brew
is installed (get it here)node
is installed (version 6.9.5)npm
is installed (version 3.10.10)aws
is installed (version 1.11.1)
Run kafka and zookeeper
Install kafka and zookeeper
brew install kafka && brew install zookeeper
Make sure you have /usr/local/bin
in your PATH
env variable (or wherever
your homebrew programs are installed):
echo 'export PATH="$PATH:/usr/local/bin"' >> ~/.bash_profile
Start kafka and zookeeper servers
mkdir ~/kafka && \
cd ~/kafka && \
curl http://apache.claz.org/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz | tar xvz && \
sed 's/zookeeper.connect=.*/zookeeper.connect=localhost:2181\/backbeat/' \
kafka_2.11-0.11.0.0/config/server.properties > \
kafka_2.11-0.11.0.0/config/server.properties.backbeat
Start the zookeeper server:
zookeeper-server-start ~/kafka/kafka_2.11-0.11.0.0/config/zookeeper.properties
In a new shell, start the kafka server:
kafka-server-start ~/kafka/kafka_2.11-0.11.0.0/config/server.properties.backbeat
Create a zookeeper node and kafka topic
In a new shell, connect to the zookeeper server with the ZooKeeper chroot
/backbeat
path:
zkCli -server localhost:2181/backbeat
Create the replication-populator
node:
create /replication-populator my_data
We may leave the zookeeper server now:
quit
Create the backbeat-replication
kafka topic:
kafka-topics --create \
--zookeeper localhost:2181/backbeat \
--replication-factor 1 \
--partitions 1 \
--topic backbeat-replication
Run Scality Components
Start Vault and Scality S3 servers
Start the Vault server (this requires access to the private Vault repository):
git clone https://github.com/scality/Vault ~/replication/vault && \
cd ~/replication/vault && \
npm i && \
chmod 400 ./tests/utils/keyfile && \
VAULT_DB_BACKEND=MEMORY node vaultd.js
In a new shell, start the Scality S3 server:
git clone https://github.com/scality/s3 ~/replication/s3 && \
cd ~/replication/s3 && \
npm i && \
S3BACKEND=file S3VAULT=scality npm start
Setup replication with backbeat
In a new shell, clone backbeat:
git clone https://github.com/scality/backbeat ~/replication/backbeat && \
cd ~/replication/backbeat && \
npm i
Now, create an account and keys:
VAULTCLIENT=~/replication/backbeat/node_modules/vaultclient/bin/vaultclient && \
$VAULTCLIENT create-account \
--name backbeatuser \
--email dev@null \
--port 8600 >> backbeat_user_credentials && \
$VAULTCLIENT generate-account-access-key \
--name backbeatuser \
--port 8600 >> backbeat_user_credentials && \
cat backbeat_user_credentials
Output will look something like (this output is stored for reference in the file
backbeat_user_credentials
):
...
{
"id": "8CFJQ2Z3R6LR0WTP5VDS",
"value": "gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR",
"createDate": "2017-08-03T00:17:57Z",
"lastUsedDate": "2017-08-03T00:17:57Z",
"status": "Active",
"userId": "038628340774"
}
Store the account's credentials using the "id" and "value" fields:
aws configure --profile backbeatuser
The completed prompt should look like:
AWS Access Key ID [None]: 8CFJQ2Z3R6LR0WTP5VDS
AWS Secret Access Key [None]: gB53GM7/LpKrm6DktUUarcAOcqHS2tvKI/=CxFxR
Default region name [None]:
Default output format [None]:
Set up replication on your buckets:
node ~/replication/backbeat/bin/setupReplication.js setup \
--source-bucket source-bucket \
--source-profile backbeatuser \
--target-bucket target-bucket \
--target-profile backbeatuser
Run the backbeat queue populator:
npm --prefix ~/replication/backbeat run queue_populator
In a new shell, run the backbeat queue processor:
npm --prefix ~/replication/backbeat run queue_processor
You are now ready to put data on source-bucket
and watch it replicate to
target-bucket
!
Put an object on the source-bucket
:
echo 'content to be replicated' > replication_contents && \
aws s3api put-object \
--bucket source-bucket \
--key object-to-replicate \
--body replication_contents \
--endpoint http://localhost:8000 \
--profile backbeatuser
Check the replication status of the object we have just put:
aws s3api head-object \
--bucket source-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser
The object's "ReplicationStatus" should either be "PENDING", or if some time has passed, then it should be "COMPLETED".
Check if the object has been replicated to the target bucket:
aws s3api head-object \
--bucket target-bucket \
--key object-to-replicate \
--endpoint http://localhost:8000 \
--profile backbeatuser
After some time, the object's "ReplicationStatus" should be "REPLICA". :smiley_cat:
Structure
In our $HOME
directory, we now have the following directories:
$HOME
├── kafka
│ └── kafka_2.11-0.11.0.0
├── replication
├── backbeat
├── s3
└── vault