PSQL-Streamer
This service receives database events from PostgreSQL using the logical replication protocol and feeds them to sinks based on the configuration file settings. It can also receive events from Kafka that were generated by, e.g., another psql-streamer instance.
Event example
    {
        "Host": "db1",
        "Database": "test",
        "Table": "test",
        "Action": "insert",
        "WALPosition": 418152976,
        "Timestamp": "2018-07-03T16:04:27.263625156+02:00",
        "Columns": {
            "a": 369223,
            "b": "a"
        }
    }
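For reference, a Go struct that such an event might unmarshal into could look roughly like the sketch below. The field names mirror the JSON above, but the struct itself and the exact field types are assumptions, not the type actually used by psql-streamer.

```go
package event

import "time"

// Event is a sketch of the structure implied by the JSON example above;
// psql-streamer's actual event type may differ.
type Event struct {
	Host        string                 // source host identifier, e.g. "db1"
	Database    string                 // database the change came from
	Table       string                 // table the change came from
	Action      string                 // "insert", "update", "delete", ...
	WALPosition uint64                 // WAL position (LSN) of the change
	Timestamp   time.Time              // when the change was decoded
	Columns     map[string]interface{} // column name -> value
}
```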
Features
- You can configure a sink to receive events from several different sources (e.g. several Kafka clusters, PostgreSQL databases)
- Basic HTTP API
- Custom sources/sinks can be easily added - they only need to implement a simple interface (see the sketch after this list)
- Requires PostgreSQL 10+ (logical replication first appeared in 9.4, but that initial implementation lacks features needed here)
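The sketch below only illustrates the general shape such a sink interface might take; the method names and signatures are assumptions, and the real interfaces are defined inside psql-streamer.

```go
// Hypothetical sketch of a "simple interface" a custom sink might implement.
package sinks

import "context"

// Event stands in for psql-streamer's event type (see the example above).
type Event struct{}

// Sink is an illustrative contract; the actual interface may differ.
type Sink interface {
	// ProcessEventsBatch delivers a batch of events; returning an error
	// means the batch was not acknowledged and may be retried.
	ProcessEventsBatch(ctx context.Context, events []*Event) error
	// Status reports whether the sink is healthy (surfaced via /status).
	Status() error
	// Close releases any resources held by the sink.
	Close() error
}
```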
Sources
PostgreSQL
Receives database events from PostgreSQL using the logical replication protocol and transforms them into a common event format.
Features
- WAL log position persistence: the service persists each log position update it receives in a BoltDB database (see the sketch after this list).
- Configurable WAL log retention on the PostgreSQL side. To allow rewinding (in case some events need to be replayed), the service delays confirmation of applied logs, which makes PostgreSQL retain the logs for the specified replication slot for some time. See the `walRetain` parameter in the configuration file.
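As a rough illustration of the persistence part, the snippet below stores the last confirmed WAL position per replication slot in BoltDB. The bucket name, key layout and slot name are made up for this example; psql-streamer's actual storage schema may differ.

```go
package main

import (
	"encoding/binary"
	"log"

	bolt "go.etcd.io/bbolt"
)

// saveWALPosition persists the last confirmed WAL position for a replication
// slot; the bucket/key layout here is purely illustrative.
func saveWALPosition(db *bolt.DB, slot string, pos uint64) error {
	return db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("wal_positions"))
		if err != nil {
			return err
		}
		buf := make([]byte, 8)
		binary.BigEndian.PutUint64(buf, pos)
		return b.Put([]byte(slot), buf)
	})
}

func main() {
	db, err := bolt.Open("positions.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := saveWALPosition(db, "psql_streamer_slot", 418152976); err != nil {
		log.Fatal(err)
	}
}
```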
PostgreSQL configuration
- In `postgresql.conf` you need to set `wal_level = logical` to make logical replication possible. You may also need to adjust `max_wal_senders` and `max_replication_slots` to match your requirements.
- Create a publication in PostgreSQL like this: `CREATE PUBLICATION pub1 FOR ALL TABLES`. This will include in the publication all existing tables and also any tables created in the future. If you want only a subset of tables to be replicated, list them explicitly. See the PostgreSQL documentation for details (a sketch of these steps follows after this list).
- Specify the publication name in psql-streamer.toml
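If you prefer to script the setup, the hedged sketch below checks `wal_level` and creates the publication from Go. The DSN is a placeholder, and `pub1` is just the publication name from the example above.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	// Placeholder DSN; point it at your own database.
	db, err := sql.Open("postgres", "postgres://user:pass@db1/test?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// wal_level must be "logical" for logical replication to work.
	var walLevel string
	if err := db.QueryRow("SHOW wal_level").Scan(&walLevel); err != nil {
		log.Fatal(err)
	}
	if walLevel != "logical" {
		log.Fatalf("wal_level is %q, expected \"logical\"", walLevel)
	}

	// Replicate all tables; list tables explicitly to replicate a subset.
	if _, err := db.Exec("CREATE PUBLICATION pub1 FOR ALL TABLES"); err != nil {
		log.Fatal(err)
	}
	fmt.Println("publication pub1 created")
}
```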
Kafka
Receives events from one or more Kafka topics which were generated by, e.g., another psql-streamer instance. Expects messages in JSON format conforming to the predefined event structure (see the Event example above).
Features
The Kafka source works in batching mode with confirmation: a batch read from Kafka is committed only if all events in it are ACKed by the sinks; otherwise nothing is committed. The batch elements are sent concurrently using goroutines, so the batch should not be made very large. If several topics are specified, they are consumed in separate goroutines, which provides more parallelism.
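The snippet below is a rough illustration of that commit-all-or-nothing pattern, not psql-streamer's actual code; the send and commit callbacks are placeholders.

```go
// Sketch of the commit-all-or-nothing batching described above.
package source

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Event stands in for psql-streamer's event type.
type Event struct{}

// processBatch sends every event of a Kafka batch to the sinks concurrently
// and commits the batch only if all of them were ACKed.
func processBatch(ctx context.Context, events []*Event,
	send func(context.Context, *Event) error, commit func() error) error {
	g, ctx := errgroup.WithContext(ctx)
	// Every event in the batch gets its own goroutine, which is why very
	// large batches are discouraged.
	for _, ev := range events {
		ev := ev // capture the loop variable for the goroutine below
		g.Go(func() error { return send(ctx, ev) })
	}
	if err := g.Wait(); err != nil {
		return err // nothing committed; the batch will be consumed again
	}
	return commit()
}
```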
Sinks
Kafka
Sends events into a Kafka topic based on several configurable rules such as:
- Table -> Topic mapping
- Fallback topic for messages with no mapping defined (see the lookup sketch after this list)
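A topic lookup with a fallback can be as simple as the hypothetical helper below; the function name and signature are not taken from psql-streamer.

```go
package sink

// resolveTopic picks the Kafka topic for a table, using the fallback topic
// when no explicit mapping is configured.
func resolveTopic(table string, mapping map[string]string, fallback string) string {
	if topic, ok := mapping[table]; ok {
		return topic
	}
	return fallback
}
```

For example, with the mapping `{"test": "events_test"}` and the fallback `events_default`, events from the table `test` go to `events_test` while events from any unmapped table go to `events_default`.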
Kafka sink handlers
Incoming events are processed by one or more handlers that generate Kafka messages. They can be used to enrich events with additional data or encode them in a different format.
Currently there's only one handler:
- passthrough: simply marshals an event into JSON and sends it to Kafka (sketched below)
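Conceptually, the passthrough handler boils down to something like the sketch below; the Event type and the choice of message key are illustrative assumptions, not psql-streamer's implementation.

```go
package handler

import "encoding/json"

// Event stands in for psql-streamer's event type (see the example above).
type Event struct {
	Host     string
	Database string
	Table    string
	Action   string
	Columns  map[string]interface{}
}

// passthrough returns the Kafka message key and value for an event.
func passthrough(ev *Event) (key, value []byte, err error) {
	value, err = json.Marshal(ev)
	if err != nil {
		return nil, nil, err
	}
	// Keying by table keeps each table's events in one partition; the real
	// handler may choose the key differently.
	return []byte(ev.Table), value, nil
}
```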
Stub
The Stub sink is a discard/no-op sink that can be used for testing.
HTTP API
All requests are GET.
- /health is a no-op that just returns 200
- /status checks the status of all sinks/sources and outputs them. If something is not OK, it returns 500 instead of 200. Useful for monitoring (see the example after this list).
- /stats returns statistics from all sinks/sources
- /debug/[on|off] enables or disables debug mode for all sinks/sources
- /metrics exposes Prometheus metrics
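A minimal monitoring probe against /status could look like this; the listen address and port are placeholders that depend on your configuration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Placeholder address; use the one psql-streamer actually listens on.
	resp, err := http.Get("http://localhost:9090/status")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// /status returns 200 when all sources and sinks are healthy, 500 otherwise.
	if resp.StatusCode != http.StatusOK {
		log.Fatalf("psql-streamer unhealthy: HTTP %d", resp.StatusCode)
	}
	fmt.Println("psql-streamer healthy")
}
```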
Configuration
See psql-streamer.toml for detailed instructions.
Download
Get RPM or DEB packages from the Releases page
Build
psql-streamer is written in Go, so you need to install Go first.
Then you can run `make build` to build a binary.
If you want to build RPM/DEB packages, you need to install FPM and then run `make rpm` or `make deb`.