<h1 style="display: flex; align-items: center;"> <img width="50" alt="duck under dolphin" style="margin-right: 0.2em" src="logo/myduck-logo.png"> <span>MyDuck Server</span> </h1>

MyDuck Server unlocks serious power for your MySQL & Postgres analytics. Imagine the simplicity of (MySQL|Postgres)’s familiar interface fused with the raw analytical speed of DuckDB. Now you can supercharge your analytical queries with DuckDB’s lightning-fast OLAP engine, all while using the tools and dialect you know.

<h1 style="display: flex; align-items: center;"> <img alt="duck under dolphin" style="margin-right: 0.2em" src="logo/MyDuck.svg"> </h1>

❓ Why MyDuck ❓

While MySQL and Postgres are the most popular open-source databases for OLTP, their performance in analytics often falls short. DuckDB, on the other hand, is built for fast, embedded analytical processing. MyDuck Server lets you enjoy DuckDB's high-speed analytics without leaving the (MySQL|Postgres) ecosystem.

With MyDuck Server, you can:

MyDuck Server isn't here to replace MySQL & Postgres; it's here to help MySQL & Postgres users do more with their data. This open-source project provides a convenient way to integrate high-speed analytics into your workflow while embracing the flexibility and efficiency of DuckDB.

✨ Key Features

📊 Performance

Typical OLAP queries can run up to 1000x faster with MyDuck Server compared to MySQL & Postgres alone, especially on large datasets. Under the hood, it's just DuckDB doing what it does best: processing analytical queries at lightning speed. You are welcome to run your own benchmarks and prepare to be amazed! Alternatively, you can refer to well-known benchmarks like the ClickBench and H2O.ai db-benchmark to see how DuckDB performs against other databases and data science tools. Also remember that DuckDB has robust support for transactions, JOINs, and larger-than-memory query processing, which are unavailable in many competing systems and tools.

πŸƒβ€β™‚οΈ Getting Started

Prerequisites

Installation

Get a standalone MyDuck Server up and running in minutes using Docker:

docker run -p 13306:3306 -p 15432:5432 apecloud/myduckserver:latest

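This setup exposes port 13306 for MySQL wire protocol connections and port 15432 for PostgreSQL wire protocol connections.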

Usage

Connecting via MySQL client

Connect using any MySQL client to run MySQL-style SQL queries:

mysql -h127.0.0.1 -P13306 -uroot

[!NOTE] MySQL CLI clients version 9.0 and above are not yet supported on macOS. Consider brew install mysql-client@8.4.

Connecting via PostgreSQL client

For full analytical power, connect to the Postgres port and run DuckDB SQL queries directly:

psql -h 127.0.0.1 -p 15432 -U postgres

Replicating Data

We have integrated a setup tool in the Docker image that helps replicate data from your primary (MySQL|Postgres) server to MyDuck Server. The tool is available via the SETUP_MODE environment variable. In REPLICA mode, the container will start MyDuck Server, dump a snapshot of your primary (MySQL|Postgres) server, and start replicating data in real-time.

[!NOTE] Supported primary database versions: MySQL>=8.0 and PostgreSQL>=13. In addition to the default settings, logical replication must be enabled for PostgreSQL by setting wal_level=logical. For MySQL, GTID-based replication (gtid_mode=ON and enforce_gtid_consistency=ON) is recommended but not required.
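The settings named in the note can be checked on the primary before starting a replica. The following is a minimal sketch, not part of the MyDuck tooling: `setting_is` is a hypothetical helper, and the host and user placeholders in the usage comments must be replaced with your own.

```shell
# Hypothetical helper: succeed only if the first line on stdin equals "$1".
setting_is() {
  read -r actual || true
  [ "$actual" = "$1" ]
}

# PostgreSQL primary: logical replication requires wal_level=logical.
#   psql -h <host> -U <user> -Atc 'SHOW wal_level' | setting_is logical \
#     || echo "wal_level is not 'logical'" >&2
#
# MySQL primary: GTID mode is recommended but not required.
#   mysql -h <host> -u <user> -p -Nse 'SELECT @@gtid_mode' | setting_is ON \
#     || echo "gtid_mode is not ON" >&2
```

With the prerequisites in place, the replica container can be started: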

docker run -d --name myduck \
  -p 13306:3306 \
  -p 15432:5432 \
  --env=SETUP_MODE=REPLICA \
  --env=SOURCE_DSN="<postgres|mysql>://<user>:<password>@<host>:<port>/<dbname>" \
  apecloud/myduckserver:latest

SOURCE_DSN specifies the connection string to the primary database server, which can be either MySQL or PostgreSQL.
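As an illustration, the DSN can be assembled from its parts in the shell before it is handed to the container. Every value below (user, password, host, port, database name) is a made-up placeholder, and the `SRC_*` variable names are only a convention for this sketch.

```shell
# Hypothetical connection details; replace with your own.
SRC_SCHEME=postgres   # or "mysql"
SRC_USER=repl_user
SRC_PASSWORD=secret
SRC_HOST=primary.example.com
SRC_PORT=5432
SRC_DB=appdb

# Assemble <scheme>://<user>:<password>@<host>:<port>/<dbname>.
SOURCE_DSN="${SRC_SCHEME}://${SRC_USER}:${SRC_PASSWORD}@${SRC_HOST}:${SRC_PORT}/${SRC_DB}"
echo "$SOURCE_DSN"

# Then pass it to the container, for example:
#   docker run -d --name myduck -p 13306:3306 -p 15432:5432 \
#     --env=SETUP_MODE=REPLICA --env=SOURCE_DSN="$SOURCE_DSN" \
#     apecloud/myduckserver:latest
```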

[!NOTE] To replicate from a server running on the host machine, use host.docker.internal as the hostname instead of localhost or 127.0.0.1. On Linux, you must also add --add-host=host.docker.internal:host-gateway to the docker run command.
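A script that has to work on both Linux and macOS can add the extra flag conditionally. This is a small sketch; `host_gateway_flag` is a hypothetical helper name, not something shipped with the image.

```shell
# Print the extra `docker run` flag needed on Linux; print nothing elsewhere.
# The OS name defaults to `uname -s` but can be passed in explicitly.
host_gateway_flag() {
  os="${1:-$(uname -s)}"
  if [ "$os" = "Linux" ]; then
    echo "--add-host=host.docker.internal:host-gateway"
  fi
}

# Usage: the substitution is left unquoted on purpose so that an empty
# result adds no argument at all.
#   docker run -d --name myduck $(host_gateway_flag) \
#     -p 13306:3306 -p 15432:5432 \
#     --env=SETUP_MODE=REPLICA \
#     --env=SOURCE_DSN="mysql://<user>:<password>@host.docker.internal:3306/<dbname>" \
#     apecloud/myduckserver:latest
```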

Connecting to Cloud MySQL & Postgres

MyDuck Server supports setting up replicas from common cloud-based MySQL & Postgres offerings. For more information, please refer to the replica setup guide.

HTAP Setup

With MyDuck's powerful analytics capabilities, you can create a hybrid transactional/analytical processing (HTAP) system where high-frequency data writes are directed to a standard MySQL or Postgres instance, while analytical queries are handled by a MyDuck Server instance. Follow our HTAP setup instructions to easily set up an HTAP demonstration.

Customizing the Docker Container

To rename the default database, pass the DEFAULT_DB environment variable to the Docker container:

docker run -d -p 13306:3306 -p 15432:5432 \
    --env=DEFAULT_DB=mydbname \
    apecloud/myduckserver:latest

To set the superuser password, pass the SUPERUSER_PASSWORD environment variable to the Docker container:

docker run -d -p 13306:3306 -p 15432:5432 \
    --env=SUPERUSER_PASSWORD=mysecretpassword \
    apecloud/myduckserver:latest
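The two variables can presumably be combined in a single run, although the text above only shows them separately. The sketch below assumes that, and also relies on standard Docker behavior: naming a variable in `--env` without a value copies it from the calling environment, so the password need not be typed into the command line. `myduck_docker_args` is a hypothetical helper that only prints the arguments.

```shell
# Print the `docker` arguments for a customized container.
# --env=SUPERUSER_PASSWORD (no value) tells Docker to read the variable
# from the environment of the shell that invokes `docker run`.
myduck_docker_args() {
  db_name="$1"
  echo "run -d -p 13306:3306 -p 15432:5432" \
       "--env=DEFAULT_DB=${db_name}" \
       "--env=SUPERUSER_PASSWORD" \
       "apecloud/myduckserver:latest"
}

# Usage (mydbname and the password are placeholders; the database name
# must not contain spaces because the output is word-split):
#   export SUPERUSER_PASSWORD='mysecretpassword'
#   docker $(myduck_docker_args mydbname)
```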

To initialize MyDuck Server with custom SQL statements, mount your .sql file to either /docker-entrypoint-initdb.d/mysql/ or /docker-entrypoint-initdb.d/postgres/ inside the Docker container, depending on the SQL dialect you're using.

For example:

# Execute `init.sql` via MySQL protocol
docker run -d -p 13306:3306 --name=myduck \
    -v ./init.sql:/docker-entrypoint-initdb.d/mysql/init.sql \
    apecloud/myduckserver:latest

# Execute `init.sql` via PostgreSQL protocol
docker run -d -p 15432:5432 --name=myduck \
    -v ./init.sql:/docker-entrypoint-initdb.d/postgres/init.sql \
    apecloud/myduckserver:latest
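For a concrete starting point, an `init.sql` for the MySQL path might look like the one written below. The `demo` database and `events` table are hypothetical, and the statements are assumed (not verified here) to be accepted by MyDuck Server's MySQL dialect.

```shell
# Write a small MySQL-dialect init script into the current directory.
cat > init.sql <<'SQL'
CREATE DATABASE IF NOT EXISTS demo;
CREATE TABLE IF NOT EXISTS demo.events (
    id   INT PRIMARY KEY,
    name VARCHAR(64)
);
INSERT INTO demo.events VALUES (1, 'hello'), (2, 'world');
SQL

# Mount it using an absolute path, which every Docker version accepts:
#   docker run -d -p 13306:3306 --name=myduck \
#     -v "$PWD/init.sql":/docker-entrypoint-initdb.d/mysql/init.sql \
#     apecloud/myduckserver:latest
#
# Then check the result from a MySQL client:
#   mysql -h127.0.0.1 -P13306 -uroot -e 'SELECT * FROM demo.events'
```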

Query Parquet Files

Looking to load Parquet files into MyDuck Server and start querying? Follow our Parquet file loading guide for easy setup.

Already Using DuckDB?

Already have a DuckDB file? You can seamlessly bootstrap MyDuck Server with it. See our DuckDB file bootstrapping guide for more details.

Backup and Restore with Object Storage

To back up and restore your MyDuck Server database using object storage, refer to our backup and restore guide for detailed instructions.

LLM Integration

MyDuck Server can be integrated with LLM applications via the Model Context Protocol (MCP). Follow the MCP integration guide to set up MyDuck Server as an external data source for LLMs.

Access from Python

MyDuck Server can be seamlessly accessed from the Python data science ecosystem. Follow the Python integration guide to connect to MyDuck Server from Python and export data to PyArrow, pandas, and Polars. Additionally, check out the Ibis integration guide for using the Ibis dataframe API to query MyDuck Server directly.

🎯 Roadmap

We have big plans for MyDuck Server! Here are some of the features we’re working on:

🏑 Join the Community

Let's connect on Discord to discuss requirements, address issues, and share user experiences.

💡 Contributing

MyDuck Server is open-source, and we’d love your help to keep it growing! Check out our CONTRIBUTING.md for ways to get involved. From bug reports to feature requests, all contributions are welcome!

💗 Acknowledgements

MyDuck Server is built on top of a collection of amazing open-source projects, notably:

We are grateful to the developers and contributors of these projects for their hard work and dedication to open-source software.

πŸ“ License

MyDuck Server is released under the Apache License 2.0.