<!--lint disable awesome-git-repo-age-->Awesome DuckDB <!-- omit in toc -->
A curated list of awesome DuckDB libraries, tools and resources.
DuckDB is an analytical in-process SQL database management system.
DuckDB 1.1.0 was released on 2024-09-09: see the announcement blog post.
Chat with this page
You can chat with this page's content on HuggingChat.
<!-- omit in toc -->Contents
- Chat with this page
- Resources
- Client APIs
- Tools Powered by DuckDB
- Libraries Powered by DuckDB
- SQL Clients and IDE that Support DuckDB
- Projects Powered by DuckDB
- Integrations
- Extensions
- Media
- Contribute
Resources
- Official Documentation - Official DuckDB documentation.
- Official Blog - Official DuckDB blog.
- DuckDB Clients - Client APIs for DuckDB.
- DuckDB Documentation PDF - The DuckDB documentation as a single PDF file.
- docker-duckdb - Docker image for DuckDB CLI.
- DuckDB setup - GitHub Action to install DuckDB in CI.
- Serverless DuckDB over S3 - Running DuckDB over a data lake on S3 using AWS Lambda.
- DuckDB snippets - Collection of snippets curated by MotherDuck.
- DuckDB tldr page - DuckDB's entry in tldr pages, available in the CLI via the `tldr duckdb` command.
- DuckDB AWS Lambda layer - Run DuckDB in AWS Lambda functions.
- Compatible DuckDB Extensions for AWS Lambda - Extensions specifically compiled for the AWS Lambda runtime (GLIBC 2.26).
- Serverless DuckDB as API - Use DuckDB as API with Amazon API Gateway and AWS Lambda.
- Serverless Parquet Repartitioner - Use DuckDB to repartition data in S3-based Data Lakes.
- Observable notebooks - Notebooks using DuckDB on the Observable data visualization platform.
- duckdb-nf - Example uses of DuckDB with Nextflow.
- DuckDB version manager (`duckman`) - Cross-platform installer and version manager for DuckDB.
- DuckERD CLI - A CLI tool to create an ER Diagram from DuckDB database files.
Client APIs
- C
- C++
- CLI
- Go
- Julia
- Node.js
- Python
- R
- Ruby
- Rust
- Swift
- TypeScript
- Wasm
- ADBC
- ODBC
- .NET
- Common Lisp
- PowerShell
- Dart
Tools Powered by DuckDB
- Rill Developer - Tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL.
- Ibis Project - A DataFrame API for interacting with DuckDB (and other compute engines).
- MotherDuck - Serverless data warehouse powered by DuckDB.
- Boiling Data - Serverless data analytics overlay on top of S3 Data Lakes.
- Hex Dataframe SQL - Hex's Dataframe SQL cells are powered by DuckDB.
- Mode - Mode uses DuckDB for their in-memory data engine.
- VulcanSQL - DuckDB can be used as a caching layer or a data connector in VulcanSQL, a Data API framework for data folks to create REST APIs by writing SQL templates.
- Tad - A fast, free, cross-platform tabular data viewer application powered by DuckDB.
- Honeycomb Maps - A browser-based geospatial analysis tool leveraging DuckDB Wasm.
- Bauplan - A serverless data transformation platform for data lakes.
- Malloy - Malloy is an experimental language for describing data relationships and transformations. Malloy connects to BigQuery, Snowflake, Trino, and Postgres, and natively supports DuckDB.
- Evidence - Generate reports using SQL and markdown. The DuckDB connector allows querying across DuckDB, csv, parquet and json.
- Latitude - Latitude uses DuckDB to power data snapshots. Drop a CSV file and query it with SQL at the speed of light.
- Census - Census's dataset diffing for incremental syncs is powered by DuckDB.
- Huey - Blazing-fast & intuitive pivot tables on .parquet, .csv, .json files and .duckdb tables in the browser based on DuckDB WASM. Open source (MIT). Zero install!
- Parquet Explorer - Visual Studio Code extension for exploring Parquet files with SQL, powered by DuckDB.
- DQOps - Data quality platform for data engineers, data quality teams and data operations.
- DatalakeStudio - Load, explore, transform your datasets and expose them via API. Integration with external APIs, S3, PostgreSQL and ChatGPT.
- Spice.ai - A unified SQL query interface and portable runtime to locally materialize (using an embedded DuckDB), accelerate, and query datasets from any database, data warehouse, or data lake.
- Definite - Definite pulls all your data into a single place for analytics and dashboards. No engineering or SQL required. Get a managed data warehouse (DuckDB), ELT, data modeling / transformations and BI in a single platform.
- Amphi ETL - Low-code data pipelines for structured and unstructured data. SQL transformations are powered by DuckDB.
- Quackpipe - Serverless OLAP API/UI built on top of DuckDB with basic ClickHouse API compatibility and MotherDuck support.
- ParadeDB - Postgres for Search and Analytics, powered by DuckDB-embedded-in-Postgres.
- Crunchy Bridge for Analytics - Fully managed DBaaS based on Postgres and integrated with DuckDB.
- UniverSQL - An implementation of the Snowflake API that enables running queries on Snowflake tables locally with DuckDB, without a running warehouse.
- Whereabouts - Fast, accurate, open-source geocoding in Python, using DuckDB.
- Phoenix Analytics - Plug and play analytics for Phoenix applications, powered by DuckDB.
- sqlglot - Python transpiler that translates between 23 different SQL dialects including DuckDB.
- yato - The smallest DuckDB SQL orchestrator on Earth.
Web Clients
- Online DuckDB Shell - Online DuckDB shell powered by WebAssembly.
- SQL Workbench - DuckDB-WASM based SQL Workbench for running queries on local or remote data, being able to show data as tables or visually as graphs, and sharing queries via URLs.
- Sekuel Playground - Query your local parquet, csv, json. Your data will not be sent out of the device you are using.
- CSVFiddle - Free tool to explore and share insights from CSV files using SQL. Import data, write SQL, then instantly share it with anyone.
- Codapi - Embed executable code snippets directly into your product documentation, online course or blog post.
- QuackDB - Open-source online DuckDB SQL playground and editor.
- WhatTheDuck - WhatTheDuck is an open-source web application built on DuckDB. It allows users to upload CSV files, store them in tables, and perform SQL queries on the data.
Backends
- DuckDB API - A TypeScript-based Docker image containing DuckDB and a Hono framework REST API with JSON or streaming Arrow responses.
- Mosaic DuckDB Server - A Python-based server that runs a local DuckDB instance and supports queries over Web Sockets or HTTP, returning data in either Apache Arrow or JSON format.
- duckdb-server - A Rust-based server that runs a local DuckDB instance and supports queries over Web Sockets or HTTP/HTTPS, returning data in either Apache Arrow or JSON format.
Libraries Powered by DuckDB
- Mosaic - An extensible framework for linking databases and interactive views.
- Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
- Splink - A free Python library for fast, accurate data deduplication and record linkage.
- Simple-data-analysis - Easy-to-use and high-performance JavaScript library for data analysis.
- pg_analytics - PostgreSQL extension embedding DuckDB-in-Postgres for fast on-disk and remote object storage analytics from Postgres. Built as a Foreign Data Wrapper with full query pushdown to DuckDB.
- duckdb_fdw - DuckDB Foreign Data Wrapper for PostgreSQL.
- @jetblack/duckdb-react - A context manager for React and duckdb-wasm.
- QuackOSM - A Python library for downloading and transforming raw OpenStreetMap data into GeoParquet files.
- PyGWalker - A Python library that turns your dataframe into an interactive UI for data visualization.
- flapi (https://github.com/DataZooDE/flapi) - API framework that relies heavily on DuckDB and DuckDB extensions. Build performant and cost-efficient APIs on top of BigQuery or Snowflake for AI agents and data apps.
SQL Clients and IDE that Support DuckDB
- Harlequin - The DuckDB IDE for your terminal. (GitHub).
- qStudio - A free SQL tool specialized for data analysts. It runs on every operating system and allows easy browsing of tables and charting of results.
- DuckDB SQL Tools - Free DuckDB SQL Tools for VS Code IDE. Premium version available with advanced features.
- VSCode SQLTools (Free) - Free open-source VSCode extension to query and explore your DuckDB databases with latest DuckDB support.
- DBeaver - DBeaver is a universal database access and development tool that can be used to connect almost any type of database.
- DataGrip - Paid SQL IDE by JetBrains that supports many different database technologies, including DuckDB.
- Duckling - A fast viewer for CSV/Parquet files and DuckDB/SQLite, based on Tauri.
- rsql - CLI for DuckDB, LibSQL, MariaDB, MySQL, PostgreSQL, SQLite3 and SQL Server.
- jsqltranspiler - Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL.
- jOOQ - Type safe querying of DuckDB (and many other RDBMS) from Java. A transpiler from and to DuckDB is also available.
- SQL DATA LENS - A lightweight, commercial SQL IDE that supports different DBMS, including DuckDB. It focuses on performance and DBMS-specific features.
- Dataflare - Simple easy-to-use database manager, supports DuckDB, PostgreSQL, MySQL, SQL Server, SQLite etc.
- manifold-sql (DuckDB for Java) - Use native DuckDB SQL of any complexity directly & type-safely in Java source with comprehensive IntelliJ support.
Projects Powered by DuckDB
- NBA Monte Carlo - Monte Carlo simulation of the NBA season, leveraging Meltano, dbt, DuckDB and Evidence.
- Datadex - Open source and local friendly data platform to collaborate on Open Data using DuckDB, Dagster, dbt, and Quarto.
- `endoflife.date` database - Daily dumps of endoflife.date data.
- `transfermarkt-datasets` - Curated football datasets from Transfermarkt.
- duckDB-embedding-search - A search engine for DuckDB that uses embedding vectors to find similar documents.
- DuckDB PyPI stats live dashboard (GitHub repository) - Live dashboard of PyPI downloads using DuckDB, dbt, Evidence and MotherDuck with code source to build your own.
- Specter - Specter is a CLI tool to search and monitor Databricks audit logs.
Integrations
- dbt-duckdb - DuckDB dbt adapter.
- data load tool - DuckDB destination - Extract and load data from APIs to DuckDB using dlt.
- target-duckdb - Load data to DuckDB based on Singer spec.
- Airbyte DuckDB destination - Load data to DuckDB with Airbyte.
- Kestra DuckDB plugin - Run queries with DuckDB to schedule data transformations and process automations, and run event-driven anomaly detection pipelines.
- SQLFlite - Arrow Flight SQL Server - An example implementation of the Arrow Flight SQL protocol that runs in a client-server setup with DuckDB or SQLite as backends.
- SQLFlow - Enables SQL-based stream-processing, powered by DuckDB.
- nf-sqldb - This plugin provides support for interacting with SQL databases in Nextflow scripts.
- MindsDB - The platform for customizing AI from enterprise data. MindsDB integrates with DuckDB, making data from DuckDB accessible to a diverse range of AI/ML models.
- Sqlite2Duckdb - A CLI tool to convert a SQLite database to DuckDB.
- nodbi - NoSQL Database Connector for R, providing a common API across Elasticsearch, CouchDB, MongoDB, SQLite, PostgreSQL, and DuckDB.
- duckplyr - Drop-in replacement for dplyr in R that uses DuckDB for performance.
- kwack - In-Memory Analytics for Kafka using DuckDB.
- PSDuckDB - A PowerShell module for DuckDB integration.
- duckdb-tableau-connector - DuckDB Tableau connector.
- duckdb-power-query-connector - DuckDB Power Query Custom Connector.
- metabase_duckdb_driver - Metabase DuckDB Driver shipped as 3rd party plugin.
Extensions
Official Extensions
Official DuckDB extensions, which can be installed via `INSTALL ⟨extension_name⟩`.
- `arrow` - A zero-copy data integration between Apache Arrow and DuckDB.
- `aws` - For handling AWS credentials.
- `azure` - For using Azure Blob Storage.
- `delta` - For Delta Lake support.
- `fts` - To support full text search.
- `iceberg` - For reading Iceberg tables.
- `inet` - For storing and handling IPv4 and IPv6 Internet addresses.
- `mysql` - To read from and write to MySQL databases.
- `postgres` - To read from and write to PostgreSQL databases.
- `spatial` - Enables geospatial processing.
- `sqlite` - To read from and write to SQLite databases.
- `vss` - Adds support for vector similarity search.
Community Extensions
Community-contributed DuckDB extensions, which can be installed via `INSTALL ⟨extension_name⟩ FROM community`.
- `chsql` - ClickHouse SQL dialect macros for DuckDB.
- `crypto` - Cryptographic hash functions and HMAC.
- `duckpgq` - Extension for graph workloads that supports the SQL/PGQ standard.
- `evalexpr_rhai` - Evaluates the Rhai scripting language as part of SQL.
- `fuzzycomplete` - Performs fuzzy string matching for autocompletion.
- `h3` - Adds support for the H3 discrete global grid system.
- `lindel` - Linearization/delinearization with Z-order, Hilbert and Morton curves.
- `prql` - Run PRQL commands directly within DuckDB.
- `scrooge` - A set of aggregation functions and data scanners on financial data.
- `shellfs` - Allows shell commands to be used for input and output.
- `ulid` - ULID data type for DuckDB. A ULID is similar to a UUID except that it also contains a timestamp component.
- `gsheets` - Read and write Google Sheets using SQL.
- `httpserver` - DuckDB HTTP API Server and Query Interface.
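
The two `INSTALL` forms quoted in this section can be sketched as a small Python helper. This is a minimal illustration, not part of DuckDB: the function name `install_statements` is hypothetical, it only formats SQL text, and it never contacts DuckDB or the network. The `LOAD` statement is not mentioned in this list; it is assumed here as the usual follow-up that makes an installed extension available in the current session.

```python
import re

# Extension names in this list are plain identifiers such as "spatial" or "h3".
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def install_statements(name: str, community: bool = False) -> list[str]:
    """Build the SQL text to install and then load one DuckDB extension.

    Hypothetical helper: it only returns strings and does not run them.
    """
    if not _NAME.fullmatch(name):
        raise ValueError(f"not a valid extension name: {name!r}")
    # Community extensions use the FROM community suffix quoted above.
    source = " FROM community" if community else ""
    return [f"INSTALL {name}{source};", f"LOAD {name};"]


if __name__ == "__main__":
    # An official extension from the list above.
    print("\n".join(install_statements("spatial")))
    # A community extension from the list above.
    print("\n".join(install_statements("h3", community=True)))
```

The resulting statements can be pasted into the DuckDB CLI or passed to any of the client APIs listed earlier.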
Other Extensions
- DuckDB Extension Radar - Repository that contains DuckDB extensions on GitHub. Refreshed daily.
- duckdb-bigquery - Enables seamless integration and querying of BigQuery datasets within DuckDB.
- duckdb-engine - SQLAlchemy driver for DuckDB.
- duckdb-extension-template-zig - A Zig & Nix toolkit template for building extensions against multiple versions of DuckDB using Zig, C or C++.
- duckdb-jfr-extension - DuckDB extension to read JFR (Java Flight Recorder) files directly.
- duckdb-protobuf - Plugin for querying encoded protobuf messages (both sequences and individual messages per file).
- duckdb-pytables - DuckDB extension to allow running SQL on arbitrary data sources.
- ERPL - DuckDB SAP connector using RFC, ODP, or BICS.
- Kùzu - Scan DuckDB tables in Kùzu, an embeddable property graph database management system.
- Lance - Integrate Lance (modern columnar data format for ML implemented in Rust) with DuckDB.
- ODBC Scanner DuckDB Extension - DuckDB extension to read data directly from databases supporting the ODBC interface.
- QDuckDB - Plugin for reading DuckDB spatial tables in QGIS software.
- `uc_catalog` - Proof-of-concept extension combining the `delta` extension with Unity Catalog.
- duckdb-flockmtl - Integrate language model (LLM) capabilities directly into your queries and workflows.
- erpl-web - A DuckDB extension that connects API-based ecosystems via standard interfaces like OData, GraphQL, and REST.
Media
Talks
- DuckDB: Crunching data anywhere from laptops to servers @ GOTO Amsterdam 2024 - Gábor Szárnyas.
- DuckDB – Overview and latest developments @ DuckCon #5 - Hannes Mühleisen and Mark Raasveldt.
- DuckCon #5 playlist
- DuckCon #4 playlist
- DuckCon #3 playlist
- In-Process Analytical Data Management with DuckDB @ PyData Amsterdam - Hannes Mühleisen.
- DuckDB: The Power of a Data Warehouse in your Python Process @ PyData Yerevan - Gábor Szárnyas.
- DuckDB: Bringing analytical SQL directly to your Python shell @ EuroPython - Pedro Holanda.
- DuckDB keynote @ Data + AI Summit 2023 - Hannes Mühleisen.
- DuckDB: Bringing Analytical SQL Directly To Your Python Shell @ FOSDEM - Pedro Holanda.
- State of the Duck @ DuckCon #2 - Hannes Mühleisen & Mark Raasveldt.
- DuckDB Extensions @ DuckCon - Pedro Holanda & Sam Ansmink.
- Developing Systems in Academia: The Good, the Bad, and the not-so-Ugly Duckling @ CIDR - Hannes Mühleisen.
- DuckDB An Embeddable Analytical Database @ FOSDEM - Hannes Mühleisen.
- DuckDB tutorials playlist by Learn Data with Mark - Mark Needham.
- DuckDB tutorials playlist by MotherDuck - Mehdi Ouazza.
- Nextflow and database uses: powering data engineering, exploring DuckDB, and beyond - Edmund Miller.
- Why should you care about DuckDB? @ Dublin DuckDB meetup - Mihai Bojin.
- Exploring Monte Carlo Simulations With DuckDB @ Dublin DuckDB meetup - James McNeill.
- DuckDB and recommenders : a lightning fast synergy @ Dublin DuckDB meetup - Khalil Muhammad.
Podcasts
- Developer Voices: Implementing Hardware-Friendly Databases - Hannes Mühleisen.
- The Geek Narrator: DuckDB Internals - Mark Raasveldt.
- Software Engineering Daily: DuckDB - Hannes Mühleisen.
- Data Engineering Podcast: Move Your Database To The Data And Speed Up Your Analytics With DuckDB - Hannes Mühleisen.
- The Analytics Engineering Podcast: The Personal Data Warehouse - Jordan Tigani.
Blog Posts
- Modern Data Stack in a Box - Fast, free, and open-source Modern Data Stack deployed on a laptop using the combination of DuckDB, Meltano, dbt, and Apache Superset.
- How to use DuckDB, Motherduck and Kestra for ETL - How DuckDB can transform data, mask sensitive PII information, detect anomalies in event-driven workflows, and streamline reporting use cases.
- DuckDB vs. MotherDuck — how do they compare - What are key differences between them, and when to choose each of these options.
- Building DuckDB Extensions with Zig and Nix - For Nix users and Zig developers familiar with DuckDB looking to extend its capabilities with custom extensions.
- Exploring StarCraft 2 data with Airflow, DuckDB and Streamlit - Example project using DuckDB to persist API data, but also explains how to use DuckDB as a versatile data manipulation tool in data wrangling scripts.
- DuckDB: The Rising Star in the Big Data Landscape
- How to Make a DuckDB Extension for a Table Function? - How to make a DuckDB extension to fetch data from external sources.
- Putting DuckDB in Postgres to Query Iceberg - How ParadeDB embedded DuckDB in Postgres to achieve fast analytics and Apache Iceberg compatibility from Postgres.
Books
- DuckDB in Action - DuckDB in Action will show you how to quickly get your hands dirty with DuckDB.
- Getting Started with DuckDB - A practical guide for accelerating your data science, data analytics, and data engineering workflows.
Contribute
Contributions welcome! Read the contribution guidelines first.