Awesome
DuckDB Azure Extension
This extension adds a filesystem abstraction for Azure blob storage to DuckDB. To use it, install latest DuckDB. The extension currently supports only reads and globs.
Basics
Setup authentication (leverages either Azure CLI or Managed Identity):
CREATE SECRET secret1 (
TYPE AZURE,
PROVIDER CREDENTIAL_CHAIN,
ACCOUNT_NAME '⟨storage account name⟩'
);
Then to query a file on azure:
SELECT count(*) FROM 'az://<my_container>/<my_file>.<parquet_or_csv>';
Globbing is also supported:
SELECT count(*) FROM 'az://dummy_container/*.csv';
Other authentication methods
Other authentication options available:
Connection string
CREATE SECRET secret2 (
TYPE AZURE,
CONNECTION_STRING '<value>'
);
Service Principal
(replace CLIENT_SECRET
with CLIENT_CERTIFICATE_PATH
to use a client certificate)
CREATE SECRET azure3 (
TYPE AZURE,
PROVIDER SERVICE_PRINCIPAL,
TENANT_ID '⟨tenant id⟩',
CLIENT_ID '⟨client id⟩',
CLIENT_SECRET '⟨client secret⟩',
ACCOUNT_NAME '⟨storage account name⟩'
);
Access token
(its audience needs to be https://storage.azure.com
)
CREATE SECRET secret4 (
TYPE AZURE,
PROVIDER ACCESS_TOKEN,
ACCESS_TOKEN '<value>'
ACCOUNT_NAME '⟨storage account name⟩'
);
Anonymous
CREATE SECRET secret5 (
TYPE AZURE,
PROVIDER CONFIG,
ACCOUNT_NAME '⟨storage account name⟩'
);
Supported architectures
The extension is tested & distributed for Linux (x64, arm64), MacOS (x64, arm64) and Windows (x64)
Documentation
See the Azure page in the DuckDB documentation.
Check out the tests in test/sql
for more examples.
Building
For development, this extension requires CMake, Python3, a C++11
compliant compiler, and the Azure C++ SDK. Run make
in the root directory to compile the sources. Run make debug
to build a non-optimized debug version. Run make test
to verify that your version works properly after making changes. Install the Azure C++ SDK using vcpkg and set the VCPKG_TOOLCHAIN_PATH
environment variable when building.
sudo apt-get update && sudo apt-get install -y git g++ cmake ninja-build libssl-dev
git clone --recursive https://github.com/duckdb/duckdb_azure
git clone https://github.com/microsoft/vcpkg
./vcpkg/bootstrap-vcpkg.sh
cd duckdb_azure
GEN=ninja VCPKG_TOOLCHAIN_PATH=$PWD/../vcpkg/scripts/buildsystems/vcpkg.cmake make
Please also refer to our Build Guide and Contribution Guide.