Table function workflow

The diagram below shows the workflow of a table function in DuckDB from the user's perspective. Assume the extension is called `example` and implements a table function called `example_function`.

sequenceDiagram
    actor DuckDB user
    participant DuckDB engine
    participant Example extension

    DuckDB user->>DuckDB engine: INSTALL example.duckdb_extension;
    DuckDB engine->>Example extension: INSTALL
    Example extension->>DuckDB engine: duckdb_library_version()
    Note over Example extension,DuckDB engine: ensure the library is compatible with current engine

    DuckDB user->>DuckDB engine: LOAD example;
    DuckDB engine->>Example extension: request extension init
    Example extension->>DuckDB engine: example_extension_init()
    Note over Example extension,DuckDB engine: initialize extension provided functions

    DuckDB user->>DuckDB engine: SELECT * FROM example_function();
    Note over DuckDB user,DuckDB engine: example_function is a table function

    DuckDB engine->>Example extension: request example_function bind
    Example extension->>Example extension: example_function.set_bind_data(bind data)
    Example extension->>DuckDB engine: example_function.bind(function params)
    Note over Example extension,DuckDB engine: Bind is the stage where the function connects to the data layer <br/> (e.g. creates a connection to another DB). <br/> It also returns the result columns, which column pushdown relies on.

    DuckDB engine->>Example extension: request example_function init
    Example extension->>Example extension: example_function.set_init_data(init data)
    Example extension->>DuckDB engine: example_function.init(bind data)
    Note over Example extension,DuckDB engine: Initializes the global function state <br/> (e.g. generate the query with filter pushdown and track global progress)
    
    DuckDB engine->>Example extension: request example_function local_init
    Example extension->>Example extension: example_function.set_local_init_data(local init data)
    Example extension->>DuckDB engine: example_function.local_init(bind data)
    Note over Example extension,DuckDB engine: Optional function that initializes thread-local function state <br/> (e.g. track per-thread status if you are sharding the workload)

    DuckDB engine->>Example extension: request example_function function
    Example extension->>DuckDB engine: example_function.function(bind data, init data, local init data, output chunk)
    Note over Example extension,DuckDB engine: The "main" function, where you insert data into the output chunk. <br/> It is called repeatedly until the global state is `done`, <br/> and each call should return once the output chunk is full

    DuckDB engine->>DuckDB user: Query result
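
The bind, init, local_init and function stages in the diagram can be sketched as a small simulation. This is a toy model in plain Python, not DuckDB's extension API: `ExampleFunction`, `run_table_function`, the three state classes and `CHUNK_CAPACITY` are all hypothetical names chosen for illustration, and the "engine" here is single-threaded. It only shows the order of the calls and how the main function is invoked repeatedly until the global state is marked done.

```python
from dataclasses import dataclass

# Toy chunk size (assumption for illustration; not DuckDB's real vector size).
CHUNK_CAPACITY = 4


@dataclass
class BindData:
    """Produced once by bind: parameters plus the declared result columns."""
    row_count: int
    columns: list


@dataclass
class InitData:
    """Global state shared across calls: tracks overall progress."""
    next_row: int = 0
    done: bool = False


@dataclass
class LocalInitData:
    """Per-thread state produced by the optional local_init stage."""
    rows_emitted: int = 0


class ExampleFunction:
    """Hypothetical table function that yields the integers 0..row_count-1."""

    def bind(self, params):
        # Read the parameters and declare the result columns.
        return BindData(row_count=params["row_count"],
                        columns=[("value", "BIGINT")])

    def init(self, bind_data):
        return InitData()

    def local_init(self, bind_data):
        return LocalInitData()

    def function(self, bind_data, init_data, local_data, output):
        # Fill the chunk until it is full or the source is exhausted.
        while (len(output) < CHUNK_CAPACITY
               and init_data.next_row < bind_data.row_count):
            output.append((init_data.next_row,))
            init_data.next_row += 1
            local_data.rows_emitted += 1
        if init_data.next_row >= bind_data.row_count:
            init_data.done = True


def run_table_function(fn, params):
    """Toy single-threaded 'engine' driving the stages in diagram order."""
    bind_data = fn.bind(params)
    init_data = fn.init(bind_data)
    local_data = fn.local_init(bind_data)
    chunks = []
    while not init_data.done:
        output = []
        fn.function(bind_data, init_data, local_data, output)
        if output:
            chunks.append(output)
    return bind_data.columns, chunks


columns, chunks = run_table_function(ExampleFunction(), {"row_count": 10})
```

With `row_count` set to 10 and a capacity of 4, the main function is called three times, and the last chunk is only partly filled. A real extension would register equivalent callbacks through whichever extension API it targets. The engine may also run the local_init and function stages on several threads at once.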