Home

Awesome

Introduction

A Column Level Lineage Graph for SQL.

Have you ever wondered what is the column level relationship among your SQL scripts and base tables? Don't worry, this tool is intended to help you by creating an interactive graph on a webpage to explore the column level lineage among them(Currently only supports Postgres, other connection types or dialects are under development).

How to run

Here is a live demo with the mimic-iv concepts_postgres files(navigation instructions) and that is created with one line of code:

from lineagex.lineagex import lineagex
  
lineagex(sql=path/to/sql, target_schema="schema1", conn_string="postgresql://username:password@server:port/database", search_path_schema="schema1, public")

Check out more detailed usage and examples here.

What does it output

The input can be a path to a SQL file, a path to a folder containing SQL files, a list of SQLs or a list of view names and/or schemas. Optionally, you can provide less information with only the SQLs, but providing the schema information and database connection is highly recommended for the best result. The output would be a output.json and a index.html file in the folder. Start a local http server and you would be able to see the interactive graph. <img src="https://raw.githubusercontent.com/sfu-db/lineagex/main/docs/example.gif"/> Check out more detailed navigation instructions here.

Why use LineageX

A general introduction of the project can be found in this blog post.

Supported JSON format:

You can upload JSON files into the HTML produced and draw its lineage graph. Here is the supported format:

{
    table_name: {
        tables:[],
        columns:{
            column1: [[], []], // The first element is the list of columns that contribute directly to column1, 
                               // The second element is the list of columns that are referenced, such as columns from WHERE/GROUP BY
            column2: [[], []]
        },
        table_name: "",
        sql: "",
    }, 
}

As an example:

{
  table1: {
    tables: [schema1.other_table], 
    columns: {
      column1: [[schema1.other_table.columns1], [schema1.other_table.columns3]], 
      column2: [[schema1.other_table.columns2], [schema1.other_table.columns3]]
    }, 
    table_name: schema1.table1,
    sql: SELECT column1, column2 FROM schema1.other_table WHERE column3 IS NOT NULL;
  }, 
}

Supported Database Connection Types

When entering the conn_string parameter, only supported databases' connection types can be parsed successfully, or the lineage graph would be created as if no conn_string parameter is given.

Database Connection Types

Documentation

Doc: https://sfu-db.github.io/lineagex/intro.html or just here