Awesome
dbt-invoke
dbt-invoke is a CLI (built with Invoke) for creating, updating, and deleting dbt property files.
-
Supported dbt resource types:
- models
- seeds
- snapshots
- analyses
-
Under the hood, this tool works by combining the power of the dbt ls and dbt run-operation commands with dbt's built-in
get_columns_in_query
macro.- This methodology allows the tool to work on ephemeral models and analyses, which other approaches, such as those based on listing data warehouse tables/views, can miss.
Installation
- If you are using dbt-core version 1.5.0 or later:
pip install dbt-invoke
- If you are using a dbt-core version earlier than 1.5.0:
pip install dbt-invoke~=0.2
Usage
-
You must have previously executed
dbt run
/dbt seed
/dbt snapshot
on the resources for which you wish to create/update property files.- If you have made updates to your resources, execute the appropriate command
(
dbt run
/dbt seed
/dbt snapshot
) before using this tool to create/update property files.
- If you have made updates to your resources, execute the appropriate command
(
-
Property files will be created, updated, or deleted on a one-to-one basis in the same paths as the resource files they represent (the only change being a
.yml
file extension).-
For example, given a resource file in the location
models/marts/core/users.sql
, this tool will create, update, or delete a property file in the locationmodels/marts/core/users.yml
. -
If your dbt project defines properties for multiple resources per
.yml
file, see the Migrating to One Resource Per Property File section.
-
-
Any newly generated property files are created with the correct resource type, resource name, and columns. A blank description field will be included for each column and for the resource itself.
- For example, when generating a new property file for a model
users
with column namesuser_id
andcreated_at
, the following yaml will be generated:-
version: 2 models: - name: users description: '' columns: - name: user_id description: '' - name: created_at description: ''
-
- For example, when generating a new property file for a model
-
When updating an already existing property file, new columns in the resource will be added, and columns that no longer exist will be removed.
-
You may fill in the blank
description
properties and add other properties (e.g.tests
). They will remain intact when updating existing property files as long as the column/resource name to which they belong still exists.
Creating/Updating Property Files
dbt-invoke properties.update <options>
# OR, because 'update' is set as the default command:
dbt-invoke properties <options>
-
The first time you run this command, you should be prompted to add a short macro called
_log_columns_list
to your dbt project.- You may accept the prompt to add it automatically.
- Otherwise, copy/paste it into one your dbt project's macro-paths yourself.
- To print the macro, at any time, run
dbt-invoke properties.echo-macro
.
-
<options>
primarily uses the same arguments as thedbt ls
command to allow flexibility in selecting the dbt resources for which you wish to create/update property files (rundbt ls --help
for details).- --resource-type
- --models
- --select
- --selector
- --exclude
- --project-dir
- --profiles-dir
- --profile
- --target
- --vars
- --bypass-cache
- --state
-
Notes:
- This tool supports only the long flags of
dbt ls
options (for example:--models
, and not short flags like-m
). - Multiple values for the same argument can be passed as a comma separated
string (Example:
--models modelA,modelB
)- Keep in mind that dbt may not support multiple values for certain options.
- This tool supports only the long flags of
-
Two additional flags are made available.
--log-level
to alter the verbosity of logs.- It accepts one of Python's standard logging levels (debug, info, warning, error, critical).
--threads
to set a maximum number of concurrent threads to use in collecting resources' column information from the data warehouse and in creating/updating the corresponding property files. Each thread will run dbt's get_columns_in_query macro against the data warehouse.
-
Some examples:
# Create/update property files for all supported resource types dbt-invoke properties # Create/update property files for all supported resource types, using 4 concurrent threads dbt-invoke properties --threads 4 # Create/update property files for all models in a models/marts directory dbt-invoke properties --models marts # Create/update property files for a 'users' model and an 'orders' models dbt-invoke properties --models users,orders # Create/update property files for a 'users' model and all downstream models dbt-invoke properties --models users+ # Create/update property files for all seeds dbt-invoke properties --resource-type seed # Create/update a property file for a snapshot called 'users_snapshot' dbt-invoke properties --resource-type snapshot --select users_snapshot # Create/update property files when your working directory is above your dbt project directory dbt-invoke properties --project-dir path/to/dbt/project/directory
Deleting Property Files
dbt-invoke properties.delete <options>
<options>
uses the same arguments as for creating/updating property files, except for--threads
.
Migrating to One Resource Per Property File
-
Two conventions for dbt property files are:
- One resource per property file
- Multiple resources per property file
-
dbt-invoke abides by the "One resource per property file" convention.
-
If your dbt project uses the "Multiple resources per property file" convention, you can migrate to one resource per property file by using dbt-invoke's
properties.migrate
command as shown here:dbt-invoke properties.migrate <options>
- Successfully migrated properties will be removed from existing
multiple-resource property files.
- At then end of migration, property files that are newly empty (other than
version: 2
) will be automatically deleted.
- At then end of migration, property files that are newly empty (other than
<options>
uses the same arguments as for creating/updating property files, except for--threads
.
- Successfully migrated properties will be removed from existing
multiple-resource property files.
Help
-
To view the list of available commands and their short descriptions, run:
dbt-invoke --list
-
To view in depth command descriptions and available options/flags, run:
dbt-invoke <command_name> --help
Limitations
- dbt-invoke will try to preserve formatting and comments when updating
existing files. If you want to preserve line-breaks, use
>
or|
on your multiline strings, as recommended here. - In order to collect or update the list of columns that should appear in
each property file, dbt's
get_columns_in_query
macro is run for each matching resource. As of the time of writing,get_columns_in_query
uses a SELECT statement limited to zero rows. While this is not typically a performance issue for table or incremental materializations, execution may be slow for complex analyses, views, or ephemeral materializations.- This may be partially remedied by increasing the value of the
--threads
option indbt-invoke properties.update
.
- This may be partially remedied by increasing the value of the
- dbt-invoke has not been tested across different types of data warehouses.