The code in this repository demonstrates beginner and advanced implementation concepts at the intersection of dbt and Airflow. This repository complements this presentation.
We are currently using the jaffle_shop sample dbt project.
Airflow DAGs only need the `dbt_project.yml`, `profiles.yml`, and `target/manifest.json` files to run, but we also included the models for completeness. If you would like to try these DAGs with your own dbt workflow, feel free to add your own project files.
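To illustrate why `target/manifest.json` is enough to drive an Airflow DAG: the manifest encodes every dbt node and its dependencies, so a DAG can be built from it mechanically. Below is a minimal, self-contained sketch of parsing a manifest into model-to-model edges; the manifest fragment and node ids are synthetic stand-ins, not this repository's actual artifacts.

```python
# A trimmed, synthetic fragment shaped like dbt's target/manifest.json
# (the real file is produced by `dbt compile`); names are illustrative.
manifest = {
    "nodes": {
        "model.jaffle_shop.stg_orders": {
            "resource_type": "model",
            "name": "stg_orders",
            "depends_on": {"nodes": ["seed.jaffle_shop.raw_orders"]},
        },
        "model.jaffle_shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "depends_on": {"nodes": ["model.jaffle_shop.stg_orders"]},
        },
    }
}

def model_edges(manifest):
    """Return (upstream, downstream) name pairs between dbt models only."""
    models = {
        node_id: node
        for node_id, node in manifest["nodes"].items()
        if node["resource_type"] == "model"
    }
    edges = []
    for node_id, node in models.items():
        for upstream in node["depends_on"]["nodes"]:
            # Skip non-model dependencies such as seeds and sources.
            if upstream in models:
                edges.append((models[upstream]["name"], node["name"]))
    return edges

print(model_edges(manifest))  # [('stg_orders', 'orders')]
```

In an actual DAG file, each model name would become a task (for example a `BashOperator` running `dbt run --select <model>`) and each edge a task dependency.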
This project uses ClickHouse. You can deploy a Managed ClickHouse cluster on DoubleCloud for free using trial period credits.
The CI/CD pipeline uses the custom images feature available in DoubleCloud's Managed Airflow service. Follow the quick start guide to run a free Apache Airflow cluster and try out the provided code examples.
- Airflow 2.8+ is required to use these DAGs. They have been tested with Airflow 2.8.1.
- If you make changes to the dbt project, you need to run `dbt compile` to update the `manifest.json` file. You can do this manually during development or in a CI/CD pipeline. Either way, it must happen before the DAGs reach Airflow; otherwise the scheduler can't build the dynamic workflow from dbt's manifest file.
- The example dbt project contains a `profiles.yml` that is configured to use environment variables. The database credentials from an Airflow connection are passed as environment variables to the `BashOperator` tasks running the dbt commands.
- Each DAG runs a `dbt_seed` task at the beginning that loads sample data into the database. This is simply for the purposes of this demo.
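To make the credentials flow concrete, here is a hedged sketch of mapping an Airflow connection's fields to the environment variables that a `profiles.yml` might reference via `{{ env_var("...") }}`. The variable names, connection fields, and values are assumptions for illustration, not this repository's exact configuration.

```python
def dbt_env_from_connection(conn):
    """Build the env dict handed to dbt's BashOperator tasks.

    The DBT_* variable names are hypothetical; they must match whatever
    your profiles.yml reads with env_var().
    """
    return {
        "DBT_HOST": conn["host"],
        "DBT_PORT": str(conn["port"]),
        "DBT_USER": conn["login"],
        "DBT_PASSWORD": conn["password"],
        "DBT_SCHEMA": conn["schema"],
    }

# In a real DAG these fields would come from an Airflow connection
# (e.g. BaseHook.get_connection("clickhouse")) and the dict would be
# passed to BashOperator(..., env={**os.environ, **dbt_env}).
conn = {
    "host": "example.clickhouse.host",
    "port": 9440,
    "login": "dbt_user",
    "password": "secret",
    "schema": "jaffle_shop",
}
dbt_env = dbt_env_from_connection(conn)
print(dbt_env["DBT_PORT"])  # prints 9440
```

Merging `os.environ` into `env` matters because `BashOperator` replaces the task's environment entirely when `env` is set, which would otherwise hide variables like `PATH` from the dbt process.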