[async] Support microbatching when using ExecutionMode.AIRFLOW_ASYNC #1270

Open
1 task
tatiana opened this issue Oct 21, 2024 · 0 comments
Labels
area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc

Comments

Collaborator

tatiana commented Oct 21, 2024

Context

Incremental models in dbt are a materialization strategy designed to update data warehouse tables efficiently by transforming and loading only the data that is new or changed since the last run. Instead of processing the entire dataset every time, an incremental model appends or updates only the new rows, which significantly reduces the time and resources required for data transformations.

Even with all the benefits of incremental models as they exist today, there are limitations with this approach, such as:

  • The burden is on you to calculate what is “new”: what has already been loaded, what still needs to be loaded, and so on.
  • It can be slow when there are many partitions to process (for example, in full-refresh mode), because everything runs in “one big” SQL statement. That statement can time out, and if it fails you have to retry partitions that had already succeeded.
  • If you want to name a specific partition for your incremental model to process, you have to add extra “hacky” logic, likely using vars.
  • Data tests run on your entire model, rather than just the “new” data.
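
To make the second limitation concrete, here is a minimal Python sketch of the general idea behind microbatching: the time range is split into independent windows, so a failure only requires retrying the windows that failed. This is an illustration only, not dbt's or Cosmos's implementation; `daily_batches`, `run_batches`, and the `process` callback are hypothetical names, and the one-day window size is an assumption.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable, Iterator, List, Tuple

Batch = Tuple[datetime, datetime]


def daily_batches(start: datetime, end: datetime) -> Iterator[Batch]:
    """Yield half-open [batch_start, batch_end) one-day windows covering [start, end)."""
    cursor = start
    while cursor < end:
        batch_end = min(cursor + timedelta(days=1), end)
        yield cursor, batch_end
        cursor = batch_end


def run_batches(
    batches: Iterable[Batch],
    process: Callable[[datetime, datetime], None],
) -> List[Batch]:
    """Run `process` once per batch and return the batches that raised an error.

    `process` is a hypothetical stand-in for whatever executes the
    transformation for a single window (for example, one SQL statement).
    """
    failed: List[Batch] = []
    for batch_start, batch_end in batches:
        try:
            process(batch_start, batch_end)
        except Exception:
            # Record the failure and keep going, so successful windows
            # do not have to be reprocessed later.
            failed.append((batch_start, batch_end))
    return failed
```

With this shape, a retry only needs to call `run_batches` again on the returned list of failed windows, instead of rerunning one large statement over the whole range.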

dbt-labs/dbt-core#10624

Acceptance criteria

  • ExecutionMode.AIRFLOW_ASYNC can leverage dbt microbatching strategies
@dosubot dosubot bot added the area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc label Oct 21, 2024