Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snowflake warehouse in dbt_project.yml models hierarchy does not apply to tests #2981

Closed
1 of 5 tasks
zemekeneng opened this issue Dec 28, 2020 · 4 comments
Closed
1 of 5 tasks
Labels
bug Something isn't working dbt tests Issues related to built-in dbt testing functionality
Milestone

Comments

@zemekeneng
Copy link

zemekeneng commented Dec 28, 2020

Describe the bug

Snowflake warehouse in dbt_project.yml models hierarchy does not apply to tests. While dbt run --models XXX runs in the custom warehouse, dbt test --models XXX uses the warehouse specified in the profile/target.

Steps To Reproduce

dbt_project.yml

models:
    my_project:
        my_model:
            +snowflake_warehouse: "custom_warehouse"

~/.dbt/profiles.yml

default:
  target: "{{ env_var('DBT_ENVIRONMENT', 'dev') }}"
  outputs:
    dev:
      type: snowflake
      account: my_sf_account
      user: "{{ env_var('DBT_USER') }}"
      role: "profile_role"

      private_key_path: "{{ env_var('PRIVATE_KEY_PATH') }}"
      private_key_passphrase: "{{ env_var('PRIVATE_KEY_PASSPHRASE') }}"

      database: "profile_db"
      schema: "public"
      warehouse: "profile_warehouse"
      threads: 1
      client_session_keep_alive: False

models/my_model/model_one.sql

select 1 one, 2 two, 3 three

models/my_model/schema.yml

version: 2
models:
  - name: model_one
    columns:
      - name: one 
        tests:
          - unique
dbt run --models my_model
dbt test --models my_model

Expected behavior

The test should be run on custom_warehouse, not profile_warehouse.
I verify that the model is run with custom_warehouse but the test is run on profile_warehouse via snowflake history.

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.18.1
   latest version: 0.18.1

Up to date!

Plugins:
  - snowflake: 0.18.1
  - bigquery: 0.18.1
  - postgres: 0.18.1
  - redshift: 0.18.1

The operating system you're using:
Debian GNU/Linux 10 (buster)
In a docker container
The output of python --version:
Python 3.8.5

Additional context

Add any other context about the problem here.

@zemekeneng zemekeneng added bug Something isn't working triage labels Dec 28, 2020
@zemekeneng
Copy link
Author

Maybe this is a feature request instead of a bug, but it would be nice to be able to define snowflake_warehouse independently for individual tests as is possible in the {{ config }} for models.

@jtcohen6
Copy link
Contributor

Thanks for the write-up @zemekeneng! I agree, I think this would be an enhancement, and a good one at that.

Currently, the only supported config for all tests is severity: warn|error. I think it would make a lot of sense to support additional configs, including and not limited to ones like enabled: true|false and snowflake_warehouse.

Within a data test file:

{{ config(snowflake_warehouse = 'big') }}

select 1 as fun

For schema tests, within models/*.yml or seeds/*.yml or snapshots/*.yml:

version: 2

models:
  - name: my_model
    columns:
      - name: my_column
        tests:
          - unique:
            snowflake_warehouse: big

Someday

In dbt_project.yml (post-#2401):

# for schema tests
models:
  my_project:
    some_models:
      +tests:
        snowflake_warehouse: big

# for data tests
tests:
  my_project:
    snowflake_warehouse: big

This type of inheritance would also be necessary if you wanted to override a config/property of a test defined in an installed package.

Questions

Should a schema test take its default value of snowflake_warehouse from the resource (model/seed/snapshot) on which it is defined?

I'm inclined to say yes, though I also see a good reason in the opposite direction—that snowflake_warehouse should be treated as an exceptional override, not the rule, and only when warranted by especially heavy (materialization) queries. In any case, it should be possible to override this default for specific schema tests.

Next steps

I'm going to add this to our v0.20 milestone, since we aim for that release to improve several of the current constraints and inconsistencies around tests.

@jtcohen6 jtcohen6 removed the triage label Dec 31, 2020
@jtcohen6 jtcohen6 added this to the Oh-Twenty [TBD] milestone Dec 31, 2020
@jtcohen6 jtcohen6 added the dbt tests Issues related to built-in dbt testing functionality label Dec 31, 2020
@zemekeneng
Copy link
Author

Thanks @jtcohen6 for your thoughtful response! One comment with regard to your question:

Should a schema test take its default value of snowflake_warehouse from the resource (model/seed/snapshot) on which it is defined?

Warehouse choice depends on a number of factors:

  1. right-sizing compute resources for query load (dbt's current setup seems to emphasize this one)
  2. segregating workloads to avoid resource contention
  3. segregating workloads for cost accounting

The 2nd and 3rd factors are especially important in organizations (like mine, Age of Learning) where the data org serves multiple (6+) independent product warehouses as well as some global data warehouses that serve and consume data from each.

That said, it seems prudent to wait for tests to be a first-class config object like model/seed/snapshot to address the problem.

Happy to help with this, too. I would love to contribute code and move this along! Please let me know what I can do.

@jtcohen6
Copy link
Contributor

Closing in favor of #3255 + #3256, which split up the conversation we were having above. All in all, I think the implementation of #3255 in particular would be very straightforward—there's just some foundational work we need to do first, to enable test nodes to "carry" these additional configs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working dbt tests Issues related to built-in dbt testing functionality
Projects
None yet
Development

No branches or pull requests

2 participants