Package project yml configs overriding root project configs #3698

fivetran-jamie · 2021-08-05T17:59:36Z

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?

I installed this branch of the zendesk_source package in my root project. This branch uses the custom collect_freshness macro we built in order to disable (or rather, automatically pass) freshness tests for source tables that the user has elected to disable (note: the source entity itself is not configured/disabled, but rather the staging models drawing from these source tables are disabled). When variables (such as using_ticket_form_history) are defined in the package's own dbt_project.yml file, the package-provided value overrides any value provided in the root project file. I am unable to disable/pass freshness tests (using dbt source snapshot-freshness --select zendesk.ticket_form_history) by setting the appropriate variable to false in my root project (but if i change the package-defined variable to false, it passes).

Note: I am only seeing this conflict with freshness tests. When i provide using_ticket_form_history in my root project (false in my project yml and true in the package yml) and execute dbt run, the respective staging models that are appropriately disabled by the variable. So perhaps this has to do with the collect_freshness macro being a part of dbt core rather than utils or anything?

Steps To Reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.

Install the bug/disable-freshness branch of the zendesk_source package
Add the below to your root project file

vars:
  zendesk_source:
    using_ticket_form_history: false

Add the following to the dbt_modules/zendesk_source/dbt_project.yml

vars:
  zendesk_source:
    using_ticket_form_history: true

Run dbt source snapshot-freshness --select zendesk.ticket_form_history
Expected: freshness test should pass, since we have chosen to disable/automatically pass freshness tests for this table
Actual: freshness test fails (presuming you do not have this table or just have stale data)
Change using_ticket_form_history to false in dbt_modules/zendesk_source/dbt_project.yml
Run dbt source snapshot-freshness --select zendesk.ticket_form_history
This passes, which makes it seem like dbt is indeed grabbing the variable value from the package yml

Removing the variable from the package project yml allows me to disable freshness tests from my root project yml, but a good deal of our packages have these default-set variables in their dbt_project.yml files, so just trying to figure out if we need to remove those in order for this solution to continue to work.

Expected behavior

A clear and concise description of what you expected to happen.
(see bold text in steps to reproduce ^)

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

Which database are you using dbt with?

The output of dbt --version:

installed version: 0.20.0
   latest version: 0.20.0

Up to date!

Plugins:
  - bigquery: 0.20.0
  - snowflake: 0.20.0
  - redshift: 0.20.0
  - postgres: 0.20.0

The operating system you're using:
MacOS
The output of python --version:
Python 3.8.9

Additional context

Add any other context about the problem here.

The text was updated successfully, but these errors were encountered:

jtcohen6 · 2021-08-29T13:43:55Z

@fivetran-jamie Thanks for the really detailed write-up! Sorry for the delay getting back to you. I've been able to reproduce the bug, it's just a real head-scratcher to figure out what's going on here.

I do think the peculiarities of the source freshness task have something to do with this one. Unlike other tasks, which generate a runtime context, compile SQL, and then execute that compiled SQL within a materialization, the freshness task jumps straight to the finish: find and execute a certain macro using a macro_context.

There are two things working right:

dbt picks the right macro, preferring fivetran_utils.collect_freshness over built-in dbt.collect_freshness
The macro_context does seem to be doing the right sort of var resolution (sometimes)

However:

I see different results when I pass the macro into dbt.clients.jinja.get_rendered vs. dbt.clients.jinja.MacroGenerator

I dropped a debugger in here:

        macro_context.update(context_override)
        if 'collect_freshness' in macro.unique_id:
            import ipdb; ipdb.set_trace()
        macro_function = Macro
        Generator(macro, macro_context)

When I just wrap the macro in get_rendered, it does the right thing (no error):

ipdb> from dbt.clients.jinja import get_rendered
ipdb> get_rendered("{{ fivetran_utils.collect_freshness('zendesk.ticket_form_history') }}", macro_context)
'| column         | data_type |\n| -------------- | --------- |\n| max_loaded_at  | DateTime  |\n| snapshotted_at | DateTime  |\n'

But when I follow the actual codepath, and execute the macro using self.connections.exception_handler, it clearly renders differently (raising an error):

ipdb> macro_function = MacroGenerator(macro, macro_context)
ipdb> with self.connections.exception_handler(f'macro {macro_name}'):
            result = macro_function(**kwargs)

*** dbt.exceptions.DatabaseException: Database Error
  relation "zendesk.ticket_form_history" does not exist
  LINE 42:     from "jerco"."zendesk"."ticket_form_history"
                    ^

When I try to just get the rendered variable value on its own, it doesn't appear to be accessible (??), but at least it shows me the correct value (false), as set in my root dbt_project.yml.

ipdb> get_rendered("{{ var('using_ticket_form_history') }}", macro_context)
*** dbt.exceptions.CompilationException: Compilation Error in macro collect_freshness (macros/collect_freshness.sql)
  Required var 'using_ticket_form_history' not found in config:
  Vars supplied to collect_freshness = {
      "zendesk_source": {
          "using_ticket_form_history": false
      }
  }

I want to confirm whether I'm arriving at the right distinction: get_rendered does one thing, MacroGenerator does a different thing, and that's at the heart of the matter about picking the wrong var value. Even if that distinction holds, I imagine it's a really just symptom of some really low-lying subtleties of the way we're constructing contexts + templates in dbt/clients/jinja.py.

@gshank I think you know the most about rendering contexts today, so I'd be curious to hear if what I've laid out above seems like the right thread to pull on, or if I've totally missed the boat.

jtcohen6 · 2021-09-19T21:08:18Z

I took another look at this one, just to see if anything new has clicked.

I think this is not specific to the execution of source freshness, as opposed to what I said above. After all, in manifest.json, dbt renders meta.is_enabled for the source as true! It prefers the value of the variable set in the package dbt_project.yml over the value of the variable set in the root project dbt_project.yml, when it should be the exact reverse. If we simplify the issue to that—yaml properties are rendered with the wrong var precedence—we might have a better shot at getting to the root cause here.

As a separate matter, I do think there's a better way to achieve this functionality, rather than overriding the collect_freshness macro, which can yield some surprising behavior for end users (e.g. slack thread). It's possible to disable freshness for a specific source table by setting freshness: source; I see that Fivetran packages often do this for certain tables. So how about something like:

      - name: ticket_form_history
        description: Ticket forms allow an admin to define a subset of ticket fields for display to both agents and end users.
        freshness: "{{ ({} if var('using_ticket_form_history', true) else none) | as_native }}"

This prevents the freshness check from running at all for this table when using_ticket_form_history: false. (Unfortunately, it still doesn't resolve the bug in this issue: at parse time, dbt is rendering the var with preference for the package value over the root project's.)

Ultimately, I'd like to see us do unto sources as we've done unto other resource types (#3662):

add a config property to sources, making it possible to set enabled for a given source, e.g. to a Jinja expression that leverages a var value
convert freshness into an honest-to-goodness config that can be set/overridden in the root project's dbt_project.yml (as an always-available last resort)

bill-warner · 2021-12-16T21:13:20Z

I am also running into what I think is the same issue when developing packages. Using the jaffle shop example on this branch.

Customers model enabled using the enabled_customers var within the models config

enabled_customers default value false set in package's dbt_project file.

enabled_customers set to true in root project, which imports the package.

Generic tests on customers model are also enabled using enabled_customers var. This is intended to stop the warning dbt would throw if the customers model was disabled in the root project:

[WARNING]: Test 'test.jaffle_shop.unique_customers_customer_id.c5af1ff4b1' (models/schema.yml) depends on a node named 'customers' which is disabled

With this setup, given enabled_customers is set to true in the root project, you would expect to be able to both run and test the customers model. Instead you can run the customer model but no tests are found. This suggests the vars value in the schema.yml seemingly is taken from the packages dbt_project.yml file while the model's config is taking the value from the root projects dbt_project.yml

github-actions · 2022-06-15T02:14:11Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions · 2022-06-22T02:14:12Z

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

rlh1994 · 2022-09-28T10:20:43Z

@jtcohen6 I understand this was closed for inactivity but do you know was there a planned resoltuion for this? I am seeing the same issue with variable priority - a variable defined in a package yaml is being prioritised over the project yaml overwrite when that variable is called from within a schema yaml within the package?

fivetran-jamie added bug Something isn't working triage labels Aug 5, 2021

jtcohen6 added packages Functionality for interacting with installed packages vars and removed triage labels Aug 29, 2021

jtcohen6 added the Team: Language label Nov 18, 2021

This was referenced Dec 16, 2021

Release/snowplow-web/0.5.0 snowplow/dbt-snowplow-web#53

Merged

Set default values for enrichment toggles in project yaml snowplow/dbt-snowplow-web#54

Closed

jtcohen6 mentioned this issue Dec 22, 2021

[Bug] Source Freshness not running on renamed source tables #4525

Closed

1 task

bill-warner mentioned this issue Jan 26, 2022

Internal module development snowplow/dbt-snowplow-mobile#3

Merged

emielver mentioned this issue May 13, 2022

Dev/incremental media model snowplow/dbt-snowplow-media-player#2

Merged

github-actions bot added the stale Issues that have gone stale label Jun 15, 2022

github-actions bot closed this as completed Jun 22, 2022

agnessnowplow mentioned this issue Aug 9, 2022

Modify source file for updated databricks support snowplow/dbt-snowplow-web#101

Merged

rlh1994 mentioned this issue Jan 24, 2023

[CT-1888] [Bug] Package project yml configs overriding root project configs when used in other yml files #6705

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Package project yml configs overriding root project configs #3698

Package project yml configs overriding root project configs #3698

fivetran-jamie commented Aug 5, 2021

jtcohen6 commented Aug 29, 2021 •

edited

Loading

jtcohen6 commented Sep 19, 2021

bill-warner commented Dec 16, 2021

github-actions bot commented Jun 15, 2022

github-actions bot commented Jun 22, 2022

rlh1994 commented Sep 28, 2022

Package project yml configs overriding root project configs #3698

Package project yml configs overriding root project configs #3698

Comments

fivetran-jamie commented Aug 5, 2021

Describe the bug

Steps To Reproduce

Expected behavior

Screenshots and log output

System information

Additional context

jtcohen6 commented Aug 29, 2021 • edited Loading

jtcohen6 commented Sep 19, 2021

bill-warner commented Dec 16, 2021

github-actions bot commented Jun 15, 2022

github-actions bot commented Jun 22, 2022

rlh1994 commented Sep 28, 2022

jtcohen6 commented Aug 29, 2021 •

edited

Loading