More powerful custom ref() #2857

jtcohen6 · 2020-10-27T13:43:15Z

Description

The RefableCache contains the full node information about all models, snapshots, and seeds in the project. Currently, it is only possible to look up a refable by matching its unique_id, and the only available result is a rendered relation_name.

I like the idea of expanding the context available to users when overriding builtins.ref. The default ref() macro will remain the same, and it will still be on the user to define a custom ref implementation, but they can leverage more of the information that dbt already accesses under the hood.

I wonder if it would be possible to enable extensions of ref that:

- Return the full node specification instead of the rendered relation name, thereby allowing users to override builtins.ref with different behavior depending on (e.g.) the resource_type or path of the refable node (Incorrect identifiers in snapshot schema test when overriding the ref() macro #2848).
- Accept a defer argument, such that if a --state manifest is supplied, it always returns the deferred relation name instead of the current environment's rendered relation name (Automating Non Regression Test : How to get a ref() to the deferred version of the selected model ? #2740).
- [Heaviest lift] Support returning multiple references by passing a node property (a la CLI node selectors) instead of a single unique_id, e.g. fqn:events.* (Dynamically reference dbt models #1212). I don't think this can be a priority in the short term, but this proposal is never far from mind when thinking about advanced ref behavior...
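
As a point of comparison for the first idea, something similar can be roughly approximated today by pairing builtins.ref with the graph context variable. The following is only a sketch under stated assumptions, not how dbt implements ref: it handles the single-argument form of ref only, it matches nodes by name alone (so it ignores packages and would be ambiguous if two nodes shared a name), and the snapshot branch is a hypothetical placeholder for project-specific behavior.

```jinja
{# Sketch only: a project-level ref() override that consults graph.nodes
   to learn the resource_type of the node being referenced. #}
{% macro ref(model_name) %}

  {# builtins.ref still registers the dependency and returns the rendered relation #}
  {% set rel = builtins.ref(model_name) %}

  {% if execute %}
    {# Assumption: a lookup by name is good enough for this project #}
    {% set matches = graph.nodes.values()
        | selectattr("name", "equalto", model_name)
        | list %}
    {% if matches | length == 1 and matches[0].resource_type == 'snapshot' %}
      {# Hypothetical special case: snapshot-specific handling would go here #}
      {% do log("ref('" ~ model_name ~ "') points at a snapshot", info=true) %}
    {% endif %}
  {% endif %}

  {% do return(rel) %}

{% endmacro %}
```

The scan over graph.nodes is the part this proposal would make unnecessary: if the override could receive the full node directly, the branch on resource_type or path would need no name-based lookup.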

Describe alternatives you've considered

These are all nice-to-haves. We can usually find workarounds.

Who will this benefit?

See linked issues for motivations and use cases


clrcrl · 2020-11-16T15:36:22Z

Adding an extra use case here, which shows how this would be useful and also suggests some other attributes that might be worth including as part of this work. (Or you might decide this is a separate issue.)

In the codegen package, we have code like this:

{% macro generate_base_model(source_name, table_name) %}

{%- set source_relation = source(source_name, table_name) -%}

{% set base_model_sql %}
with source as (

    select * from {% raw %}{{ source({% endraw %}'{{ source_name }}', '{{ table_name }}'{% raw %}) }}{% endraw %}

),
...
{% endmacro %}

This means this macro can't be used with anything other than a source, which is fine 95% of the time (my guess), but sometimes people want to generate a model from a ref.

Let's say we had some attribute, relation.from_type, which is one of 'source', 'ref', or 'create_method', and another attribute, relation.definition.

Then we could change this macro to have something like:

{% macro generate_base_model(relation) %}

{% set base_model_sql %}
with source as (
    {% if relation.from_type == 'source' %}
    select * from {% raw %}{{ source({% endraw %}'{{ source_name }}', '{{ table_name }}'{% raw %}) }}{% endraw %}
    {% elif relation.from_type == 'ref' %}
    select * from {% raw %}{{ ref({% endraw %}'{{ ref_name }}'{% raw %}) }}{% endraw %}
    {% elif relation.from_type == 'create_method' %}
    select * from {{ relation }}
    {% endif %}
),
...
{% endmacro %}

You may have noticed that I haven't yet figured out how one would get the source_name, table_name, or ref_name. I expect that as part of this I'd also need the arguments that were passed to the source / ref / create macro. There's probably some Jinja paradigm that bundles together the macro name and its arguments that would work here.

github-actions · 2022-11-05T02:13:58Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions · 2022-11-13T02:12:57Z

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

jtcohen6 added the enhancement and discussion labels Oct 27, 2020

jtcohen6 mentioned this issue Oct 28, 2020

Automating Non Regression Test : How to get a ref() to the deferred version of the selected model ? #2740

Closed

jtcohen6 removed the discussion label Apr 19, 2022

github-actions bot added the stale label Nov 5, 2022

github-actions bot closed this as completed Nov 13, 2022
