[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

jtcohen6 · 2023-06-06T13:22:36Z

Copied from #4646

Adapter-specific materializations work differently from adapter-specific macros. Namely:

{% materialization table, adapter = 'snowflake' %} creates a macro named materialization_table_snowflake, rather than (as one would expect) snowflake__materialization_table
"Parent" adapter inheritance is not supported for materializations, as it is for all other macros. This causes confusion for external adapter maintainers: re-implement sqlserver test materialization microsoft/dbt-synapse#74 (comment), Cleanup unnecessary codes by the inheritance of dbt-spark any more. databricks/dbt-databricks#40 (comment)

Let's make materialization macro generation / discovery work more like dispatch. We'd need to offer backwards compatibility / avoiding breaking changes for folks who currently rely on the materialization_<name>_<adapter> construction.

This is important in cases where we want to "call" a materialization macro from within another macro (perhaps from within another materialization). I found a concrete use case for doing this in my spiking on dbt clone (#7258), whereby the clone materialization would in certain instances want to "shell out" to the view materialization.

Specific comment & spot in the spiking code where this is relevant: #7258 (comment)

Depending on the implementation we pursue for #7442, this could also be a relevant capability there.

The text was updated successfully, but these errors were encountered:

aranke · 2023-06-23T14:23:14Z

Also came up in #7881 (comment)

gshank · 2023-06-23T18:56:13Z

Materializations are a variant of Jinja macro (jinja extension) which is created in core/dbt/clients/jinja.py, in MaterializationExtension. They are saved in the manifest.macros dictionary with names such as: "macro.dbt.materialization_incremental_default"

In order to use adapter.dispatch and use its macro resolution code we would need to normalize the names. (Could we have synonyms?) (Note: we currently have two macro resolvers, MacroResolver and MacroNamespace.)

In the "def execute" method in dbt/task/run.py and dbt/task/clone.py we do manifest.get_materialization_macro_name. Instead we would have to execute adapter.dispatch to get the materialization macro, which would return a MacroGenerator object so it wouldn't need to be created. We don't currently call adapter.dispatch in Python, but the materialization macros are the "launcher" macros that get everything started so this might be an exception.

Concerns here are the inconsistency in naming conventions between materializations and "normal" macros and the changes that would need to be made in the adapter repos. It remains to be seen how much incompatibility there would be. Possibly we might be able to have a temporary mapping from old to new materialization macros. But it's also possible that this would be a fairly big breaking change.

jtcohen6 · 2023-06-26T17:03:36Z

Users define these as:

{% materialization table, adapter="snowflake" %}

Lowest-risk solution is add the materialization macro by both names:

"macro.dbt.materialization_table_snowflake"
"macro.dbt.snowflake__materialization_table"

The main thing this gets us is allowing someone to write this and have it work:

adapter.dispatch("materialization_table")

We should raise a deprecation warning if people are still calling a materialization directly by the old name. In order to do this, we need to update our own materialization lookup logic (get_materialization_macro_name), so that we're fetching it by the new (rather than old) name.

Potential acceptance criteria:

Add materialization macro to manifest by both names
Change materialization macro lookup logic
Add deprecation logic when old materialization macro name is called explicitly

@aranke @gshank to keep discussing

gshank · 2023-06-28T18:35:42Z

On thinking about this a bit more, I think it's probably a bad idea to have duplicated macros with different names. It would complicate partial parsing quite a bit. I'm now thinking that the best way to do this would be to add code in adapter.dispatch so that if we fail to find a materialization macro we use special resolution code which will mangle the name into the "standard" materialization name.

jtcohen6 · 2023-06-30T11:32:57Z

Having a way to call a materialization as a function (with reliable dispatch + ability to pass in arguments instead of patching the context) would also give us more options for implementing this Mesh UX improvement:

[CT-2468] [Feature] For versioned models, automatically create view/clone of latest version in unsuffixed database location #7442

gshank · 2023-06-30T18:02:58Z

In conversation with Kshitij just now, it seems that the main changes need to be in core/dbt/context/macros.py, in MacroNamespace.get_from_package where we can add special processing of "materialization" macros. The "get_from_package" method is called from adapter.dispatch, which comes from the adapter "contextproperty" of the ProviderContext in core/dbt/context/providers.py. The actual "dispatch" method is in the BaseDatabaseWrapper.

The materialization macro can be found in the "execute" method of the clone task (and probably the same method in runnable.py) by doing context.adapter.dispatch from the constructed context. This should already be a MacroGenerator callable, so we wouldn't need to construct a MacroGenerator as we do further down in "execute".

jtcohen6 · 2023-07-05T11:31:15Z

@gshank @aranke Thanks for continuing the conversation! I'm going to close out this spike, and open a new issue to implement what we've landed on above

jtcohen6 · 2023-07-05T11:33:30Z

Closing in favor of #8031

jtcohen6 mentioned this issue Jun 6, 2023

[CT-2386] [EPIC] Mature deployment: Retry & Slimmer CI #7301

Closed

1 task

github-actions bot changed the title ~~Materialization macros should support dispatch~~ [CT-2650] Materialization macros should support dispatch Jun 6, 2023

jtcohen6 assigned jtcohen6 and unassigned jtcohen6 Jun 6, 2023

jtcohen6 added the tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality label Jun 6, 2023

jtcohen6 mentioned this issue Jun 6, 2023

[CT-2348] [Feature] dbt clone command #7256

Closed

3 tasks

emmyoop added the spike label Jun 12, 2023

emmyoop changed the title ~~[CT-2650] Materialization macros should support dispatch~~ [SPIKE] [CT-2650] Materialization macros should support dispatch Jun 12, 2023

jtcohen6 mentioned this issue Jun 20, 2023

dbt clone #7881

Merged

6 tasks

leahwicz assigned aranke Jun 21, 2023

jtcohen6 assigned gshank Jun 21, 2023

jtcohen6 mentioned this issue Jul 5, 2023

[CT-2790] Support dispatch for materialization macros #8031

Open

jtcohen6 closed this as completed Jul 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

jtcohen6 commented Jun 6, 2023 •

edited

Loading

aranke commented Jun 23, 2023

gshank commented Jun 23, 2023

jtcohen6 commented Jun 26, 2023

gshank commented Jun 28, 2023

jtcohen6 commented Jun 30, 2023

gshank commented Jun 30, 2023

jtcohen6 commented Jul 5, 2023

jtcohen6 commented Jul 5, 2023

[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

Comments

jtcohen6 commented Jun 6, 2023 • edited Loading

aranke commented Jun 23, 2023

gshank commented Jun 23, 2023

jtcohen6 commented Jun 26, 2023

gshank commented Jun 28, 2023

jtcohen6 commented Jun 30, 2023

gshank commented Jun 30, 2023

jtcohen6 commented Jul 5, 2023

jtcohen6 commented Jul 5, 2023

jtcohen6 commented Jun 6, 2023 •

edited

Loading