dbt_clone macros, materialization and tests to dbt-spark #816

Merged: 8 commits, Jul 11, 2023

McKnight-42 (Contributor) commented Jun 27, 2023

resolves #815

Description

Adds the macros and custom materialization needed on the adapter side for dbt_clone, and attempts an adapter-zone version of the clone tests from defer_state in dbt-labs/dbt-core#7881.

Checklist

@aranke aranke self-requested a review June 29, 2023 19:52
McKnight-42 (Contributor Author) commented:

Testing update 6/29: test_dbt_clone.py passes locally but fails in CI/CD.

McKnight-42 (Contributor Author) commented:

@jtcohen6 Curious: I know we want the shallow clone. Do we also want to offer deep copy as an option? I noticed it is not mentioned in the updated comments on dbt-labs/dbt-core#7549, but it is still in the original draft PR dbt-labs/dbt-core#7258.

jtcohen6 (Contributor) commented Jul 5, 2023

@McKnight-42 IMO no need to support deep copy/clone as an option! As I understand it, this is the same as fully recreating the table (create table as select * from {other}), and loses a lot of the cost/speed/performance benefits of using clone.
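
To make the contrast concrete, here is a minimal sketch of the two statements being compared. The schema and table names (`dev_schema`, `prod_schema`, `my_model`) are hypothetical, and the second statement assumes a Delta Lake environment that supports `SHALLOW CLONE`; it is an illustration, not the SQL this PR emits.

```sql
-- Deep copy: equivalent to fully recreating the table.
-- Every row is read from the source and written to new data files.
create or replace table dev_schema.my_model as
select * from prod_schema.my_model;

-- Shallow clone: only table metadata is copied.
-- The new table points at the source table's existing data files,
-- which is where the cost and speed benefit comes from.
create or replace table dev_schema.my_model
shallow clone prod_schema.my_model;
```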

{%- if file_format != 'delta' -%}
{% set invalid_format_msg -%}
Invalid file format: {{ file_format }}
shallow clone requires file_format be set to 'delta'
Contributor

curious, why can't we do this with non-delta formats?

McKnight-42 (Contributor Author), Jul 5, 2023

This may be a misunderstanding on my part, honestly; I will double-check. I know our writeup and the Databricks docs specify Delta tables. I might have confused that with the delta format suggestion we use in some other places.

I know dbt-databricks uses Delta tables by default, but I don't believe dbt-spark fully does.
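
For context, the diff excerpt under review is cut off after the message text. A guard of this shape in a dbt materialization is typically closed with `endset` and a compiler error, roughly as sketched below. This is an assumption about the surrounding code, not a copy of the merged change; in particular, reading `file_format` via `config.get` is assumed here.

```sql
{#- Assumption: file_format is read from the model config. -#}
{%- set file_format = config.get('file_format') -%}

{%- if file_format != 'delta' -%}
  {% set invalid_format_msg -%}
    Invalid file format: {{ file_format }}
    shallow clone requires file_format be set to 'delta'
  {%- endset %}
  {#- Stop at compile time instead of sending unsupported SQL to the cluster. -#}
  {% do exceptions.raise_compiler_error(invalid_format_msg) %}
{%- endif -%}
```

With a check like this, a model configured with any other file format fails before a clone statement is issued.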

colin-rogers-dbt (Contributor) left a comment

One question but LGTM

@McKnight-42 McKnight-42 merged commit a6dc99f into main Jul 11, 2023
@McKnight-42 McKnight-42 deleted the feature/dbt-clone-sp branch July 11, 2023 20:35
Successfully merging this pull request may close these issues.

[ADAP-660] [Feature] dbt-spark Cloning from production
3 participants