Type aliasing for model contract column data_type #8592

Merged
gshank merged 19 commits into main from 8007-contract_type_aliasing on Oct 10, 2023

Conversation

gshank
Contributor

@gshank gshank commented Sep 7, 2023

resolves #8007

Problem

Currently, contract data_types must use the exact type name for the target warehouse; the built-in type aliasing is not applied to them.

Solution

For model dictionaries provided by the provider context, convert the column data_type using the "translate_type" method.
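
A minimal sketch of the conversion described above, not the code in this PR. `StandInColumn`, `translate_contract_types`, and the shape of the `model` dictionary are assumptions made for illustration; the alias table lists only the two mappings mentioned in this conversation.

```python
from typing import Any, Dict


class StandInColumn:
    """Hypothetical stand-in for an adapter's Column class (not dbt's actual code)."""

    # Only the two aliases mentioned in this thread are listed here.
    TYPE_LABELS: Dict[str, str] = {"STRING": "TEXT", "INTEGER": "INT"}

    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # Return the mapped label, or the input unchanged if no mapping exists.
        return cls.TYPE_LABELS.get(dtype.upper(), dtype)


def translate_contract_types(model: Dict[str, Any], column_cls=StandInColumn) -> Dict[str, Any]:
    """Translate each column's data_type in place and return the model dict."""
    for column in model.get("columns", {}).values():
        if "data_type" in column:
            column["data_type"] = column_cls.translate_type(column["data_type"])
    return model


# Assumed model shape: a "columns" mapping of column name to column dict.
model = {
    "name": "my_model",
    "columns": {
        "id": {"name": "id", "data_type": "integer"},
        "label": {"name": "label", "data_type": "string"},
        "created_at": {"name": "created_at", "data_type": "timestamptz"},
    },
}
translated = translate_contract_types(model)
print({name: col["data_type"] for name, col in translated["columns"].items()})
```

In this sketch, types without an alias (here `timestamptz`) pass through untouched, so warehouse-specific type names keep working alongside the aliased ones.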

Other teams

Affects adapter constraint tests; only BigQuery is impacted: dbt-labs/dbt-bigquery#954

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@gshank gshank requested review from a team as code owners September 7, 2023 20:30
@gshank gshank requested review from chrismorrisette-dbt and aranke and removed request for a team September 7, 2023 20:30
@cla-bot cla-bot bot added the cla:yes label Sep 7, 2023
@codecov

codecov bot commented Sep 7, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (549dbf3) 86.43% compared to head (4084fc4) 86.43%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8592   +/-   ##
=======================================
  Coverage   86.43%   86.43%           
=======================================
  Files         176      176           
  Lines       26009    26019   +10     
=======================================
+ Hits        22480    22490   +10     
  Misses       3529     3529           
Flag         Coverage Δ
integration  83.23% <100.00%> (+<0.01%) ⬆️
unit         64.86% <66.66%> (-0.01%) ⬇️


Files                                      Coverage Δ
core/dbt/adapters/base/column.py           69.23% <ø> (ø)
core/dbt/context/providers.py              89.04% <100.00%> (+0.09%) ⬆️
core/dbt/contracts/graph/model_config.py   92.22% <100.00%> (+0.02%) ⬆️
core/dbt/contracts/graph/nodes.py          95.09% <100.00%> (+<0.01%) ⬆️
core/dbt/parser/base.py                    93.63% <100.00%> (+0.05%) ⬆️


Member

@aranke aranke left a comment

Is there a net new check we can add to ensure that some of the most complicated types are parsed correctly?

Otherwise, LGTM.

@gshank gshank marked this pull request as draft September 8, 2023 16:14
@gshank gshank added the user docs [docs.getdbt.com] Needs better documentation label Sep 8, 2023
@gshank
Contributor Author

gshank commented Sep 8, 2023

This causes test breakage in the adapter repos (mostly in test_constraints.py). At first glance, the main cause is SQL in which statements such as "id integer not null" are replaced by "id int not null", because the TYPE_LABELS in the base Column class change INTEGER to INT (and STRING to TEXT).

@jtcohen6 @graciegoheen Just want to verify that this is something we actually want to have happen.

@gshank
Contributor Author

gshank commented Sep 8, 2023

Note: currently it looks like string => text and integer => int are the only data types changed.
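
To illustrate why an exact-match SQL assertion would break, here is a small sketch under stated assumptions. `TYPE_LABELS`, `translate_type`, and `render_column` below are hypothetical helpers written for this example, not the adapter test code; the case-insensitive lookup and the lowercasing of the rendered SQL are assumptions.

```python
# Hypothetical alias table limited to the two changes noted above.
TYPE_LABELS = {"STRING": "TEXT", "INTEGER": "INT"}


def translate_type(dtype: str) -> str:
    # Assumed behavior: case-insensitive lookup, input returned unchanged on a miss.
    return TYPE_LABELS.get(dtype.upper(), dtype)


def render_column(name: str, data_type: str, constraint: str, alias: bool) -> str:
    """Render one column definition, optionally applying type aliasing."""
    rendered_type = translate_type(data_type) if alias else data_type
    # Lowercase the result so the comparison does not depend on label casing.
    return f"{name} {rendered_type} {constraint}".lower()


before = render_column("id", "integer", "not null", alias=False)
after = render_column("id", "integer", "not null", alias=True)
print(before)  # id integer not null
print(after)   # id int not null
```

A test that expects the `before` string verbatim fails once aliasing is applied, even though both definitions describe the same column.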

@gshank gshank marked this pull request as ready for review September 14, 2023 16:08
@gshank gshank requested a review from a team as a code owner September 14, 2023 16:08
Member

@aranke aranke left a comment

LGTM, can we add a test in case an adapter hasn't implemented the translate_type method?

if "data_type" in column:
    orig_data_type = column["data_type"]
    # translate data_type to value in Column.TYPE_LABELS
    new_data_type = self.adapter.Column.translate_type(orig_data_type)
Member

What happens if the translate_type function fails? Can we do this instead?

column["data_type"] = self.adapter.Column.translate_type(orig_data_type) or column["data_type"]

Also, can we add a test for this scenario?

Contributor Author

The base "translate_type" function returns the passed-in type if no translation exists. If the adapter hasn't implemented a different translate_type function, it uses the base one.
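
The fallback behavior described here can be sketched as follows. This is an illustration under assumptions, not dbt's source: `BaseColumn`, `PlainAdapterColumn`, and `CustomAdapterColumn` are hypothetical classes, and the alias values in `CustomAdapterColumn` are made up for the example.

```python
from typing import ClassVar, Dict


class BaseColumn:
    """Hypothetical stand-in for the base Column class described above."""

    TYPE_LABELS: ClassVar[Dict[str, str]] = {"STRING": "TEXT", "INTEGER": "INT"}

    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # Fall back to the passed-in type when no translation exists.
        return cls.TYPE_LABELS.get(dtype.upper(), dtype)


class PlainAdapterColumn(BaseColumn):
    """An adapter column that does not override translate_type."""


class CustomAdapterColumn(BaseColumn):
    """An adapter column with its own (hypothetical) alias table."""

    TYPE_LABELS: ClassVar[Dict[str, str]] = {"STRING": "STRING", "INTEGER": "INT64"}


print(PlainAdapterColumn.translate_type("integer"))   # INT
print(PlainAdapterColumn.translate_type("geometry"))  # geometry
print(CustomAdapterColumn.translate_type("integer"))  # INT64
```

Because the method in this sketch always returns a non-empty string for a non-empty input, an `or column["data_type"]` guard at the call site would never take effect.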

Contributor

@colin-rogers-dbt colin-rogers-dbt left a comment

suggested change to changelog, but otherwise LGTM

gshank and others added 2 commits October 10, 2023 09:29
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
@gshank gshank merged commit 4391dc1 into main Oct 10, 2023
50 checks passed
@gshank gshank deleted the 8007-contract_type_aliasing branch October 10, 2023 15:43
@FishtownBuildBot
Collaborator

Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#4241

Labels
cla:yes user docs [docs.getdbt.com] Needs better documentation
Development

Successfully merging this pull request may close these issues.

[CT-2774] Optional type aliasing for model contract data_type
4 participants