Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CT-1922: Parsing and postgres implementations for model level constraints #7230

Merged
merged 14 commits into from
Apr 11, 2023

Conversation

peterallenwebb
Copy link
Contributor

@peterallenwebb peterallenwebb commented Mar 27, 2023

resolves #6754

Description

Adds code to parse and validate constraints on models. Implements default (and postgres) adapter support for model level constraint rendering.

Checklist

@cla-bot cla-bot bot added the cla:yes label Mar 27, 2023
@github-actions
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@peterallenwebb peterallenwebb changed the title CT-1922: Rough in functionality for parsing model level constraints CT-1922: Parsing and postgres implementations for model level constraints Mar 28, 2023
@peterallenwebb peterallenwebb marked this pull request as ready for review March 29, 2023 14:34
@peterallenwebb peterallenwebb requested review from a team as code owners March 29, 2023 14:34

@dataclass
class ModelLevelConstraint(ColumnLevelConstraint):
pass # REVIEW: The columns property above should be on this class instead
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reasons I am still working to understand, moving the columns property above to this class, where it would actually makes sense, results in the columns property being missing when the raw dictionary is passed to the adapter implementation of render_raw_model_constraint().

@gerda It seems like there is a mechanism at work which adds defaults values to and removes unexpected values from the raw dictionaries, but is confused about which type this one has. Can you help me understand what's going on here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"constraints" is not processed as a config, it's a node-level field. So you need to actually copy it at the top of the "patch" method on parsed node, where "description" and "columns" and "access" are copied. It's only on model nodes so you'd also want to check the node.resource_type. (At some point we may want to split out more finely grained "patch" methods...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find the answer, but I want to make sure I actually understand the question. It sounds like if columns is a key in raw_constraint, then BaseAdapter._parse_model_constraint(raw_constraint) does not have a columns attribute, but BaseAdapter._parse_column_constraint(raw_constraint) does have a columns attribute. Is that right?

Is raw_constraint["columns"] just a List[str]? Or is it more complicated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(At some point we may want to split out more finely grained "patch" methods...)

Thinking along the same lines for patching other new node-level fields that are only on model nodes in the model version parsing spike (version, is_latest_version).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turned out to be the result of a typo fixed in my last commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't count the time I've lost to typos. Turns out the computer is right again.


@dataclass
class ModelLevelConstraint(ColumnLevelConstraint):
pass # REVIEW: The columns property above should be on this class instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find the answer, but I want to make sure I actually understand the question. It sounds like if columns is a key in raw_constraint, then BaseAdapter._parse_model_constraint(raw_constraint) does not have a columns attribute, but BaseAdapter._parse_column_constraint(raw_constraint) does have a columns attribute. Is that right?

Is raw_constraint["columns"] just a List[str]? Or is it more complicated?

raise ParsingError(
f"Original File Path: ({original_file_path})\nConstraints must be defined in a `yml` schema configuration file like `schema.yml`.\nOnly the SQL table and view materializations are supported for constraints. \n`data_type` values must be defined for all columns and NOT be null or blank.{self.convert_errors_to_string(error_messages)}"

if isinstance(node, ModelNode) and node.config.contract is True:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code works, but right now we're doing node-level field setting in the patch method so ideally this should happen in that method if possible. I presume that access to "validate_constraints" is the issue? Can that be changed into a function that can be called in both places?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could pull that "patch" method into where this code is called from. It doesn't need to stay on the node object...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per huddle convo, I've added some comments for now and we'll plan to make this more consistent soon.


@available
@classmethod
def render_raw_model_constraints(cls, raw_constraints: Dict[str, Any]) -> str:
Copy link
Member

@emmyoop emmyoop Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the return type be a List here? Probably an Optional[List]?

Suggested change
def render_raw_model_constraints(cls, raw_constraints: Dict[str, Any]) -> str:
def render_raw_model_constraints(cls, raw_constraints: Dict[str, Any]) -> List:


@classmethod
def render_raw_model_constraint(cls, raw_constraint: Dict[str, Any]) -> Optional[str]:
constraint = cls._parse_model_constraint(raw_constraint)
Copy link
Contributor

@VersusFacit VersusFacit Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally, I do love the design here since we get to use an enumerated type instead of strings ✨

{%- for c in constraints -%}
{%- set constraint_str = adapter.render_raw_column_constraint(c) -%}
{%- if constraint_str -%}
{{ ' ' ~ constraint_str }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: WS

assert len(results) == 1
generated_sql = read_file("target", "run", "test", "models", "my_model.sql")
generated_sql_list = _normalize_whitespace(generated_sql).split(" ")
generated_sql_list = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is repeating the same variable name with different semantics and it just strikes me is something we might want to clean up

Copy link
Contributor

@VersusFacit VersusFacit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some syntax-related clean up nits

@peterallenwebb peterallenwebb merged commit 6ac5c90 into main Apr 11, 2023
@peterallenwebb peterallenwebb deleted the paw/ct-1922-model-level-constraints branch April 11, 2023 17:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-1922] Support 'constraints' as a model-level property
6 participants