Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store relation name in manifest's node and source objects #2837

Conversation

franloza
Copy link
Contributor

@franloza franloza commented Oct 16, 2020

Resolves #2647

Description

As @beckjake mentioned in #2641 , the relation name can be obtained from the adapter at compile-time. It was only necessary to add relation as an optional member of CompiledNode class.

In the tests, I added relation_name as the list of required keys for a compile node and values for relation_name in a couple of test cases (Despite the fact that it was not necessary for the test, I wanted to show this new field explicitly).

In development, the manifest.json included the field relation_name in every node object. Example:

"model.sample_project.base_customers": {
      "raw_sql": "....",
      "compiled": true,
      ....
      "relation_name": "\"postgres\".\"public\".\"base_customers\""
    },

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Oct 16, 2020
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up @franloza! Always cool when a community member preempts us in sprint work :)

It looks like there are some failing tests in 029_docs_generate_tests. I think you'll want to add relation_name as an expected node value here. Some of the other discrepancies in those test failures may have to do with other artifact changes that we've merged into dev/kiyoshia-kuromiya.

@franloza
Copy link
Contributor Author

It looks like there are some failing tests in 029_docs_generate_tests. I think you'll want to add relation_name as an expected node value here. Some of the other discrepancies in those test failures may have to do with other artifact changes that we've merged into dev/kiyoshia-kuromiya.

Yes! You're right, I forgot to add this new value in the doc generation tests. I'm fixing these tests and discrepancies with dev/kiyoshia-kuromiya.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on getting all tests passing, especially the quoting, which is the real end goal of this effort.

I'm not quite sold on having relation_name defined for all node objects. In particular, it seems inconsistent for tests and for ephemeral models, which currently have null for database, schema, and alias. Is there a way we could null out the relation_name for those resources?

compiled_database = self.default_database
if self.adapter_type != 'snowflake':
compiled_database = self._quote(compiled_database)
quote_database = self.adapter_type != 'snowflake'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice touch here

@franloza
Copy link
Contributor Author

Hey @jtcohen6, thank you for your review!

I'm not quite sold on having relation_name defined for all node objects.

I agree with you about this. As you say, it is misleading for ephemeral models and tests. At this point, I think there might be multiple ways of nulling the relation name for these types. Let me come up with one implementation and let's discuss it. Any feedback or hint about the implementation is welcome!

@franloza
Copy link
Contributor Author

I have considered the following criteria in my implementation to set a value for the relation_name field: The resource type is a non-ephemeral model, a seed or a snapshot

        if (node.resource_type in NodeType.refable() and
                node.config.materialized != "ephemeral"):
            adapter = get_adapter(self.config)
            relation_cls = adapter.Relation
            relation_name = str(relation_cls.create_from(self.config, node))

In addition, I missed a snapshot in the docs generation tests to verify that the relation name was the expected one. This is why I have decided to include a snapshot example in the tests. @jtcohen6 what do you think about this approach?

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Tiny comment on the implementation.

Thanks for all the work around tests, and I'm really happy you added a snapshot example. Just wondering about disabling for BQ + Redshift—what's the common incompatibility?

core/dbt/compilation.py Outdated Show resolved Hide resolved
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
@franloza
Copy link
Contributor Author

franloza commented Oct 28, 2020

Thanks again for your review!

Just wondering about disabling for BQ + Redshift—what's the common incompatibility?

There is no incompatibility, we are testing documentation for snapshots in test__bigquery__run_and_generate and test__redshift__run_and_generate. I did not want to include a snapshot in test__bigquery__complex_models and test__redshift__incremental_view to not make the PR bigger, but I think is good to include them also for these two methods for consistency. Let me include the snapshot test for these two methods.

@franloza franloza changed the title Store relation name in manifest's node object Store relation name in manifest's node and source objects Oct 28, 2020
@franloza
Copy link
Contributor Author

Hey @jtcohen6, finally I included the resource_name field for the sources node in the manifest. I checked locally and it seems to work:

  "sources": {
    "source.sample_project.base.customers": {
      "fqn": [ "sample_project", "base", "base", "customers" ],
      "database": "postgres",
      "schema": "public",
      "unique_id": "source.sample_project.base.customers",
      ...
      "patch_path": null,
      "unrendered_config": {},
      "relation_name": "\"postgres\".\"public\".\"customers\""
    }

I also modified the test to accept this new field. I did not modify the condition if node.resource_type in NodeType.refable() because I needed to add the resource_name in a different module (core/dbt/parser/schemas.py) to appear in the sources object.

Finally, I added the snapshot to the tests test__bigquery__complex_models and test__redshift__incremental_view. What do you think about this implementation?

@jtcohen6
Copy link
Contributor

@franloza This is looking great! Thanks for all the polish at the end. I'm out for a few days, so I'll take a more detailed look when I'm back.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @franloza! Thank you so much for the contribution

Comment on lines +177 to +178
if (node.resource_type in NodeType.refable() and
not node.is_ephemeral_model):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jtcohen6 jtcohen6 merged commit 02e5a96 into dbt-labs:dev/kiyoshi-kuromiya Nov 9, 2020
@franloza
Copy link
Contributor Author

franloza commented Nov 9, 2020

Great work @franloza! Thank you so much for the contribution

Thanks for your review! Glad to contribute 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Store resolved node names in manifest
2 participants