Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only create schemas for selected nodes (#1239) #1258

Merged
merged 3 commits into from
Jan 28, 2019

Conversation

beckjake
Copy link
Contributor

Fixes #1239

This PR passes along the set of selected UIDs to the before_run method of the chosen Runner class, which is then used to filter out the nodes that dbt needs to build schemas for. We still unconditionally create the default schema, for snowflake.

Also I have no idea what was up with check_schema_exists, but it's fixed.

For tests, I just modified some of the existing graph selection tests, which seemed sufficient.

Jacob Beck added 2 commits January 23, 2019 16:35
 - pass selected uids along to the before_run method for schema creation
 - only schemas used by selected nodes are created
 - fix some big issues with check_schema_exists (such as: it does not work)
 - integration test changes to test this
@beckjake beckjake force-pushed the fix/dont-create-unused-schemas branch from 9e6ca80 to 9ff8705 Compare January 23, 2019 23:35
@beckjake beckjake force-pushed the fix/dont-create-unused-schemas branch from a4b77e4 to 7d332aa Compare January 24, 2019 15:14
@beckjake
Copy link
Contributor Author

Ignore that last force-push, it's not a change! Just reverting an accidental wrong-branch commit.

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments, but otherwise this looks great

@@ -241,4 +241,4 @@ def check_schema_exists(self, database, schema, model_name=None):
kwargs={'database': database, 'schema': schema},
connection_name=model_name
)
return results[0] > 0
return results[0][0] > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happened here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the existing check_schema_exists was wrong and did not work.

schemas = set()
for node in manifest.nodes.values():
if node.unique_id not in selected_uids:
continue
if cls.is_refable(node) and not cls.is_ephemeral(node):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this going to cause a problem with Sources? I imagine sources are refable (idk if you actually made that change in the corresponding PR), but we definitely wouldn't want to touch source schemas at all! Might be worth being more explicit here? I'd only care about schemas for:

  • models
  • seeds
  • archives (to come in Wilt Chamberlain)

No change required necessarily, but wanted to surface it because i saw it :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sources are not refable, so no.

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@beckjake beckjake merged commit c5f8cc7 into dev/stephen-girard Jan 28, 2019
@beckjake beckjake deleted the fix/dont-create-unused-schemas branch January 28, 2019 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants