Preserve the case of schemas and databases when listing relations (#2403) #2411

beckjake · 2020-05-06T19:41:10Z

resolves #2403

Description

There's hopefully nothing too wild or controversial in here - this is a substantial change, but it should fix some correctness issues in how dbt lists schemas. I think in the long run this will be more pleasant for adapter authors: instead of having to figure out how to render the database/schema in each of those places, they can just defer it to the relations.

Change list_relations_without_caching (both the macro and the method) to take a single argument: a Relation with no identifier, whose include policy is set to not include the identifier. Macros can use this to get a database-specific representation of the relation. The logic around creating missing schemas was updated similarly - dbt will faithfully issue "create schema" statements for whatever you give it, rather than coercing everything to lowercase and wrapping it in the given quoting.

In the process and for the same reasons, I updated drop_schema and create_schema to take the same kind of argument.
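To make the new argument shape concrete, here is a minimal sketch of the idea, not dbt's actual code. `SchemaRelation`, `IncludePolicy`, `render`, and `create_schema_sql` are hypothetical stand-ins invented for illustration; the real Relation class and macros differ. The point is that the relation carries its own case and quoting, so the caller never lowercases or re-quotes anything.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class IncludePolicy:
    """Hypothetical: which parts of the relation are rendered."""
    database: bool = True
    schema: bool = True
    identifier: bool = True


@dataclass(frozen=True)
class SchemaRelation:
    """Hypothetical stand-in for a Relation with no identifier."""
    database: Optional[str]
    schema: str
    identifier: Optional[str] = None
    # The identifier is excluded from rendering, as the description says.
    include: IncludePolicy = field(
        default_factory=lambda: IncludePolicy(identifier=False)
    )
    quote_char: str = '"'

    def render(self) -> str:
        # Render only the included, non-empty parts, preserving case.
        parts = []
        for name in ("database", "schema", "identifier"):
            value = getattr(self, name)
            if getattr(self.include, name) and value is not None:
                parts.append(f"{self.quote_char}{value}{self.quote_char}")
        return ".".join(parts)


def create_schema_sql(relation: SchemaRelation) -> str:
    # Issue the statement for exactly what was given: no lowercasing.
    return f"create schema if not exists {relation.render()}"


rel = SchemaRelation(database="Analytics", schema="MySchema")
print(create_schema_sql(rel))
```

Because the rendering lives on the relation, an adapter that needs different quoting or a different hierarchy would only change the relation's policy, not every macro that takes it.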

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

The argument is a Relation object with no identifier field, configured with the appropriate quoting information
Unique quoted/unquoted representations will be treated as distinct
The logic for generating what schemas to search for relations is now distinct from the catalog search logic
Schema creation/dropping takes a similar relation argument
Add tests

drewbanin

One comment here around community plugins (e.g. Spark), but otherwise this LGTM. Let me know what you think about that comment and if there's anything we should do differently here. If not, this is approved on my end.

drewbanin · 2020-05-08T00:41:32Z

core/dbt/adapters/base/impl.py

-        self.cache.update_schemas(schema_map.schemas_searched())
+        cache_update: Set[Tuple[str, Optional[str]]] = set()
+        for relation in cache_schemas:
+            if relation.database is None:


Just wanted to mention here that some plugins (namely Spark) use database as an alias for schema, and only have the one level of hierarchy. I think it's ok to add this logic here -- we can override it in a plugin if we need to -- but the idea that all relations will have a database and schema is not quite as invariant from dbt Core's perspective as this code implies.

Spark will already need to be updated, as it overrides _get_cache_schemas and list_relations_without_caching.

I think this change will actually make it easier for Spark to handle it: it sets database = schema, but the include policy for relations is set to database=False. There are probably a couple of rough edges to clear up, but my goal is that Spark's default include policy + behavior will play pretty nicely with this and we can delete the create/drop schema implementations, at least.
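As a rough illustration of the include-policy point above (a sketch with hypothetical names, not code from dbt or the Spark plugin): when database is an alias for schema and the policy excludes the database, the rendered schema collapses to a single level with no special-casing in the caller.

```python
from typing import NamedTuple, Optional


class Include(NamedTuple):
    """Hypothetical include policy for the schema-level parts."""
    database: bool
    schema: bool


def render_schema(database: Optional[str], schema: str, include: Include) -> str:
    # Keep only the parts the policy includes; case is left untouched.
    parts = []
    if include.database and database is not None:
        parts.append(database)
    if include.schema:
        parts.append(schema)
    return ".".join(parts)


DEFAULT = Include(database=True, schema=True)
SPARK_LIKE = Include(database=False, schema=True)  # assumed single-level policy

print(render_schema("warehouse", "Sales", DEFAULT))
print(render_schema("Sales", "Sales", SPARK_LIKE))
```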

core/dbt/task/runnable.py

Co-authored-by: Drew Banin <drew@fishtownanalytics.com>

cla-bot bot added the cla:yes label May 6, 2020

beckjake force-pushed the fix/always-preserve-case branch from efc3ddc to 12e5567 Compare May 7, 2020 19:18

beckjake changed the title ~~[WIP] Always preserve the case of values in the SchemaSearchMap (#2403)~~ Preserve the case of schemas and databases when listing relations (#2403) May 7, 2020

beckjake force-pushed the fix/always-preserve-case branch from 12e5567 to 7c99b76 Compare May 7, 2020 19:40

beckjake marked this pull request as ready for review May 7, 2020 19:42

beckjake force-pushed the fix/always-preserve-case branch from 7c99b76 to bd83126 Compare May 7, 2020 19:53

beckjake force-pushed the fix/always-preserve-case branch from bd83126 to e392212 Compare May 7, 2020 20:14

beckjake requested a review from drewbanin May 7, 2020 20:50

drewbanin reviewed May 8, 2020


Update core/dbt/task/runnable.py

ab8392b

Co-authored-by: Drew Banin <drew@fishtownanalytics.com>

drewbanin approved these changes May 8, 2020


beckjake merged commit c9eec4f into dev/octavius-catto May 8, 2020

beckjake deleted the fix/always-preserve-case branch May 8, 2020 14:15

jtcohen6 mentioned this pull request Mar 18, 2022

[CT-254] dbt-core - Snapshot - macro dbt_macro__create_schema takes not more than 1 argument(s) #4742

Closed
