Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the report schema and its models #122

Closed
MattTriano opened this issue Apr 18, 2023 · 1 comment · Fixed by #136
Closed

Remove the report schema and its models #122

MattTriano opened this issue Apr 18, 2023 · 1 comment · Fixed by #136

Comments

@MattTriano
Copy link
Owner

This adds some clutter to the repo and I'd rather just generate reports via superset dashboards (which can be periodically updated and emailed or sent to slack per these instructions).

@MattTriano
Copy link
Owner Author

This isn't a problem caused by the report schema, but the long names of resources in the report schema caused this issue to emerge. I included a heavily truncated subset of the error message below.

...
START sql table model report.chicago_residential_parcel_sales_per_quarter_and_cpd_beat  [RUN]
START sql table model report.chicago_residential_parcel_sales_per_quarter_and_cpd_district  [RUN]
ERROR creating sql table model report.chicago_residential_parcel_sales_per_quarter_and_cpd_district  [ERROR [0m in 7.44s]]
OK created sql table model report.chicago_residential_parcel_sales_per_quarter_and_cpd_beat  [SELECT [0m in 7.54s]]

Finished running 2 view models, 21 table models in 0 hours 18 minutes and 45.97 seconds (1125.97s).
Completed with 1 error and 0 warnings:
Internal Error
Cache inconsistency detected: in rename, new key _ReferenceKey(database='re_dwh', schema='report', identifier='chicago_residential_parcel_sales_per_quarter_and_cp__dbt_backup') already in cache: [_ReferenceKeyschema='report', identifier='chicago_residential_parcel_sales_per_quarter_and_cpd_district'),
_ReferenceKey(schema='report', identifier='chicago_residential_parcel_sales_per_quarter_and_cp__dbt_backup')]

Name collisions can occur when the length of two models' names approach your database's builtin limit. Try restructuring your project such that no two models share the prefix 'chicago_residential_parcel_sales_per_quarter_and_cp'. Then, clean your warehouse of any removed models.

Done. PASS=22 WARN=0 ERROR=1 SKIP=0 TOTAL=23

Apparently postgres's default max identifier length is 63 bytes, and it seems dbt creates backup tables with the suffix __dbt_backup (and this behavior may be vestigial), which reduces the length limit for identifiers down to 51 characters, as dbt will just truncate identifiers longer than 51 characters. This is the root of my issue, as two of my models have names with the same first 51 characters, so truncating to 51 and tacking on a constant caused a naming collision. This issue was astutely anticipated by this comment.

In any case, I'll just make a note to limit table_names to 50 characters.
Note to future me: if you had to look this up again, make an issue and take the time to programmatically enforce the name limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant