[Feature] Display the model-level constraints in the documentation generated by dbt #507

Closed
3 tasks done
salimmoulouel opened this issue Apr 24, 2024 · 10 comments

@salimmoulouel

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When I run dbt docs generate, it fails to include constraints defined at the model level in the documentation. For instance, if I establish a primary key at the model level, this information isn't reflected in the documentation. Instead, it only appears when I define the primary key at the column constraint level. However, because I'm utilizing dbt-bigquery, I'm restricted to declaring only one primary key field at the column level. This necessitates the declaration of primary keys at the model level, thereby creating a discrepancy in the documentation.

Describe alternatives you've considered

To enhance the documentation process in dbt for dbt-bigquery, consider implementing the ability to define multiple primary key constraints at the column level while also ensuring that primary key constraints established at the model level are accurately reported in the documentation. This improvement would streamline the documentation process and provide comprehensive information about the primary keys used in the data models.

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

@salimmoulouel salimmoulouel added the enhancement and triage labels Apr 24, 2024
@dbeatty10 dbeatty10 self-assigned this Apr 24, 2024
@dbeatty10
Contributor

Thanks for reaching out @salimmoulouel !

It sounds like you have two asks (please correct me if I'm misunderstanding!):

  1. Display the model-level constraints in the documentation generated by dbt
  2. Ability to define multiple primary key constraints at the column level

Assuming that complete documentation is your main ask here, I'm going to transfer this issue to dbt-docs for further consideration.

1. Display the model-level constraints in the documentation generated by dbt

You want the documentation website (dbt docs generate && dbt docs serve) to display a "PK" label for each of those columns, possibly like this?

image

2. Multiple primary key constraints at the column level

I don't remember off the top of my head, but I think there are technical reasons for us not supporting this. Feel free to open up a separate issue for this one if you'd like, but I'm guessing we'd close as "won't do".

Reprex

models/my_model.sql

select 1 as pk_1, 2 as pk_2

models/_models.yml

models:
  - name: my_model
    
    config:
      materialized: table

      contract:
        enforced: true
    
    # model-level constraints
    constraints:
      - type: primary_key
        columns: [pk_1, pk_2]

    # column-level constraints
    columns:
      - name: pk_1
        data_type: int
        constraints:
          - type: not_null
      - name: pk_2
        data_type: int
        constraints:
          - type: not_null

Build and launch the dbt project docs:

dbt docs generate && dbt docs serve

@dbeatty10 dbeatty10 transferred this issue from dbt-labs/dbt-core Apr 24, 2024
@dbeatty10 dbeatty10 removed their assignment Apr 24, 2024
@dbeatty10 dbeatty10 changed the title from "[Feature] Display the model-level constraints in the documentation generated by dbt, also put multiple primary key constraint for dbt-bigquery" to "[Feature] Display the model-level constraints in the documentation generated by dbt" Apr 24, 2024
@salimmoulouel
Author

For the initial request:
As of now, a primary key specified at the column level is displayed in the docs. However, if it is declared as a model-level constraint, it doesn't appear at either level. I wouldn't oppose consistently displaying it at the column level. What's essential to me is that the primary key is visible, especially with multiple primary key fields in BigQuery, which currently doesn't work when they are declared as column-level constraints.

@dbeatty10
Contributor

Currently, the docs have five main sections:

  1. Details
  2. Description
  3. Columns (which contains column-level constraints)
  4. Depends on
  5. Code

image

And none of those sections includes model-level constraints like these:

image

Does that sound right? If so, would adding a new section for model-level constraints solve this for you?

@dbeatty10
Contributor

Here are the files and commands I'm using to see how this currently behaves:

Reprex

models/dual.sql

{{ config(materialized="ephemeral") }}

select 'X' as dummy

models/my_other_model.sql

select 
    3 as other_pk_1,
    4 as other_pk_2

from {{ ref("dual") }}

models/my_model.sql

select 
    1 as pk_1,
    2 as pk_2,
    3 as fk_1,
    4 as fk_2,
    5 as check_1,
    6 as check_2

from {{ ref("dual") }}

models/_models.yaml

models:
  - name: my_other_model
    config:
      materialized: table
      contract:
        enforced: true
    
    # model-level
    constraints:
      - type: primary_key
        columns: [other_pk_1, other_pk_2]
      - type: unique
        columns: [other_pk_1, other_pk_2]

    # column-level
    columns:
      - name: other_pk_1
        data_type: int
      - name: other_pk_2
        data_type: int

  - name: my_model

    config:
      materialized: table
      contract:
        enforced: true
    
    # model-level
    constraints:
      - type: primary_key
        columns: [pk_1, pk_2]
      - type: unique
        columns: [pk_1, pk_2]
      - type: foreign_key
        columns: [fk_1, fk_2]
        expression: "YOUR_SCHEMA_HERE.my_other_model (other_pk_1, other_pk_2)"
      - type: check
        columns: [check_1, check_2]
        expression: "check_1 != check_2"
        name: human_friendly_name

    # column-level
    columns:
      - name: pk_1
        data_type: int
      - name: pk_2
        data_type: int
      - name: fk_1
        data_type: int
      - name: fk_2
        data_type: int
      - name: check_1
        data_type: int
      - name: check_2
        data_type: int

dbt build --full-refresh
dbt docs generate && dbt docs serve
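
As a rough cross-check of the reprex, the sketch below lists the model-level constraints recorded in the generated manifest, i.e. the information a new docs section would need to display. It assumes that model nodes in `target/manifest.json` carry a `constraints` list whose entries have `type` and `columns` keys, and that the artifact sits at the default path; the function name `model_level_constraints` is made up for illustration.

```python
import json
from pathlib import Path


def model_level_constraints(manifest: dict) -> dict:
    """Map each model name to its model-level constraints as (type, columns) pairs."""
    found = {}
    for node in manifest.get("nodes", {}).values():
        # Skip tests, seeds, snapshots, etc.; only models are of interest here.
        if node.get("resource_type") != "model":
            continue
        # Assumption: model-level constraints are serialized under "constraints".
        constraints = node.get("constraints") or []
        if constraints:
            found[node["name"]] = [
                (c.get("type"), list(c.get("columns") or [])) for c in constraints
            ]
    return found


if __name__ == "__main__":
    # Assumed default artifact location; adjust if your target path differs.
    manifest_path = Path("target/manifest.json")
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
        for model, constraints in model_level_constraints(manifest).items():
            for constraint_type, columns in constraints:
                print(f"{model}: {constraint_type} on {', '.join(columns)}")
```

If the constraints show up here but not on the docs site, that would suggest the gap is in how the site renders the artifact rather than in what dbt records.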

@salimmoulouel
Author

Sorry for the late answer. Yes, I think that would be perfect. Thank you.

@salimmoulouel
Author

Do you believe this is a challenge we can resolve in the coming days? This matter pertains to a proof of concept (POC) that, if successful, will guide several teams within our company toward adopting dbt.

@dbeatty10
Contributor

Summary

We'd be open to adding model-level constraints to the generated documentation website.

I've labeled this as refinement for us to determine how exactly the user experience would look for this and to determine acceptance criteria.

Timeline

This isn't a time-sensitive priority for us. So no, this is not something that would be available on the timeline you mentioned.

Workaround

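However, we do have other capabilities that would allow you to add your own free-form content to the documentation website:

* [Adding model descriptions to your project](https://docs.getdbt.com/docs/collaborate/documentation#adding-descriptions-to-your-project)

* [Using `docs` blocks](https://docs.getdbt.com/docs/collaborate/documentation#using-docs-blocks)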

The description field can contain Markdown which provides some flexibility for how you want to format things.

Example

models:
  - name: my_other_model
    config:
      materialized: table
      contract:
        enforced: true
    
    description: >
      ### Model-level constraints

      - primary_key: [`other_pk_1`, `other_pk_2`]

      - unique: [`other_pk_1`, `other_pk_2`]

    # model-level
    constraints:
      - type: primary_key
        columns: [other_pk_1, other_pk_2]
      - type: unique
        columns: [other_pk_1, other_pk_2]

dbt docs generate && dbt docs serve

Screenshot

image
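
To avoid typing the Markdown by hand, the section could be generated from the same constraint entries. This is only a sketch: it takes the `constraints` list as already-parsed Python data (for example, loaded from the YAML file with a parser of your choice), and `constraints_markdown` is a hypothetical helper, not something dbt provides.

```python
def constraints_markdown(constraints: list) -> str:
    """Render model-level constraints as a Markdown section for a description."""
    lines = ["### Model-level constraints", ""]
    for constraint in constraints:
        columns = ", ".join(f"`{name}`" for name in constraint.get("columns", []))
        lines.append(f"- {constraint['type']}: [{columns}]")
        # Blank line between items, matching the hand-written example above.
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Same constraints as my_other_model in the example above.
    example = [
        {"type": "primary_key", "columns": ["other_pk_1", "other_pk_2"]},
        {"type": "unique", "columns": ["other_pk_1", "other_pk_2"]},
    ]
    print(constraints_markdown(example))
```

The output would still need to be pasted into the `description` field or a `docs` block, so this narrows the manual step rather than removing it.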

Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 28, 2024
Contributor

github-actions bot commented Nov 5, 2024

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Nov 5, 2024
@Cquad

Cquad commented Nov 6, 2024

Workaround

However, we do have other capabilities that would allow you to add your own free-form content to the documentation website.

* [Adding model descriptions to your project](https://docs.getdbt.com/docs/collaborate/documentation#adding-descriptions-to-your-project)

* [Using `docs` blocks](https://docs.getdbt.com/docs/collaborate/documentation#using-docs-blocks)

The description field can contain Markdown which provides some flexibility for how you want to format things.

Example

models:
  - name: my_other_model
    config:
      materialized: table
      contract:
        enforced: true
    
    description: >
      ### Model-level constraints

      - primary_key: [`other_pk_1`, `other_pk_2`]

      - unique: [`other_pk_1`, `other_pk_2`]

    # model-level
    constraints:
      - type: primary_key
        columns: [other_pk_1, other_pk_2]
      - type: unique
        columns: [other_pk_1, other_pk_2]

dbt docs generate && dbt docs serve

Showing primary keys and foreign keys in dbt docs is something a lot of users need.
Duplicating this information in the model description is a workaround, but it increases the risk of inconsistency between the model contract and the documentation.
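
Until the docs site shows these constraints itself, one way to limit that drift is a small check (in CI, for instance) that flags constrained columns the description never mentions. The sketch below is an assumption-laden illustration: it works on a model entry that has already been parsed into a Python dict, `undocumented_constraint_columns` is a hypothetical name, and the check is deliberately crude, since it only looks for the backticked column name anywhere in the description.

```python
def undocumented_constraint_columns(model: dict) -> list:
    """Return (constraint_type, column) pairs missing from the model description."""
    description = model.get("description") or ""
    missing = []
    for constraint in model.get("constraints") or []:
        for column in constraint.get("columns") or []:
            # Crude heuristic: the column counts as documented if its
            # backticked name appears anywhere in the description.
            if f"`{column}`" not in description:
                missing.append((constraint.get("type"), column))
    return missing


if __name__ == "__main__":
    # Hypothetical entry where the description has fallen behind the contract.
    model = {
        "name": "my_other_model",
        "description": "### Model-level constraints\n\n- primary_key: [`other_pk_1`]\n",
        "constraints": [
            {"type": "primary_key", "columns": ["other_pk_1", "other_pk_2"]},
        ],
    }
    for constraint_type, column in undocumented_constraint_columns(model):
        print(f"{model['name']}: {constraint_type} column {column} is not documented")
```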

@github-actions github-actions bot added the triage label Nov 6, 2024