[CT-2998] [SPIKE] Handle unit testing of JSON and ARRAY data types #8423
Comments
From refinement:
Example for testing constraints - https://github.com/dbt-labs/dbt-core/blob/main/tests/adapter/dbt/tests/adapter/constraints/test_constraints.py#L70-L73
Relevant to #8499
Example for why someone might want to unit test a model that has inputs with JSON data type:
@aranke to document edge cases (will update this ticket)
Reason this is higher priority:
This is in addition to not being able to use input data with complex data type (the initial reason we opened the issue)
Spike Report / Update
Current State
dbt-snowflake, dbt-bigquery, and dbt-spark support mocking inputs with complex types in unit testing, including json and array types. tests here:
dbt-postgres and dbt-redshift support json, but not arrays.
These implementations largely required very minor & precise changes to
However, dbt-postgres and dbt-redshift will require a different approach to support complex types. This is because the strategy for obtaining the column schemas in unit testing is adapter.get_columns_in_relation. adapter.get_columns_in_relation works flawlessly for the 3 adapters above, but is lossy for dbt-postgres and dbt-redshift. For example, array types are simply 'ARRAY' and don't include the type of array, which is necessary for safe casting to an appropriate type.
An alternative strategy we can use for these adapters is
So: we'll need to support both mechanisms of retrieving the column schema, as dictated by a specific adapter implementation, because one strategy will work for many adapters but not others, and vice versa.
Proposal
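To make the need for two schema-retrieval mechanisms concrete, here is a minimal Python sketch. It is not dbt's implementation: `Column`, `render_cast`, `is_lossy`, and `needs_alternative_schema_source` are hypothetical names introduced for illustration, and the type strings such as `integer[]` are assumed examples. The only fact taken from the report is that a bare 'ARRAY' type omits the element type needed for a safe cast.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Column:
    """Hypothetical stand-in for a column description returned by an adapter."""
    name: str
    data_type: str


def render_cast(value_sql: str, column: Column) -> str:
    """Render a SQL cast of a mocked literal to the column's reported type."""
    return f"cast({value_sql} as {column.data_type}) as {column.name}"


def is_lossy(column: Column) -> bool:
    """A bare 'ARRAY' type carries no element type, so a cast to it is not safe."""
    return column.data_type.strip().upper() == "ARRAY"


def needs_alternative_schema_source(columns: Iterable[Column]) -> bool:
    """Assumed decision rule: fall back to another mechanism if any type is lossy."""
    return any(is_lossy(column) for column in columns)


# A fully described array column yields a usable cast expression.
precise = Column(name="ids", data_type="integer[]")
print(render_cast("'{1,2}'", precise))

# A lossy column only says 'ARRAY', so the element type is unknown.
lossy = Column(name="ids", data_type="ARRAY")
print(needs_alternative_schema_source([lossy]))
```

The decision is made per set of columns here only to keep the sketch short; in practice the choice of mechanism would be dictated by the adapter implementation, as the report states.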
New Issues
Description
We should enable adding unit test mock inputs and outputs that contain fields of type JSON or type ARRAY.
Example using in-line yaml dict format:
Example using csv format:
Note: we can assume we already know the data types and columns of the inputs.
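As a rough illustration of the two formats, the Python sketch below shows how the same mock row with JSON and ARRAY fields might look as a dict and as csv text, and how known column types allow the csv strings to be parsed back into structured values. Everything here is an assumption made for illustration: the column names, the `COLUMN_TYPES` mapping, the `parse_csv_rows` helper, and the choice to encode arrays as JSON lists inside a csv cell are not taken from dbt.

```python
import csv
import io
import json

# Hypothetical known column types for the mocked input.
COLUMN_TYPES = {"id": "integer", "payload": "json", "tags": "array"}

# dict format: nested values can be written directly.
DICT_ROWS = [{"id": 1, "payload": {"plan": "pro"}, "tags": ["a", "b"]}]

# csv format: nested values must be quoted strings (inner quotes doubled).
CSV_ROWS = 'id,payload,tags\n1,"{""plan"": ""pro""}","[""a"", ""b""]"\n'


def parse_csv_rows(text, column_types):
    """Parse csv text into typed rows using the known column types."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for name, value in raw.items():
            kind = column_types[name]
            if kind == "integer":
                row[name] = int(value)
            elif kind in ("json", "array"):
                # Assumption: both JSON and array cells are JSON-encoded text.
                row[name] = json.loads(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


# Both formats describe the same logical row once types are applied.
print(parse_csv_rows(CSV_ROWS, COLUMN_TYPES) == DICT_ROWS)
```

The sketch relies on the note above: without knowing that `payload` is JSON and `tags` is an array, the csv cells would be indistinguishable from plain strings.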
Acceptance criteria
FOR SPIKE:
FOR IMPLEMENTATION:
formats, ideally all)
formats, ideally all)
Impact to other teams
Impact adapter teams
Will backports be required?
No
Context