Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable inline csv format in unit testing #8743

Merged
merged 11 commits into from
Oct 5, 2023

Conversation

gshank
Copy link
Contributor

@gshank gshank commented Sep 28, 2023

resolves #8626

Problem

Users want to be able to use the csv format in addition to dictionaries in their unit tests.

Solution

Add a "format" field to the given and expect structure, create an OutputFixture class (in addition to InputFixture).

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@gshank gshank requested review from a team as code owners September 28, 2023 20:30
@gshank gshank requested review from thisispvb, McKnight-42 and aranke and removed request for a team September 28, 2023 20:30
@cla-bot cla-bot bot added the cla:yes label Sep 28, 2023
@codecov
Copy link

codecov bot commented Sep 28, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (bb6fd30) 86.66% compared to head (ada13fb) 86.69%.
Report is 4 commits behind head on unit_testing_feature_branch.

Additional details and impacted files
@@                       Coverage Diff                       @@
##           unit_testing_feature_branch    #8743      +/-   ##
===============================================================
+ Coverage                        86.66%   86.69%   +0.02%     
===============================================================
  Files                              178      178              
  Lines                            26234    26276      +42     
===============================================================
+ Hits                             22737    22779      +42     
  Misses                            3497     3497              
Flag Coverage Δ
integration 83.51% <95.45%> (+0.02%) ⬆️
unit 64.98% <65.90%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
core/dbt/contracts/graph/model_config.py 92.34% <100.00%> (+0.08%) ⬆️
core/dbt/contracts/graph/nodes.py 95.21% <100.00%> (+<0.01%) ⬆️
core/dbt/parser/unit_tests.py 95.09% <100.00%> (+2.16%) ⬆️
core/dbt/contracts/graph/unparsed.py 93.76% <94.44%> (-0.01%) ⬇️

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gshank gshank changed the base branch from main to unit_testing_feature_branch September 28, 2023 20:39
core/dbt/contracts/graph/unparsed.py Outdated Show resolved Hide resolved
core/dbt/parser/unit_tests.py Outdated Show resolved Hide resolved
@gshank gshank requested a review from MichelleArk October 2, 2023 19:48
Comment on lines 64 to 74
if test_case.expect.format == "dict":
if isinstance(test_case.expect.rows, List):
expected_rows = test_case.expect.rows
else:
raise ParsingError("Wrong format for expected rows")
else: # test_case.expect.format == "csv":
# build a dictionary from the csv string
if isinstance(test_case.expect.rows, str):
expected_rows = self._build_rows_from_csv(test_case.expect.rows)
else:
raise ParsingError("Wrong format for expected rows")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm brainstorming ways that we could consolidate this logic and share it across InputFixture and OutputFixture. The simplest thing I can think of is adding a parsed_rows property on a shared UnitTestFixture base class that does this conditional logic to parse the raw rows given a format and convert it into a consistent internal representation. What do you think?

Copy link
Contributor

@MichelleArk MichelleArk Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative could be to have a UnitTestFixture class that can be created from either an InputFixture or OutputFixture (making these just the external-facing interfaces). That way we could consolidate the logic of parsing the rows based on the format, into a consistent internal fixture object and store the parsed rows as state on that object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought it would be a good idea to have some kind of combined logic but when I wrote this my brain was kind of offline :). Since we know we're going to also support csv files soon, it should be a solution that makes that easy. Let me think about whether a base class or derived class makes more sense...

Comment on lines +747 to +753
@property
def format(self) -> UnitTestFormat:
return UnitTestFormat.Dict

@property
def rows(self) -> Union[str, List[Dict[str, Any]]]:
return []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary for these to be properties? Could they instead be attributes that are inherited by UnitTestInputFixture and UnitTestOutputFixture?

e.g.

rows: Union[str, List[Dict[str, Any]]] = ""
format: UnitTestFormat = UnitTestFormat.Dict

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that then we run into the frustrating issue of fields without defaults can't come after fields with defaults issue and have to split them out into a special class and do a different order. I'd kind of rather not.

@gshank gshank requested a review from MichelleArk October 5, 2023 00:27
Copy link
Contributor

@MichelleArk MichelleArk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some small suggestions, otherwise LGTM!

@gshank gshank merged commit 3b6f9bd into unit_testing_feature_branch Oct 5, 2023
48 checks passed
@gshank gshank deleted the 8626-ut_format_inline_csv branch October 5, 2023 15:17
gshank added a commit that referenced this pull request Jan 16, 2024
* Initial implementation of unit testing (from pr #2911)

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>

* 8295 unit testing artifacts (#8477)

* unit test config: tags & meta (#8565)

* Add additional functional test for unit testing selection, artifacts, etc (#8639)

* Enable inline csv format in unit testing (#8743)

* Support unit testing incremental models (#8891)

* update unit test key: unit -> unit-tests (#8988)


* convert to use unit test name at top level key (#8966)

* csv file fixtures (#9044)

* Unit test support for `state:modified` and `--defer` (#9032)

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>

* Allow use of sources as unit testing inputs (#9059)

* Use daff for diff formatting in unit testing (#8984)

* Fix #8652: Use seed file from disk for unit testing if rows not specified in YAML config (#9064)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Fix #8652: Use seed value if rows not specified

* Move unit testing to test and build commands (#9108)

* Enable unit testing in non-root packages (#9184)

* convert test to data_test (#9201)

* Make fixtures files full-fledged members of manifest and enable partial parsing (#9225)

* In build command run unit tests before models (#9273)

---------

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants