Unit test support for state:modified and --defer #9032

Merged

@jtcohen6 jtcohen6 commented Nov 8, 2023

resolves #8517 and #9053

Problem

Unit tests should be selectable via state:modified. This includes cases where the unit test definition itself has changed, and where the underlying model has changed (following the existing behavior of --indirect-selection).
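
A minimal sketch of the two selection cases described above, under stated assumptions: unit tests are represented here as plain dictionaries keyed by name, and `select_modified_unit_tests` and its parameters are hypothetical names for illustration, not dbt's actual selector code.

```python
from typing import Any, Dict, Set


def select_modified_unit_tests(
    current: Dict[str, Dict[str, Any]],
    previous: Dict[str, Dict[str, Any]],
    modified_models: Set[str],
) -> Set[str]:
    """Return the names of unit tests to select (hypothetical helper)."""
    selected: Set[str] = set()
    for name, definition in current.items():
        # Case 1: the unit test is new, or its own definition changed.
        if previous.get(name) != definition:
            selected.add(name)
        # Case 2: the definition is unchanged, but the model it tests changed.
        elif definition["model"] in modified_models:
            selected.add(name)
    return selected
```

In this sketch, `modified_models` stands in for whatever set of models the state comparison has already flagged as changed.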

Unit tests should also respect --defer. This enables column type checking against the production versions of unbuilt models upstream of the model being unit tested.

Solution

  • Implement same_contents for UnitTestDefinition, and add it as a resource type to state:modified.
  • I added a checksum for faster comparison — but if this is over-engineered, I'm happy to do something simpler.
  • Unit tests are not yet selected by list or build
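
For illustration, a checksum-backed `same_contents` could look roughly like the sketch below. `UnitTestSketch` is a hypothetical stand-in, not dbt's `UnitTestDefinition`; the fields it hashes and the choice of SHA-256 are assumptions made for this example only.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class UnitTestSketch:
    """Hypothetical stand-in for a unit test definition; not dbt's class."""

    name: str
    model: str
    given: List[Dict[str, Any]] = field(default_factory=list)
    expect: Dict[str, Any] = field(default_factory=dict)
    checksum: str = field(init=False, default="")

    def __post_init__(self) -> None:
        self.build_checksum()

    def build_checksum(self) -> None:
        # Serialize with sorted keys so the hash does not depend on dict order.
        payload = json.dumps(
            {"model": self.model, "given": self.given, "expect": self.expect},
            sort_keys=True,
        )
        self.checksum = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def same_contents(self, other: Optional["UnitTestSketch"]) -> bool:
        # A test with no counterpart in the previous state counts as changed.
        if other is None:
            return False
        return self.checksum == other.checksum
```

Comparing two short digests avoids walking the full `given` and `expect` structures on every comparison, which is the trade-off the checksum bullet above alludes to.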

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@cla-bot cla-bot bot added the cla:yes label Nov 8, 2023

github-actions bot commented Nov 8, 2023

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@jtcohen6 jtcohen6 changed the title Select unit tests with state:modified Unit test support for state:modified and --defer Nov 8, 2023
codecov bot commented Nov 8, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (3b033ac) 86.69% compared to head (8c9eb48) 86.75%.

Files                               Patch %   Lines
core/dbt/contracts/graph/nodes.py   90.90%    1 Missing ⚠️
Additional details and impacted files
@@                       Coverage Diff                       @@
##           unit_testing_feature_branch    #9032      +/-   ##
===============================================================
+ Coverage                        86.69%   86.75%   +0.05%     
===============================================================
  Files                              181      181              
  Lines                            27033    27056      +23     
===============================================================
+ Hits                             23437    23473      +36     
+ Misses                            3596     3583      -13     
Flag          Coverage Δ
integration   83.71% <96.15%> (+0.12%) ⬆️
unit          64.59% <53.84%> (+<0.01%) ⬆️

@jtcohen6 jtcohen6 marked this pull request as ready for review November 8, 2023 04:10
@jtcohen6 jtcohen6 requested a review from a team as a code owner November 8, 2023 04:10
@jtcohen6 jtcohen6 requested a review from aranke November 8, 2023 04:10
        # We have the selected models from the "regular" manifest, now we switch
        # to using the unit_test_manifest to run the unit tests.
        self.using_unit_test_manifest = True
        self.manifest = self.build_unit_test_manifest()
        self.compile_manifest()  # create the networkx graph
        self.job_queue = self.get_graph_queue()

    def before_run(self, adapter, selected_uids: AbstractSet[str]) -> None:
        # We already did cache population + deferral earlier (above)
        # and we don't need to create any schemas

👍

Once unit tests are invoked from the test task, it will be harder to handle this differently here, since running unit tests will hook into the standard runnable flow.

@MichelleArk MichelleArk Nov 14, 2023

yeah, exactly... 🤔 happy to brainstorm alternative approaches in the test task work

MichelleArk commented Nov 13, 2023

This was implemented before CSV file fixtures were supported (#9044) and unfortunately doesn't 'just work' in combination with file-based fixtures. Crucially, if a unit test fixture is modified, all downstream unit tests that build on top of that fixture should be re-run. Otherwise, it would be possible to introduce a fixture change whose errors only surface in a later CI build, when the test or model itself is modified.

I tried a couple naive approaches, without success, to get them working together:

  1. Add given.inputs::fixture to the unit test checksum
    a. This felt promising, and it is how this should work, but unfortunately it does not because of partial parsing. File-based fixtures are not considered part of the diff, so the node is not re-parsed, meaning no new checksum is built on subsequent runs even if a file-based fixture has changed. We'd need to include file-based fixtures as linked files during partial parsing for this to work as expected.
  2. Mark any unit test that depends on a file-based fixture as having been modified between runs
    a. This is super defensive and feels overly crude: having a single unit test that uses a file-based fixture would mean it's impossible for state:modified selection to return 0 results.

I've implemented the beginnings of (1) including a test (that uses --no-partial-parse explicitly) and have made a follow-up (p0) issue to wrap up the partial parsing component here: #9067
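
A rough sketch of approach (1), folding fixture contents into the checksum. The function name `checksum_with_fixtures`, the `fixture` key on each input, and the in-memory `fixture_files` mapping are all assumptions for illustration; the real implementation reads fixtures from disk and lives in dbt's parser. As noted above, this only helps when the node is actually re-parsed, so it does not by itself address the partial parsing gap.

```python
import hashlib
import json
from typing import Any, Dict, List


def checksum_with_fixtures(
    given: List[Dict[str, Any]],
    fixture_files: Dict[str, str],
) -> str:
    """Hash unit test inputs plus the contents of any fixtures they name."""
    hasher = hashlib.sha256()
    hasher.update(json.dumps(given, sort_keys=True).encode("utf-8"))
    for item in given:
        fixture_name = item.get("fixture")
        if fixture_name is not None:
            # Hash the fixture's contents, not just its name, so that
            # editing the file changes the resulting checksum.
            hasher.update(fixture_files[fixture_name].encode("utf-8"))
    return hasher.hexdigest()
```

With this shape, editing a fixture file changes the checksum even when the YAML definition of the unit test is untouched.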

@MichelleArk MichelleArk force-pushed the jerco/8517-unit-test-state-modified branch from cfc1b1f to 2bb4ed6 Compare November 13, 2023 23:33
@MichelleArk MichelleArk force-pushed the jerco/8517-unit-test-state-modified branch from 2bb4ed6 to 8c9eb48 Compare November 13, 2023 23:34
@MichelleArk MichelleArk requested a review from gshank November 14, 2023 14:48

@gshank gshank left a comment

I think that moving unit testing into the test and build tasks will probably require moving where we populate the adapter cache here. But I can deal with that in my pull request. The important thing is probably to have the tests...

@MichelleArk MichelleArk merged commit ebf48d2 into unit_testing_feature_branch Nov 14, 2023
49 checks passed
@MichelleArk MichelleArk deleted the jerco/8517-unit-test-state-modified branch November 14, 2023 18:42
gshank added a commit that referenced this pull request Jan 16, 2024
* Initial implementation of unit testing (from pr #2911)

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>

* 8295 unit testing artifacts (#8477)

* unit test config: tags & meta (#8565)

* Add additional functional test for unit testing selection, artifacts, etc (#8639)

* Enable inline csv format in unit testing (#8743)

* Support unit testing incremental models (#8891)

* update unit test key: unit -> unit-tests (#8988)


* convert to use unit test name at top level key (#8966)

* csv file fixtures (#9044)

* Unit test support for `state:modified` and `--defer` (#9032)

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>

* Allow use of sources as unit testing inputs (#9059)

* Use daff for diff formatting in unit testing (#8984)

* Fix #8652: Use seed file from disk for unit testing if rows not specified in YAML config (#9064)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Fix #8652: Use seed value if rows not specified

* Move unit testing to test and build commands (#9108)

* Enable unit testing in non-root packages (#9184)

* convert test to data_test (#9201)

* Make fixtures files full-fledged members of manifest and enable partial parsing (#9225)

* In build command run unit tests before models (#9273)

---------

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>