Unit Testing #8275
Replies: 33 comments 70 replies
-
Motivation & Design Goals
When writing unit tests for software development, I always find myself going back to Kent Beck’s Test Desiderata, which describes a set of ‘desirable properties’ for unit tests - all of which come together to produce a sense of confidence that our efforts are progressing, that the whole system is working. How might these properties guide the design of a unit testing framework in dbt? While all twelve properties are relevant to this problem, there are a handful that feel most relevant (and/or challenging!) to me when applied to unit testing data models. My notes as bullets, but Kent says it best:
-
I am extremely excited this will be part of dbt-core! A few random thoughts:
For example, in Ruby on Rails fixtures, if you have a table called
This is very powerful, because it allows you to have a "name" for different types of rows. For example, for
Note: the exact same idea is seen in different "factory" libraries like rosie or factory_bot; the main difference is that factories have a different way of implementing the actual data, but they are at the same level of abstraction as fixtures. The other reason I'm rambling about this: it would allow things to be much more composable/DRY. Here is what I think it could look like:
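To make that concrete, here's a purely hypothetical sketch of named fixtures for dbt (the fixtures: block, the fixture names, and the fixture: reference syntax are all invented for illustration, not part of the proposal):

fixtures:
  users:
    active_user: {id: 1, email: jane@example.com, status: active}
    deactivated_user: {id: 2, email: joe@example.com, status: deactivated}

unit-tests:
  - name: test_active_users
    model: active_users
    given:
      - input: ref('users')
        rows:
          - fixture: active_user          # reference rows by name instead of repeating them
          - fixture: deactivated_user
    expect:
      rows:
        - {id: 1}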
The P2 of that would be "overrides of inline fixtures". For example:
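Continuing the same hypothetical syntax, an inline override would let a test state only what differs from the named fixture:

unit-tests:
  - name: test_deactivated_variant_of_active_user
    model: active_users
    given:
      - input: ref('users')
        rows:
          - fixture: active_user
            overrides: {status: deactivated}   # everything else comes from the active_user fixture
    expect:
      rows: []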
-
@MichelleArk: Awesome write-up! I am looking forward to this feature. Right off the bat, the following comes to mind:
-
Great initiative, really looking forward to getting this into core!
Would love to see
Could be more expressive, e.g. other assertions or interpretations; either way I think it needs some clarification with regards to ordering.
Getting incremental logic right is complex and we would love to cover it in tests. Not sure it is best solved with unit tests, since it requires multiple stages.
-
Fantastic! Looking forward to it! Thanks for prioritizing this effort.
Can you provide an example of how you'd envision the "pass through model" approach? I'd love to see the need for testing macros get more attention. Might you consider promoting it to the "follow-on" list? I find myself writing unit tests for macros quite a bit, and these are the sort of things I see:
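One way I could picture the "pass-through model" approach (everything below is hypothetical - a macro clean_phone_number and a test-only model whose SQL just selects the macro's output over a mocked input):

unit-tests:
  - name: test_clean_phone_number_macro
    model: stg_contacts_clean_phone   # pass-through model: select id, {{ clean_phone_number('raw_phone') }} as phone from the mocked input
    given:
      - input: ref('raw_contacts')
        rows:
          - {id: 1, raw_phone: "(555) 123-4567"}
          - {id: 2, raw_phone: "555.123.4567"}
    expect:
      rows:
        - {id: 1, phone: "5551234567"}
        - {id: 2, phone: "5551234567"}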
Cheers!
-
Really nice to see this feature coming!!!
My use case
In one word: let's say I have a pretty straightforward pipeline:
I would like to be able to test the whole pipeline by just defining the
Some answers to your questions:
I like the idea of the dbt-coverage package. It would be nice to quickly identify which models have column tests, unit tests, ...
I was planning to use a
It really depends on how easy it is to identify which models have unit tests and which don't.
-
Super exciting!
This and overriding is_incremental are really important features for us, as they unblock proper testing of incremental models.
-
Will arrays and arrays of structs (and further nesting) be supported?
-
@MichelleArk Looking forward to this new feature.
Core functionality
Performance requirements
UX requirements
-
@MichelleArk - with respect to your fourth product requirement, "configure overrides of macros, vars, or environment variables for a given unit test" which "dbt-core naturally has access to". While you are building that feature into dbt-core for usage by the dbt unit testing functionality, would it be possible to expose some of that from the dbt python module, so that other testing tools can utilize it? I am thinking about building some dbt testing tools (similar to but different from what you have laid out here), and these building blocks would be super helpful. Let's imagine
It would be amazing if I could do this (all pseudo code):
which would return
which would return
which would return
-
Awesome that you're working on this @MichelleArk! We're very much looking forward to it. Together with @ZhouSu89 I've been involved in the development of our internal unit testing framework; I was more on the end-user side of it. So I'm very happy to share some of the requirements we set and learnings we had along the way, plus some examples of our input. Input:
config:
  meta:
    unit_tests:
      - name: unit_test_x
        mocks:
          input_mock: |
            path_to_input_mock
          expected: |
            path_to_expected_mock
        vars:
          start_date: z

config:
  meta:
    unit_tests:
      ....
      test_config:
        expected_columns: [x, y, z]

models:
  - name: test_get_merge_sql__standard
    config:
      meta:
        unit_tests:
          - name: test_standard_merge
            description: |
              Test standard merge behaviour where existing records are updated and
              new ones are inserted.
            mocks:
              source_mock_dataset__incremental_model_source: |
                INTEGER,DATE,INTEGER
                id,kafka_write_time,kafka_offset
                1,2021-01-01,1
                1,2021-01-02,2
                2,2021-01-01,1
                3,2021-01-02,1
              actual_current: |
                INTEGER,TIMESTAMP,INTEGER
                id,kafka_write_time,kafka_offset
                2,2021-01-01 00:00:00 UTC,1
                1,2021-01-01 00:00:00 UTC,1
              expected: |
                INTEGER,TIMESTAMP,INTEGER
                id,kafka_write_time,kafka_offset
                2,2021-01-01 00:00:00 UTC,1
                1,2021-01-02 00:00:00 UTC,2
                3,2021-01-02 00:00:00 UTC,1
            vars:
              data_interval_start: '2021-01-02'

In terms of output, here are some of the requirements we had:
-
Would like to chime in and also strongly request CTE testing. Arguments about whitebox vs blackbox testing are age-old - however, there are very few languages that actively disempower the engineer from taking that decision. I understand trying to set standards, but I agree with previous points raised that breaking out CTEs into multiple models causes a different set of headaches. Currently, we are testing our CTEs individually and it allows us to interrogate our code with more test cases where needed. Would strongly request that we have the ability to test CTEs in isolation.
-
This is really awesome to hear! Two questions come to mind:
-
The UX you're proposing for testing is extraordinarily similar to the one we've rigged up in our Airflow repo for doing unit testing for data transforms, and we'd love to have similar functionality available in dbt. We use Snowflake as our data warehouse and have encountered some special issues working with it in the test environment, so I wanted to share a tech blog we wrote about our experience in case it's helpful to your team: https://github.com/BookBub/public-docs/blob/main/snowflake_testing_tech_blog.md
-
Hey!
-
Select statements are not easily readable, so I would try to avoid them. We need something other than CSV for complex types (e.g. nested tables, repeated fields) - this might be covered by your dict implementation though. Will this be a Python dict? If not, why not implement it as JSON? E.g. escape characters should follow some standard, and then you quickly end up with JSON instead of a dict, I guess.
On Thu, 23 Nov 2023 at 18:31, Grace Goheen ***@***.***> wrote:
… Thanks for your feedback! Our initial spec provides 3 ways to define mocked inputs & expected outputs.

In-line dictionary:

unit-tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: dict
        rows:
          - {id: 1, name: gerda}
          - {id: 2, b: michelle}
...

In-line csv:

unit-tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: csv
        rows: |
          id,name
          1,gerda
          2,michelle
...

Fixture csv files:

unit-tests:
  - name: test_my_model
    model: my_model
    given:
      - input: ref('my_model_a')
        format: csv
        fixture: my_model_a_fixture
...

# tests/fixtures/my_model_a_fixture.csv
1,gerda
2,michelle

Would this satisfy your use case? What are the advantages you see for json and/or select statements? Thanks!
-
In my experience, the most efficient way of creating realistic test data is to define that data using code, with tools such as faker, dbldatagen, SDV, etc. Static test cases quickly become unwieldy to handle, as you're always playing a cat-and-mouse game between what the changes in production are and the test cases you need to replicate. Imagine a simple change rippling through 1000 test cases - you're not going to want to write another test again. Code is always easier to change than data. Ideally the solution in dbt caters for this.
-
Also excited to see this! Some opinions: Unit testing should only occur on contracted models. Unit testing should be linked to version specs. We will constantly break these rules. Please make it easy for devs. Snowplow is a great example. Should unit testing be enforced at the semantic layer? When is all this yaml too much? New devs are going to be even more confused. This needs a really good package to get adopted by my team.
-
Hi friends - @jtcohen6 pinged me about this a while ago after we discussed the state of testing in dbt. My claim, which preceded that discussion, was that dbt has successfully applied many of the paradigms of software engineering (e.g. source control; dependency management) to analytics/data, but has thus far lagged on perhaps the most important one for enterprise software - testing. Of course I hadn’t seen this rich discussion yet 🙂 Belated, but I just want to transfer some learnings after ~3 years of building/maintaining a homegrown test platform across a large data eng team (+ many more years of testing with CSVs ;) ).
Why do we test, again?
To a first order, yes, testing does allow engineers to provide automated, protective guarantees about the behavior of their software. And that’s a win. “But at what cost?” Good testing practices and tools introduce very short-term, very local friction but beyond that, actually improve productivity and quality simultaneously. You should very soon see these effects, else your solution needs improvement ;) :
I’d interview early adopters of unit testing to see if these effects are being felt - it usually takes some months before they’re obvious, but tech leads will see signals early. The holy grail of testing in SWE is “test driven development (TDD)” (https://en.wikipedia.org/wiki/Test-driven_development), whereby developers feel so empowered by their testing tools that they write the tests first, before even writing their logic (omg). This is most often a pipedream but does happen with some combo of great tools and v. skilled/experienced engineers. A good north star… (I’m imagining a future where product folks write down some test expectations in a spreadsheet and plain English of what transformation logic should do, and gen AI writes the PR with passing tests.)
Some land mines/ideas
(having built an internal dbt testing tool for ~50 engineers) Testing introduces friction in the development process. How do we minimize extra pain such that engineers have no qualms about updating a large set of tests on code written years ago, or even consider writing tests first? Testing has to be ergonomic:
Here’s a screenshot of the dbt test spec format that’s the core of the tool I’d built on my previous team - it’s finely tuned for a specific team, there are things I don’t like about it, etc., so it’s just intended for inspo. It’s HOCON format - a more composable, flexible version of YAML or JSON (common in Scala land) (e.g. you can reference variables!). The entire test spec for a model (or set of models if desired) can be contained in one file. FWIW, having tens of these test specs greatly accelerated the efficiency of our team and prevented myriad bugs from leaking into production. https://docs.spongepowered.org/stable/en/server/getting-started/configuration/hocon.html https://blog.ometer.com/2015/09/07/json-like-config-a-spectrum-of-underoverengineering/
-
Hey folks! Dropping by to let you know about a community feedback / office hours session that we'll be running in a couple weeks. Thursday, 18 January, 8am Pacific: Unit testing as native functionality in dbt
Some supporting resources:
-
Hey there! I'm quite new to dbt discussions, so I hope this lands in the right place. Checking the proposal and the comments, we would benefit from something like:
given
do
then
Since we have all the dbt tests capabilities, having the
Others
We could expect that, given the dbt lineage of the selector, the unit test validates that all the source nodes are provided and defined in the
It could also return a

unit:
  - model: my_model
    tests:
      - name: test_name
        given:
          - input: ref('my_model_a')
            format: seed
            seed: ref('test_001/input/model_a.csv')
        do:
          selector: "@marketing"
        then:
          - expect: ref('campaigns')
            format: seed
            seed: ref('test_001/output/campaigns.csv')
-
Here's the Zoom recording from the feedback session earlier today: https://dbtlabs.zoom.us/rec/share/kPEECDqWh3Sbp4ApWbJ8_q3uGbyNcH7FGmm03MkhVOudV2gKrgvqvNn6T1z0QZuK.CcDkzLIvZPQGd2_B?startTime=1705593243000
-
Great presentation of the feature. Some comments:
-
@graciegoheen @MichelleArk this is great to see! In the webinar recording, Grace mentions that you need not provide all of the columns the model requires in the test input data where the test does not require them. I can see this working fine where there is no aggregation in the model and the grain of the data doesn't change in the model. However, where the model does aggregate data, I can see a situation where only specifying a subset of the group-by fields could cause issues with the output of an aggregate function... and therefore the unit test would not reflect how the model behaves in reality. How do the internals of unit tests work in this context? Do unit tests require all fields from a group by to be present in the test input data, or will it just group by the provided fields?
-
Hey @graciegoheen! It would be great if we could have a release tagged v1.8-beta. This would allow us to conduct testing even while the unit tests are not fully ready.
-
Hi @MichelleArk, according to your idea
Has this been supported in dbt v1.8 yet?
-
Hi dbt folks! Great job on implementing unit tests in dbt 1.8 👏 This feature is a huge improvement. Here are my 2 cents on future improvements I would love to see:
Edit on 1: My teammate David reminded me that it's already possible to mock the current state of the model, as the docs mention:
That's great! All that's missing is testing the result of the merge given the input :)
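For anyone landing here later, a minimal sketch of that combination with the dbt 1.8 syntax (the model, input, and column names here are made up): override is_incremental to force the incremental branch, and mock the model's current state via an input named this.

unit_tests:
  - name: my_incremental_model_incremental_mode
    model: my_incremental_model
    overrides:
      macros:
        is_incremental: true        # force the incremental branch of the model
    given:
      - input: ref('events')
        rows:
          - {event_id: 1, event_time: 2020-01-01}
          - {event_id: 2, event_time: 2020-01-02}
      - input: this                 # mocked current state of my_incremental_model
        rows:
          - {event_id: 1}
    expect:
      rows:
        - {event_id: 2}             # only the new event should be selected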
-
Thanks to all the people implementing unit tests in dbt. This is really great for improving data quality and reliability. As a dbt newbie, I'm wondering if the unit test feature supports the Athena query engine? Our project relies heavily on Athena and all our SQL is in the Athena dialect. Does the default engine used in unit tests support that SQL dialect? Or do we have to use the dbt-athena adapter to make unit tests work? Is it still possible to run unit tests without relying on the cloud infrastructure and the real Glue database and tables? Thanks.
-
Hi @graciegoheen. Is there any development going on to support testing incremental models' merge/insert/append behaviour, and if so, do you have any (super broad) timelines? Or has this been deprioritised for the moment? Thanks in advance!
-
Hello dbt team, first of all, great to see unit testing working well :) However, I've detected an interesting behavior: my AE team implemented unit tests which, given some input, expect a certain output. The unit test is working well, but the expected output doesn't comply with the test rules defined for the model - which in my opinion should make the unit test fail, but it doesn't. I believe that the
What do you think?
-
Background
Since the original unit testing discussion was opened, community members have created several packages with their own frameworks for unit testing dbt models. Others have built proprietary solutions on top of dbt. The call is coming from inside Coalesce: unit testing ought to be native functionality.
Unit testing in dbt would provide the ability to check your modeling logic on a small set of static inputs (instead of your full production data) - a low-cost method to ship changes with confidence. Need to refactor an important, mature, heavily-used model? Unit testing is imperative for data teams — whether small or gigantic! — to (a) optimize warehouse spend/performance and (b) increase data quality at a fraction of the cost.
A well-designed unit testing framework can also enable test-driven development, with benefits for iteration speed & quality. Finally, unit testing is a natural thematic extension of our work this year around multi-project collaboration and model governance — they all come together to create stable and reliable interfaces for cross-team collaboration at scale.
I won’t bury the lede further than that - we’re going to start tackling this problem in the next month. The rest of this post overviews a proposal that we’d love to get feedback on from all the testing enthusiasts who have participated in this discussion, and in the broader discussions about unit testing in the modern data stack, over the past few years.
Proposal
Let’s walk through an example (based on a true story!) with a minimal SQL model called my_model that evaluates, for each user_id, whether they have a valid email address.
The proposed interface for writing a unit test is a data structure, represented in yaml. I can define a corresponding unit test under models/unit_tests.yml to ensure all of my edge cases are captured - emails without '.', emails without '@', emails from invalid domains.
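A rough sketch of what that unit test could look like, using the in-line dictionary format described later in this post (the ref('users') input and the column names are illustrative assumptions, not the original example):

unit-tests:
  - name: test_is_valid_email_address
    model: my_model
    given:
      - input: ref('users')
        rows:
          - {user_id: 1, email: cool_email@example.com}   # expect: valid
          - {user_id: 2, email: missingdot@gmailcom}      # missing '.' - expect: invalid
          - {user_id: 3, email: missing_at_example.com}   # missing '@' - expect: invalid
          - {user_id: 4, email: cool_email@invalid.bad}   # invalid domain - expect: invalid
    expect:
      rows:
        - {user_id: 1, is_valid_email_address: true}
        - {user_id: 2, is_valid_email_address: false}
        - {user_id: 3, is_valid_email_address: false}
        - {user_id: 4, is_valid_email_address: false}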
The above example illustrates defining input fixtures inline; the options to specify reusable input fixtures or configure input formats are described more completely in the product requirements section below.
Running a new dbt command, dbt unit, provides the following output:
Uh oh! It looks like our clever regex statement wasn’t as clever as we thought - our model is incorrectly flagging missingdot@gmailcom as a valid email address. Updating it to '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$' and running dbt unit again does the trick:
Product Requirements
🚀 Features of the proposed spec that would encompass an MVP [P0s]:
- A new command - dbt unit - that accepts an optional --select method and runs any associated unit tests.
- ref or source macros resolving to mocked input data as opposed to production datasets.
- The ability to omit mocking inputs not relevant to a test case. This enables writing succinct and specific unit tests.
- Configure overrides of macros, vars, or environment variables for a given unit test. This is something that’s really tricky to do as a package or dbt wrapper implementation without direct access to dbt’s jinja context(s)… which dbt-core naturally has access to!
- csv format support for inputs (instead of a list of dictionaries). For example:
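For instance (a sketch only, mirroring the in-line csv format shown later in this thread; the input name and columns are illustrative):

given:
  - input: ref('users')
    format: csv
    rows: |
      user_id,email
      1,cool_email@example.com
      2,missingdot@gmailcom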
⭐️ Features that would make exciting follow-on work [P1s] would be:
- Support for the --threads argument. We could consider --groups <n> and --split <n> arguments that would run a particular split of the unit test collection given a total number of groups, similar to functionality the pytest-split plugin provides.
- Mocking this - via macros or a special-case of input - so that it is overridable as a fixture similar to an input, even though it typically returns a Relation.
🚫 A handful of ‘anti-goals’, or features not likely to be prioritized for the first cut of this framework would be:
Next steps, and how to get involved!
There are many many implementation considerations and tradeoffs that I’ll leave open for @gshank to run point on as the technical lead for this project over in the epic. Additionally, @graciegoheen and @dbeatty10 will be offering their leadership and support from the product and developer experience perspectives for this initiative.
Please feel free to chime into this discussion with any related thoughts, considerations, and suggestions. We’ll also be aiming to release beta milestones for this functionality, and early beta testers and feedback are always welcome!
To get the ball rolling, here are some open questions we’ve been noodling on so far:
- What other formats would you expect to see in dbt-core?
- Where would you expect to define your unit tests - in your models/ directory? In your tests/ directory? Somewhere else?