-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
init push up of converted unique_key tests #4958
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
TODO: update changelog before final merge |
@McKnight-42 Very glad to see an early draft PR! There's a namespacing question here:
@gshank @ChenyuLInx Definitely interested in hearing your thoughts. This distinction is something we'd want to reflect in our documentation as well (dbt-labs/docs.getdbt.com#1263). It also leads me to wonder if there's some way to "auto-document" the set of test cases available in |
That's a good question about where it should live. At some point we probably do want to put the required tests into one directory. Adapter tests will have to end up in the "adapter zone", which is in a directory in tests/adapter/dbt/tests/adapter. That's where they have to be in order to be subclasses in the adapter repos. So right now I'd put them in tests/adapter/dbt/tests/adapter/incremental_unique_id. Then when we create override tests they can be in tests/functional/adapter/test_incremental_unique_id.py. We might want to reorganize them later. |
The idea is that tests/functional will have various directories of tests, including ones that are specific to that repo, but that tests that are overridden would be in tests/functional/adapter. That was my initial idea, anyway, but if someone has a better idea, please suggest it. |
Moved test up one directory to be on level with with basic. will keep looking into the errors that are popping up, have we picked the first adapter we want to convert in? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests need to be in the base test class so they can be importable.
return run_result.status, run_result.message | ||
|
||
|
||
class TestIncrementalWithoutUniqueKey(IncrementalUniqueKeyBase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these tests are actually importable, because they start with the name 'Test'. Why aren't the tests in the base test class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the original tests were separated out into classes that inherited the setup based on type of unique_key case it was testing. do we have a preferred way to reformat the test/class names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved all tests to the base class and created a new test class based on the other examples we have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this conversation clarified for me why we're doing it like this:
dbt-core/tests/adapter/dbt/tests/adapter/basic/test_base.py
Lines 109 to 110 in 5071b00
class TestSimpleMaterializations(BaseSimpleMaterializations): | |
pass |
As I understand it:
- By default,
pytest
only runs tests if the method starts withtest_
, and the class starts withTest
, in a file whose name starts withtest_
- Our move has been to create a "base" class (not named
Test
), with test methods defined on it that do start withtest_*
- Then, inherit that "base" class into an actual test class, whose name starts with
Test
, and either pass (no changes needed) or override/reimplement fixtures/methods from the base class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct :-). You can import a base class that starts with Test, the problem is that pytest would then run all of the tests in the base class plus all of those same tests in the subclass. Which means that if you need to override a test method or get failures, you'd get failures.
test/integration/076_incremental_unique_id_test/test_incremental_unique_id.py
Outdated
Show resolved
Hide resolved
state::varchar(2) as state, | ||
county::varchar(12) as county, | ||
city::varchar(12) as city, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll want to be thoughtful about data types in order to get this running successfully on multiple databases. E.g. BigQuery doesn't support a data type named varchar(length)
. Our options here:
- Override/reimplement all fixtures when we run this test in
dbt-bigquery
(ugh) - Remove type casts here, try to handle all data type business via seeds instead. That may also require a different approach for
add_new_rows.sql
/duplicate_insert.sql
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are also working on the adapter repos? Did you change the datatypes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could provide some of that information (with data_types) in a custom fixture. Avoiding that would be even better. There is actually an update_rows function in dbt.tests.util which came from dbt-adapter-tests and does work on all of our repos because it calls an adapter method.
Also, I think putting the test settings in variables and then putting them into the structures isn't the best practice. I'm gong to look at it a bit more to come up with recommendations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are also working on the adapter repos? Did you change the datatype
Yes, I believe there are slight modifications for each adapter repo
There is actually an update_rows function in dbt.tests.util which came from dbt-adapter-tests and does work on all of our repos because it calls an adapter method.
I'm a bit skeptical of this method tbh (I know it required us to override a data type in Redshift for basic tests), and want to discuss alongside table comparison logic—but maybe it's the way to go!
Maybe the move should be:
- Once this test case is passing in this repo (for Postgres), we try running it (from this branch) in a branch of each plugin repo as well
- Update the test cases here, override them there, get them all passing for all plugins
- Fine-tune the framework so that the overrides feel minimal + appropriate
- Merge across the board (and delete all the previously duplicated test code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like the right process.
The override in Redshift for a seed data column was because the test goes and updates the name by appending '_update'. Redshift created the seed table with a varchar datatype that wasn't big enough. The best solution might actually have been to change that update so the size didn't increase, but the 'update_rows' method was kind of fiddly.
dbt-labs/dbt-redshift#92 redshift seems to be passing tests locally as we would hope |
Let's put 'test_incremental_unique_id.py' into an 'incremental' directory. Then when we convert the other incremental tests they will all be in the same directory. This will reduce the number of files that adapter repos will have to create to pull in adapter tests. |
return run_result.status, run_result.message | ||
|
||
# no unique_key test | ||
def test__no_unique_keys(self, project): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the factoring in the way these tests are done could be more straightforward and easier to read. I would recommend that you do something like:
def test__no_unique_keys(self, project):
"""with no unique keys, seed and model should match"""
expected_fields = self.get_expected_fields(relation="seed", seed_rows=8)
test_case_fields = self.get_test_fields(
project, seed="seed", incremental_model="no_unique_key", update_sql_file="add_new_rows"
)
self.check_scenario_correctness(project, expected_fields, test_case_fields)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename stub_expected_fields to 'get_expected_fields'. Rename 'setup_test' to 'get_test_fields'. Refactor get_test_fields to return the ResultHolder object and change the signature to use keywords, except for project, which should be first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have pushed up a new version if we think of better ways to handle it from here on willing to keep iterating, have all tests passing in postgres
would this be within the dbt/tests/adapter area or would we want it on same level as functional and the other directories? |
…of leftover breakpoint
Within the adapter zone, so tests/adapter/dbt/tests/adapter/incremental |
I believe we probably want the level like what Gerda suggested and for other repos, it would be just one file running all of the tests for 'incremental'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Trying to swap to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Everything is passing. Just remove the commented out lines :-)
@cla-bot[bot] check |
The cla-bot has been summoned, and re-checked this pull request! |
* init push up of converted unique_key tests * testing cause of failure * adding changelog entry * moving non basic test up one directory to be more broadly part of adapter zone * minor changes to the bad_unique_key tests * removed unused fixture * moving tests to base class and inheriting in a simple class * taking in chenyu's changes to fixtures * remove older test_unique_key tests * removed commented out code * uncommenting seed_count * v2 based on feedback for base version of testing, plus small removal of leftover breakpoint * create incremental test directory in adapter zone * commenting out TableComparision and trying to implement check_relations_equal instead * remove unused commented out code * changing cast for date to fix test to work on bigquery
resolves #4882
Description
Converting the new unique_id_as_list tests to new pytest version in core and then in each adapter as needed.
Checklist