Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init push up of converted unique_key tests #4958

Merged
merged 18 commits into from
Apr 7, 2022

Conversation

McKnight-42
Copy link
Contributor

@McKnight-42 McKnight-42 commented Mar 25, 2022

resolves #4882

Description

Converting the new unique_id_as_list tests to new pytest version in core and then in each adapter as needed.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have added information about my change to be included in the CHANGELOG.

@cla-bot cla-bot bot added the cla:yes label Mar 25, 2022
@github-actions
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@McKnight-42
Copy link
Contributor Author

TODO: update changelog before final merge

@jtcohen6
Copy link
Contributor

jtcohen6 commented Mar 26, 2022

@McKnight-42 Very glad to see an early draft PR!

There's a namespacing question here:

  • Do we want this test to live in tests/adapter/basic, alongside the "baseline" dbt behavior? You must be this tall to call yourself a dbt adapter
  • Or still within the "adapter zone," but in a way that signals it's more of an opt-in? tests/adapter/optional, tests/adapter/advanced, ...?

@gshank @ChenyuLInx Definitely interested in hearing your thoughts. This distinction is something we'd want to reflect in our documentation as well (dbt-labs/docs.getdbt.com#1263). It also leads me to wonder if there's some way to "auto-document" the set of test cases available in tests/adapter? A programmatic update to the module README?

@McKnight-42 McKnight-42 marked this pull request as ready for review March 28, 2022 20:14
@McKnight-42 McKnight-42 requested a review from a team as a code owner March 28, 2022 20:14
@McKnight-42 McKnight-42 requested a review from ChenyuLInx March 28, 2022 20:14
@gshank
Copy link
Contributor

gshank commented Mar 28, 2022

That's a good question about where it should live. At some point we probably do want to put the required tests into one directory. Adapter tests will have to end up in the "adapter zone", which is in a directory in tests/adapter/dbt/tests/adapter. That's where they have to be in order to be subclasses in the adapter repos. So right now I'd put them in tests/adapter/dbt/tests/adapter/incremental_unique_id. Then when we create override tests they can be in tests/functional/adapter/test_incremental_unique_id.py. We might want to reorganize them later.

@gshank
Copy link
Contributor

gshank commented Mar 28, 2022

The idea is that tests/functional will have various directories of tests, including ones that are specific to that repo, but that tests that are overridden would be in tests/functional/adapter. That was my initial idea, anyway, but if someone has a better idea, please suggest it.

@McKnight-42
Copy link
Contributor Author

That's a good question about where it should live. At some point we probably do want to put the required tests into one directory. Adapter tests will have to end up in the "adapter zone", which is in a directory in tests/adapter/dbt/tests/adapter. That's where they have to be in order to be subclasses in the adapter repos. So right now I'd put them in tests/adapter/dbt/tests/adapter/incremental_unique_id. Then when we create override tests they can be in tests/functional/adapter/test_incremental_unique_id.py. We might want to reorganize them later.

Moved test up one directory to be on level with with basic. will keep looking into the errors that are popping up, have we picked the first adapter we want to convert in?

@McKnight-42 McKnight-42 requested a review from gshank March 28, 2022 20:48
Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests need to be in the base test class so they can be importable.

return run_result.status, run_result.message


class TestIncrementalWithoutUniqueKey(IncrementalUniqueKeyBase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these tests are actually importable, because they start with the name 'Test'. Why aren't the tests in the base test class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original tests were separated out into classes that inherited the setup based on type of unique_key case it was testing. do we have a preferred way to reformat the test/class names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved all tests to the base class and created a new test class based on the other examples we have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this conversation clarified for me why we're doing it like this:

class TestSimpleMaterializations(BaseSimpleMaterializations):
pass

As I understand it:

  • By default, pytest only runs tests if the method starts with test_, and the class starts with Test, in a file whose name starts with test_
  • Our move has been to create a "base" class (not named Test), with test methods defined on it that do start with test_*
  • Then, inherit that "base" class into an actual test class, whose name starts with Test, and either pass (no changes needed) or override/reimplement fixtures/methods from the base class

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct :-). You can import a base class that starts with Test, the problem is that pytest would then run all of the tests in the base class plus all of those same tests in the subclass. Which means that if you need to override a test method or get failures, you'd get failures.

@McKnight-42 McKnight-42 requested a review from gshank March 28, 2022 21:28
Comment on lines 20 to 22
state::varchar(2) as state,
county::varchar(12) as county,
city::varchar(12) as city,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want to be thoughtful about data types in order to get this running successfully on multiple databases. E.g. BigQuery doesn't support a data type named varchar(length). Our options here:

  • Override/reimplement all fixtures when we run this test in dbt-bigquery (ugh)
  • Remove type casts here, try to handle all data type business via seeds instead. That may also require a different approach for add_new_rows.sql/duplicate_insert.sql

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are also working on the adapter repos? Did you change the datatypes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could provide some of that information (with data_types) in a custom fixture. Avoiding that would be even better. There is actually an update_rows function in dbt.tests.util which came from dbt-adapter-tests and does work on all of our repos because it calls an adapter method.

Also, I think putting the test settings in variables and then putting them into the structures isn't the best practice. I'm gong to look at it a bit more to come up with recommendations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are also working on the adapter repos? Did you change the datatype

Yes, I believe there are slight modifications for each adapter repo

There is actually an update_rows function in dbt.tests.util which came from dbt-adapter-tests and does work on all of our repos because it calls an adapter method.

I'm a bit skeptical of this method tbh (I know it required us to override a data type in Redshift for basic tests), and want to discuss alongside table comparison logic—but maybe it's the way to go!

Maybe the move should be:

  • Once this test case is passing in this repo (for Postgres), we try running it (from this branch) in a branch of each plugin repo as well
  • Update the test cases here, override them there, get them all passing for all plugins
  • Fine-tune the framework so that the overrides feel minimal + appropriate
  • Merge across the board (and delete all the previously duplicated test code)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like the right process.

The override in Redshift for a seed data column was because the test goes and updates the name by appending '_update'. Redshift created the seed table with a varchar datatype that wasn't big enough. The best solution might actually have been to change that update so the size didn't increase, but the 'update_rows' method was kind of fiddly.

@McKnight-42
Copy link
Contributor Author

dbt-labs/dbt-redshift#92 redshift seems to be passing tests locally as we would hope

@gshank
Copy link
Contributor

gshank commented Mar 31, 2022

Let's put 'test_incremental_unique_id.py' into an 'incremental' directory. Then when we convert the other incremental tests they will all be in the same directory. This will reduce the number of files that adapter repos will have to create to pull in adapter tests.

return run_result.status, run_result.message

# no unique_key test
def test__no_unique_keys(self, project):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the factoring in the way these tests are done could be more straightforward and easier to read. I would recommend that you do something like:

  def test__no_unique_keys(self, project):
        """with no unique keys, seed and model should match"""

        expected_fields = self.get_expected_fields(relation="seed", seed_rows=8)
        test_case_fields = self.get_test_fields(
            project, seed="seed", incremental_model="no_unique_key", update_sql_file="add_new_rows"
        )
        self.check_scenario_correctness(project, expected_fields, test_case_fields)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename stub_expected_fields to 'get_expected_fields'. Rename 'setup_test' to 'get_test_fields'. Refactor get_test_fields to return the ResultHolder object and change the signature to use keywords, except for project, which should be first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have pushed up a new version if we think of better ways to handle it from here on willing to keep iterating, have all tests passing in postgres

@McKnight-42
Copy link
Contributor Author

Let's put 'test_incremental_unique_id.py' into an 'incremental' directory. Then when we convert the other incremental tests they will all be in the same directory. This will reduce the number of files that adapter repos will have to create to pull in adapter tests.

would this be within the dbt/tests/adapter area or would we want it on same level as functional and the other directories?

@McKnight-42 McKnight-42 requested a review from gshank March 31, 2022 22:08
@gshank
Copy link
Contributor

gshank commented Mar 31, 2022

Within the adapter zone, so tests/adapter/dbt/tests/adapter/incremental

@ChenyuLInx
Copy link
Contributor

Let's put 'test_incremental_unique_id.py' into an 'incremental' directory. Then when we convert the other incremental tests they will all be in the same directory. This will reduce the number of files that adapter repos will have to create to pull in adapter tests.

would this be within the dbt/tests/adapter area or would we want it on same level as functional and the other directories?

I believe we probably want the level like what Gerda suggested and for other repos, it would be just one file running all of the tests for 'incremental'.

Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@McKnight-42 McKnight-42 requested a review from gshank April 4, 2022 15:44
@McKnight-42
Copy link
Contributor Author

Trying to swap to the check_relations_equal implementation.

Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Everything is passing. Just remove the commented out lines :-)

@McKnight-42 McKnight-42 closed this Apr 6, 2022
@McKnight-42 McKnight-42 reopened this Apr 6, 2022
@McKnight-42
Copy link
Contributor Author

@cla-bot[bot] check

@cla-bot
Copy link

cla-bot bot commented Apr 7, 2022

The cla-bot has been summoned, and re-checked this pull request!

@McKnight-42 McKnight-42 merged commit 3ade206 into main Apr 7, 2022
@McKnight-42 McKnight-42 deleted the mcknight/convert_unique_key branch April 7, 2022 16:29
agoblet pushed a commit to BigDataRepublic/dbt-core that referenced this pull request May 20, 2022
* init push up of converted unique_key tests

* testing cause of failure

* adding changelog entry

* moving non basic test up one directory to be more broadly part of adapter zone

* minor changes to the bad_unique_key tests

* removed unused fixture

* moving tests to base class and inheriting in a simple class

* taking in chenyu's changes to fixtures

* remove older test_unique_key tests

* removed commented out code

* uncommenting seed_count

* v2 based on feedback for base version of testing, plus small removal of leftover breakpoint

* create incremental test directory in adapter zone

* commenting out TableComparision and trying to implement check_relations_equal instead

* remove unused commented out code

* changing cast for date to fix test to work on bigquery
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-375] Unique_key as list tests reformat
4 participants