Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor load_file tests #194

Closed
dimberman opened this issue Mar 10, 2022 · 2 comments · Fixed by #229
Closed

Refactor load_file tests #194

dimberman opened this issue Mar 10, 2022 · 2 comments · Fixed by #229
Assignees
Labels
priority/high High priority testing Unit or integration testing improvements

Comments

@dimberman
Copy link
Collaborator

Description
As a step towards "maturing" the astro DAG authoring project, we must rewrite our tests to ensure that every integration test runs against every database.

This step will simultaneously reduce the number of tests we need to maintain, make testing much simpler as we add new databases, and will make a future refactor much simpler as we can ensure proper coverage.

To do this, we will take advantage of two features in pytest, fixtures and parameterize.

For this ticket, we will update the append function.

Acceptance criteria
Have a single test file validating append across all databases.

  1. Integration tests should be marked with pytest.marker.integration:
  • Each test should work across all databases
  • Validating both Table and TempTable as inputs
  • Use test_utils.run_dag
  • Use PyTest fixtures and parameterize to have a single main test that will validate transform across multiple databases

The tests should validate these scenarios:

  1. loading against all formats (csv, parquet, avro, etc.)
  2. loading to a temp_table or named table
  3. loading to default schema and to named schema
@dimberman dimberman added priority/high High priority testing Unit or integration testing improvements labels Mar 10, 2022
@tatiana tatiana changed the title Refactor load tests Refactor load_file tests Mar 12, 2022
@dimberman dimberman assigned tatiana and unassigned tatiana Mar 15, 2022
@tatiana
Copy link
Collaborator

tatiana commented Mar 17, 2022

@dimberman the description mentions append, but the title says load_file. Which one is this?

Also, I think it is important to mention that load_file was the first place where we started introducing the new style of integration tests.

These lines were the first place where we adopted the new integration tests approach:
https://github.com/astro-projects/astro/blob/main/tests/operators/test_agnostic_load_file.py#L532-L801

What is missing, IMPOV is to refactor these lines:
https://github.com/astro-projects/astro/blob/a83308a32f513a1e4049bc6dd98730c4e12de9f5/tests/operators/test_agnostic_load_file.py#L68-L529

Finally, I don't think we should be aiming to have a single test function per operator. I'd vote for us to have three independent tests than this type of fixture:
https://github.com/astro-projects/astro/blob/main/tests/operators/test_agnostic_merge.py#L47-L79

We are at risk of overusing fixtures, which can compromise the readability of the tests.

@utkarsharma2
Copy link
Collaborator

@dimberman, I think I can partially work on this, refactoring testcases which are having conflicting table names.

dimberman pushed a commit that referenced this issue Mar 23, 2022
Closes: #194 

- [x] Each test should work across all databases
- [x] Use test_utils.run_dag
- [x] Use PyTest fixtures and parameterize to have a single main test that will validate transform across multiple databases
- [x] Loading against all formats (csv, parquet, avro, etc.)  -- **Partially done, since it's not required for all tests.**
- [x] Loading to a temp_table or named table -- **Partially done, since it's not required for all tests.**
- [x] Loading to default schema and to named schema -- **Partially done, since it's not required for all tests.**
utkarsharma2 added a commit that referenced this issue Mar 30, 2022
Closes: #194 

- [x] Each test should work across all databases
- [x] Use test_utils.run_dag
- [x] Use PyTest fixtures and parameterize to have a single main test that will validate transform across multiple databases
- [x] Loading against all formats (csv, parquet, avro, etc.)  -- **Partially done, since it's not required for all tests.**
- [x] Loading to a temp_table or named table -- **Partially done, since it's not required for all tests.**
- [x] Loading to default schema and to named schema -- **Partially done, since it's not required for all tests.**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/high High priority testing Unit or integration testing improvements
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants