Refactored Load_file Op. testcases #229

utkarsharma2 · 2022-03-21T17:46:56Z

Closes: #194

- [x] Each test should work across all databases
- [x] Use test_utils.run_dag
- [x] Use PyTest fixtures and parameterize to have a single main test that will validate transform across multiple databases
- [x] Loading against all formats (csv, parquet, avro, etc.) -- **Partially done, since it's not required for all tests.**
- [x] Loading to a temp_table or named table -- **Partially done, since it's not required for all tests.**
- [x] Loading to default schema and to named schema -- **Partially done, since it's not required for all tests.**

codecov · 2022-03-21T17:50:58Z

Codecov Report

Merging #229 (69a14f9) into main (8a4c23a) will increase coverage by 0.09%.
The diff coverage is 98.18%.

❗ Current head 69a14f9 differs from pull request most recent head d181021. Consider uploading reports for the commit d181021 to get more accurate results

@@            Coverage Diff             @@
##             main     #229      +/-   ##
==========================================
+ Coverage   89.78%   89.88%   +0.09%     
==========================================
  Files          67       67              
  Lines        3594     3538      -56     
  Branches      342      341       -1     
==========================================
- Hits         3227     3180      -47     
+ Misses        325      316       -9     
  Partials       42       42

Impacted Files	Coverage Δ
tests/integration_test_dag.py	`0.00% <0.00%> (ø)`
conftest.py	`92.30% <94.11%> (-0.29%)`	⬇️
tests/operators/test_agnostic_append.py	`89.09% <100.00%> (ø)`
tests/operators/test_agnostic_load_file.py	`100.00% <100.00%> (+3.58%)`	⬆️
tests/operators/test_agnostic_merge.py	`96.47% <100.00%> (ø)`
...sts/operators/transform/test_postgres_transform.py	`96.66% <100.00%> (ø)`
...ts/operators/transform/test_snowflake_transform.py	`79.59% <100.00%> (ø)`
tests/operators/transform/test_transform.py	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8a4c23a...d181021.

dimberman · 2022-03-21T19:57:11Z

conftest.py

@@ -64,32 +64,46 @@ def sample_dag():


 @pytest.fixture
-def tmp_table(sql_server):
+def tmp_table(request, sql_server):


We should just name this "test_table" if it's not actually a table or a tmp_table.

Suggested change

-def tmp_table(request, sql_server):
+def test_table(request, sql_server):

utkarsharma2 · 2022-03-22

Makes sense, updated it.

dimberman · 2022-03-21T19:57:34Z

conftest.py

@@ -64,32 +64,46 @@ def sample_dag():


 @pytest.fixture
-def tmp_table(sql_server):
+def tmp_table(request, sql_server):
+    table_type = True


You should name this is_tmp_table.

utkarsharma2 · 2022-03-22

Makes sense, updated it.
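
The two review threads above (renaming the fixture to `test_table` and the flag to `is_tmp_table`) can be sketched as follows. This is not the PR's actual conftest.py: `Table`, `TempTable`, `build_test_table`, and the keys read from `param` are hypothetical stand-ins, and the real fixture also receives `sql_server`. A pytest fixture `test_table(request, sql_server)` could plausibly delegate to a helper like this, passing `getattr(request, "param", None)`.

```python
import uuid
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Table:
    """Hypothetical stand-in for a named table object."""
    table_name: str
    schema: Optional[str] = None


@dataclass
class TempTable:
    """Hypothetical stand-in for a temporary table object."""
    table_name: str


def build_test_table(param: Optional[dict] = None) -> Union[Table, TempTable]:
    """Build the object a `test_table` fixture could yield.

    `param` mimics pytest's `request.param`; its keys are assumptions.
    """
    param = param or {}
    # A descriptive boolean name, as asked for in review, instead of `table_type`.
    is_tmp_table = param.get("is_temp", True)
    # A unique default name keeps repeated or parallel runs from colliding.
    name = param.get("table_name", f"test_{uuid.uuid4().hex[:8]}")
    if is_tmp_table:
        return TempTable(table_name=name)
    return Table(table_name=name, schema=param.get("schema"))
```

Because the helper may return either kind of object, the name `test_table` describes it more accurately than `tmp_table` did.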

dimberman · 2022-03-21T19:58:06Z

tests/operators/test_agnostic_load_file.py

-            },
-        )
-
+def get_dataframe_from_table(sql_name: str, tmp_table: Table, hook):


Suggested change

-def get_dataframe_from_table(sql_name: str, tmp_table: Table, hook):
+def get_dataframe_from_table(sql_name: str, tmp_table: Optional[Table, TempTable], hook):

utkarsharma2 · 2022-03-22

I think you mean Union[Table, TempTable]; if so, I have added it.
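
The correction matters because `typing.Optional` accepts exactly one type, so `Optional[Table, TempTable]` raises a `TypeError` as soon as the annotation is evaluated. A minimal sketch, using empty placeholder classes for `Table` and `TempTable` rather than the project's real ones, and a hypothetical `describe` helper:

```python
from typing import Optional, Union, get_args


class Table:
    """Placeholder for the project's named-table class."""


class TempTable:
    """Placeholder for the project's temporary-table class."""


# Optional[X] is shorthand for Union[X, None] and takes a single type,
# so subscripting it with two types is rejected.
try:
    Optional[Table, TempTable]
    optional_accepts_two = True
except TypeError:
    optional_accepts_two = False

# Union is the form that expresses "either of these types".
TableLike = Union[Table, TempTable]


def describe(tmp_table: TableLike) -> str:
    """Return the class name of whichever table kind was passed in."""
    return type(tmp_table).__name__
```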

dimberman

One change, otherwise looks good!

dimberman · 2022-03-22T19:00:46Z

tests/operators/test_agnostic_load_file.py

+            output_table=test_table,
+        )
+    test_utils.run_dag(sample_dag)
+    df = sql_hook.get_pandas_df(f"SELECT * FROM {test_table.qualified_name()}")


@utkarsharma2, we've run into issues with hanging tests in the past when people run get_pandas_df like this. Instead, could you please create an @adf-decorated function that validates the test? You should see an example of this in the test_transform file.

utkarsharma2 · 2022-03-23

@dimberman, I think that would be because of conflicting table names; since we now use a fixture to get tables, it should not recur. But if we are adopting this as a best practice, I'll change it.

Also, I think this test should fail only when something is wrong with the load_file operator, not with @adf.

tatiana · 2022-03-23

I believe it is worth using get_pandas_df; it keeps the tests simple enough.

An example: a recent refactor of load broke many tests in most of the other operators, which is quite an inconvenient side effect. I believe the integration tests for each operator should focus on the operator itself and avoid, where possible, using other Astro operators.
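
The case for validating a load without going through other operators can be sketched with the standard library alone. Everything here is an assumption for illustration: `load_rows` is a hypothetical stand-in for the operator under test, and an in-memory SQLite connection replaces the real databases and `sql_hook`. The check reads the table back with a plain query, so a failure points at the load step only.

```python
import sqlite3


def load_rows(conn, table_name, rows):
    """Hypothetical stand-in for the operator under test."""
    conn.execute(f'CREATE TABLE "{table_name}" (id INTEGER, name TEXT)')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES (?, ?)', rows)
    conn.commit()


def fetch_all(conn, table_name):
    """Read the table back directly, without any other operator involved."""
    cursor = conn.execute(f'SELECT * FROM "{table_name}" ORDER BY id')
    return cursor.fetchall()


def check_load():
    """Return True when the loaded table matches the input rows."""
    conn = sqlite3.connect(":memory:")
    try:
        rows = [(1, "First"), (2, "Second")]
        load_rows(conn, "test_table", rows)
        return fetch_all(conn, "test_table") == rows
    finally:
        conn.close()
```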

tatiana · 2022-03-23T10:24:05Z

tests/operators/test_agnostic_load_file.py

-        OUTPUT_TABLE_NAME = "expected_table_from_s3_csv"
-
-        self.hook_target = PostgresHook(
-            postgres_conn_id="postgres_conn", schema="pagila"


It's so satisfying to see all these lines deleted 🙌

tatiana

Looks great, thanks, @utkarsharma2! 🚀


utkarsharma2 marked this pull request as draft March 21, 2022 17:47

dimberman requested changes Mar 21, 2022

utkarsharma2 marked this pull request as ready for review March 22, 2022 16:24

utkarsharma2 requested review from dimberman, tatiana and kaxil March 22, 2022 16:26

utkarsharma2 force-pushed the refactor_load_file_tests branch from ea7fd5b to 5d035f4 Compare March 22, 2022 16:30

dimberman requested changes Mar 22, 2022

tatiana reviewed Mar 23, 2022

tatiana approved these changes Mar 23, 2022

utkarsharma2 added 12 commits March 23, 2022 16:15

0d80c7b  Saving progress
c97d912  1. Made tmp_file fixture to generic to accept params for temp_table or table object creation. 2. Used the modified fixture to write testcase that uses temp/named tables.
1835eba  Added missing database in testcase.
e4ef451  Refactored existing testcase to use new pattern.
8da242a  Added workaround for Snoflake db capital col names.
5012e65  Fixed Typo.
1c826d7  Renamed testcase.
d2fdfc4  Fixed testcase.
2efee05  Added a testcase for custom schema for load_file operator.
39f2b35  Renamed tmp_table fixture to test_table.
7248960  Added .DS_Store to gitignore.
d181021  Addressed PR suggestion.

utkarsharma2 force-pushed the refactor_load_file_tests branch from 69a14f9 to d181021 Compare March 23, 2022 10:47

dimberman approved these changes Mar 23, 2022

dimberman merged commit 9ef45bf into main Mar 23, 2022

dimberman deleted the refactor_load_file_tests branch March 23, 2022 14:38


