Create upstream_tasks parameter for dependencies independent of data transfers #585
Merged
Changes from 19 commits
28 commits
94838b2 foo (dimberman)
e15754a foo (dimberman)
7022cd7 foo (dimberman)
5f6155a foo (dimberman)
ee81a81 foo (dimberman)
ba70ac8 fix (dimberman)
332f22f ASFda (dimberman)
4f83102 Merge branch 'main' of https://github.com/astronomer/astro-sdk into s… (dimberman)
9ca8e87 Fix (dimberman)
d53c08f nit (dimberman)
c6d78f7 Merge branch 'main' into setup-dependencies (dimberman)
25b477e cleanup (dimberman)
7c9fbef lint (dimberman)
90e5490 nit (dimberman)
09ccab5 nit (dimberman)
1c2e1c2 nit (dimberman)
bf6e87e Merge branch 'main' into setup-dependencies (dimberman)
2e28ccd nit (dimberman)
47fc155 nit (dimberman)
321975c Update src/astro/sql/operators/drop.py (dimberman)
b35cfeb nit (dimberman)
c299ba9 nit (dimberman)
366b431 merge with main
e9c6d6a precommit
8975562 fix failing test
c90f611 fix failing test
1ae6629 fix docs
a5a6e9f merge
@@ -0,0 +1,9 @@
from abc import ABC

from airflow.models.baseoperator import BaseOperator

from astro.sql.operators.upstream_task_mixin import UpstreamTaskMixin


class AstroSQLBaseOperator(UpstreamTaskMixin, BaseOperator, ABC):
    pass
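The base-class order in `AstroSQLBaseOperator` matters: listing the mixin first lets it remove `upstream_tasks` from the keyword arguments before the operator's own `__init__` sees them. The sketch below illustrates this with stand-in classes only. `FakeBaseOperator`, `UpstreamMixinSketch`, `MixinFirst`, and `MixinLast` are hypothetical names, not part of Airflow or astro-sdk, and the stand-in base class is assumed to reject keyword arguments it does not know.

```python
class FakeBaseOperator:
    """Hypothetical stand-in for an operator base class with a fixed signature."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = []

    def set_upstream(self, other):
        self.upstream.append(other)


class UpstreamMixinSketch:
    """Simplified version of the mixin idea: pop the kwarg, then delegate."""

    def __init__(self, **kwargs):
        upstream_tasks = kwargs.pop("upstream_tasks", [])
        # The next class in the MRO never sees `upstream_tasks`.
        super().__init__(**kwargs)
        for task in upstream_tasks:
            self.set_upstream(task)


class MixinFirst(UpstreamMixinSketch, FakeBaseOperator):
    """Same ordering as in the diff: mixin before the operator base class."""


class MixinLast(FakeBaseOperator, UpstreamMixinSketch):
    """Reversed ordering, shown only to illustrate the failure."""


first = FakeBaseOperator(task_id="first")
ok = MixinFirst(task_id="second", upstream_tasks=[first])

try:
    MixinLast(task_id="third", upstream_tasks=[first])
    mixin_last_error = None
except TypeError as exc:
    # FakeBaseOperator.__init__ runs first and rejects the unknown kwarg.
    mixin_last_error = exc
```

With the mixin first, `ok.upstream` contains `first`. With the order reversed, the stand-in base class receives `upstream_tasks` directly and raises `TypeError`.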
@@ -0,0 +1,21 @@
from airflow.exceptions import AirflowException
from airflow.models.baseoperator import BaseOperator
from airflow.models.xcom_arg import XComArg


class UpstreamTaskMixin:
    def __init__(self, **kwargs):
        upstream_tasks = kwargs.pop("upstream_tasks", [])

        super().__init__(**kwargs)

        for task in upstream_tasks:
            if isinstance(task, XComArg):
                self.set_upstream(task.operator)
            elif isinstance(task, BaseOperator):
                self.set_upstream(task)
            else:
                raise AirflowException(
                    "Cannot upstream a non-task, please only use XcomArg or operators for this"
                    " parameter"
                )
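The loop in `UpstreamTaskMixin` accepts two kinds of values: an `XComArg` is unwrapped to the operator that produces it, an operator is used directly, and anything else is rejected. A minimal sketch of that dispatch follows. It runs without Airflow installed and uses hypothetical stand-ins (`FakeOperator`, `FakeXComArg`, `add_upstream_tasks`), and it raises `TypeError` where the real mixin raises `AirflowException`.

```python
class FakeOperator:
    """Hypothetical stand-in for an Airflow operator (not the real BaseOperator)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream_task_ids = set()

    def set_upstream(self, other):
        self.upstream_task_ids.add(other.task_id)


class FakeXComArg:
    """Hypothetical stand-in for XComArg: a reference to an operator's output."""

    def __init__(self, operator):
        self.operator = operator


def add_upstream_tasks(task, upstream_tasks):
    """Unwrap XComArg-like values, accept operators, reject everything else."""
    for upstream in upstream_tasks:
        if isinstance(upstream, FakeXComArg):
            task.set_upstream(upstream.operator)
        elif isinstance(upstream, FakeOperator):
            task.set_upstream(upstream)
        else:
            raise TypeError(f"Cannot upstream a non-task: {upstream!r}")


load = FakeOperator("load_file")
cleanup = FakeOperator("cleanup")
transform = FakeOperator("transform")

# One dependency passed as an output reference, one as a plain operator.
add_upstream_tasks(transform, [FakeXComArg(load), cleanup])

try:
    add_upstream_tasks(transform, ["load_file"])  # a task id string is not a task
    rejected = False
except TypeError:
    rejected = True
```

After the first call, `transform` depends on both `load_file` and `cleanup`, even though no data flows between them. The second call fails because a bare string is neither an operator nor an output reference.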
@@ -0,0 +1,69 @@
import pathlib

import pytest

from astro import sql as aql
from astro.constants import Database
from astro.files import File
from astro.sql.table import Table
from tests.sql.operators import utils as test_utils

cwd = pathlib.Path(__file__).parent


@pytest.mark.parametrize(
    "database_table_fixture",
    [
        {"database": Database.SNOWFLAKE},
        {"database": Database.BIGQUERY},
        {"database": Database.POSTGRES},
        {"database": Database.SQLITE},
    ],
    indirect=True,
    ids=["snowflake", "bigquery", "postgresql", "sqlite"],
)
def test_raw_sql_chained_queries(database_table_fixture, sample_dag):
    import pandas

    db, test_table = database_table_fixture

    @aql.run_raw_sql(conn_id=db.conn_id)
    def raw_sql_no_deps(new_table: Table, t_table: Table):
        """
        Let's test without any data dependencies, purely using upstream_tasks
        Returns:

        """
        return "CREATE TABLE {{new_table}} AS SELECT * FROM {{t_table}}"

    @aql.dataframe
    def validate(df1: pandas.DataFrame, df2: pandas.DataFrame):
        df1 = df1.sort_values(by=df1.columns.tolist()).reset_index(drop=True)
        df2 = df2.sort_values(by=df2.columns.tolist()).reset_index(drop=True)
        assert df1.equals(df2)

    with sample_dag:
        homes_file = aql.load_file(
            input_file=File(path=str(cwd) + "/../../data/homes.csv"),
            output_table=test_table,
        )
        generated_tables = []
        last_task = homes_file
        for _ in range(5):
            n_table = test_table.create_similar_table()
            n_task = raw_sql_no_deps(
                new_table=n_table, t_table=test_table, upstream_tasks=[last_task]
            )
            generated_tables.append(n_table)
            last_task = n_task

        validated = validate(
            df1=test_table, df2=generated_tables[-1], upstream_tasks=[last_task]
        )
        for table in generated_tables:
            aql.drop_table(table, upstream_tasks=[validated])

    test_utils.run_dag(sample_dag)
    all_tasks = sample_dag.tasks
    for t in all_tasks[1:]:
        assert len(t.upstream_task_ids) == 1
Is the getting started tutorial the best place for this documentation?
I feel it would be better placed in our Sphinx documentation as a reference page. I believe the idea of the tutorial was to walk users through a first example of the Python SDK without introducing too many possibilities. Could you confirm this, @mikeshwe?
@tatiana this is one of the most common questions we get asked (how can I use astro-sdk with traditional Airflow operators), so the getting started page seemed like a good place for it. Glad to discuss other options though.
I think this is fair. 👍