Create upstream_tasks parameter for dependencies independent of data transfers #585
Codecov Report
@@ Coverage Diff @@
## main #585 +/- ##
==========================================
- Coverage 93.26% 93.12% -0.14%
==========================================
Files 43 45 +2
Lines 1855 1877 +22
Branches 232 237 +5
==========================================
+ Hits 1730 1748 +18
- Misses 97 100 +3
- Partials 28 29 +1
@dimberman, I believe this PR does not close the original problem reported by #585, which is that comments at the beginning of the SQL statement were not working.
It does solve the issue of chaining run_raw_sql statements and operators which return None in the XComArg. We should probably log a GitHub issue for this and link it to this PR.
@@ -392,3 +392,41 @@ or

In all scenarios, even if the user gives a non-temporary table, only temporary
tables will actually be deleted.

## Tying Astro SDK decorators to traditional Airflow Operators |
Is the getting started tutorial the best place for this documentation?
I feel it would be better placed within our Sphinx documentation as a reference page. I believe the idea of the tutorial was to walk users through a first example of the Python SDK without introducing too many possibilities. Could you confirm this, @mikeshwe?
@tatiana, this is one of the most common questions we get asked (how can I use astro-sdk with traditional Airflow operators?), so it seemed like the getting started page was a good place for it. Glad to discuss other options though.
I think this is fair. 👍
Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
LGTM 👍
Description
What is the current behavior?
Currently there is no "native" way to set dependencies between tasks without passing either a table or a dataframe between said tasks. This is especially apparent when chaining raw_sql functions, such as in #569.
In order to have a solution that works, we need something that feels Pythonic and doesn't rely on traditional Airflow >> operators.
closes: #569
What is the new behavior?
This PR adds the ability to set dependencies using a parameter, upstream_tasks. This parameter can be used at runtime to set dependencies without passing data. Notably, we don't want to actually render the data (imagine a use case where a task relies on 10 dataframe tasks).
Does this introduce a breaking change?
No
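To illustrate the idea behind upstream_tasks, here is a minimal, library-free sketch. It is not the Astro SDK implementation: the Task class and run helper are hypothetical names invented for this example, and only the upstream_tasks parameter name comes from this PR. The sketch shows a task graph where ordering can come either from passing a result (a data dependency) or from listing a task in upstream_tasks, in which case the upstream result is never handed to the downstream function.

```python
from graphlib import TopologicalSorter


class Task:
    """Hypothetical stand-in for a decorated task (not an Astro SDK class)."""

    def __init__(self, name, func, args=(), upstream_tasks=()):
        self.name = name
        self.func = func
        self.args = tuple(args)  # data dependencies: results are passed in
        self.upstream_tasks = tuple(upstream_tasks)  # ordering only, no data

    def dependencies(self):
        # A task waits on every Task used as an argument plus every
        # task listed in upstream_tasks.
        data_deps = [a for a in self.args if isinstance(a, Task)]
        return data_deps + list(self.upstream_tasks)


def run(tasks):
    """Run tasks in dependency order; return (execution order, results)."""
    graph = {task: task.dependencies() for task in tasks}
    results, order = {}, []
    for task in TopologicalSorter(graph).static_order():
        # Only data dependencies are resolved into values; the results of
        # upstream_tasks are deliberately never rendered or passed along.
        resolved = [results[a] if isinstance(a, Task) else a for a in task.args]
        results[task] = task.func(*resolved)
        order.append(task.name)
    return order, results


# Raw-SQL-like steps that return None but must still run in sequence.
create = Task("create_table", lambda: None)
insert = Task("insert_rows", lambda: None, upstream_tasks=[create])
count = Task("count_rows", lambda: 3, upstream_tasks=[insert])
# A regular data dependency for comparison: the result of count is passed in.
double = Task("double", lambda n: n * 2, args=[count])

order, results = run([double, count, insert, create])
print(order)
```

The design point is that a task depending on many upstream tasks only needs them to have finished; keeping upstream_tasks separate from the function arguments avoids loading their outputs.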
Checklist
upstream_tasks