Fix logic to cancel the external job if the TaskInstance is not in a running or deferred state for DataprocSubmitJobOperator #39447

Merged (5 commits) on May 8, 2024

sunank200
Collaborator

PR #39230 introduced handling of asyncio.CancelledError in a try/except block. That approach is unsafe: for DataprocSubmitJobOperator it also cancels the external job when the triggerer restarts or crashes. This can cause unexpected behaviour, such as deferred operators being rescheduled, because Airflow is unaware that the job was cancelled.

As a workaround, when asyncio.CancelledError is caught, the external job is cancelled only if the TaskInstance is not in a running or deferred state. This prevents premature termination of the external job.
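The conditional cancellation described above can be sketched with plain asyncio. This is a minimal illustration, not the code merged in this PR: `FakeJobClient`, `poll_job`, `simulate`, the `SAFE_STATES` set, and the `"failed"` state used in the usage note are all hypothetical stand-ins for the provider's trigger, hook, and TaskInstance state lookup.

```python
import asyncio

# Hypothetical stand-in for the TaskInstance states in which the job must be kept.
SAFE_STATES = {"running", "deferred"}


class FakeJobClient:
    """Illustrative stand-in for a Dataproc hook; it only records cancel requests."""

    def __init__(self):
        self.cancelled = []

    def cancel_job(self, job_id):
        self.cancelled.append(job_id)


async def poll_job(job_id, client, get_task_state, poll_interval=0.01):
    """Poll until cancelled; cancel the external job only when it is safe to do so."""
    try:
        while True:
            await asyncio.sleep(poll_interval)
    except asyncio.CancelledError:
        # A triggerer restart or crash leaves the task deferred, so the job is kept.
        if get_task_state() not in SAFE_STATES:
            client.cancel_job(job_id)
        raise  # always let the cancellation propagate


async def simulate(task_state):
    """Start polling, cancel the polling task, and report which jobs were cancelled."""
    client = FakeJobClient()
    task = asyncio.create_task(poll_job("job-1", client, lambda: task_state))
    await asyncio.sleep(0.03)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return client.cancelled
```

Under these assumptions, `asyncio.run(simulate("deferred"))` leaves the external job untouched, while `asyncio.run(simulate("failed"))` records a cancel request for `job-1`.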

More details at: #36090 (comment)



@Lee-W
Member

Lee-W commented May 7, 2024

Just had a quick discussion with @sunank200. We'll yield the event if CancelledError is raised, and handle the TaskInstance state check and cancellation in execute_complete.

@sunank200 sunank200 force-pushed the DataprocSubmitJobOperatorFix branch from fa0d5b3 to 8e90e15 Compare May 7, 2024 14:00
@sunank200
Collaborator Author

> Just had a quick discussion with @sunank200. We'll yield the event if CancelledError is raised, and handle the TaskInstance state check and cancellation in execute_complete.

This approach won't work, as execute_complete won't be called when the task is cancelled.

@sunank200 sunank200 requested a review from Lee-W May 7, 2024 15:00
@sunank200
Collaborator Author

#39447 (comment)

@Lee-W changed it

@sunank200 sunank200 force-pushed the DataprocSubmitJobOperatorFix branch from afa4406 to 0e8663c Compare May 7, 2024 15:01
@sunank200 sunank200 force-pushed the DataprocSubmitJobOperatorFix branch from 0e8663c to f8cc232 Compare May 8, 2024 06:47
@Lee-W Lee-W merged commit 387acd0 into apache:main May 8, 2024
39 checks passed
@Lee-W Lee-W deleted the DataprocSubmitJobOperatorFix branch May 8, 2024 08:55
pateash pushed a commit to pateash/airflow that referenced this pull request May 13, 2024
…running or deferred state for DataprocSubmitJobOperator (apache#39447)
Labels
area:providers provider:google Google (including GCP) related issues