
Add deferrable param in EmrContainerSensor #30945

Merged
merged 2 commits into apache:main from async_emr_container_sensor
Jun 19, 2023

Conversation

pankajastro (Member):

Add the deferrable param in EmrContainerSensor.
This allows running EmrContainerSensor in an async way: the job check is submitted from the worker, which then defers to the trigger; the trigger polls and waits for the job status, so the worker slot is not occupied for the whole period of task execution. (A minimal usage sketch follows the checklist below.)


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.
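
For anyone trying this out, here is a minimal, hedged usage sketch. The DAG id, virtual cluster id, job id, and poll settings below are placeholders rather than values from this PR; the remaining parameter names come from the existing EmrContainerSensor signature, plus the new deferrable flag added here.

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.sensors.emr import EmrContainerSensor

with DAG(
    dag_id="emr_container_deferrable_example",  # placeholder DAG id
    start_date=datetime(2023, 6, 1),
    schedule=None,
    catchup=False,
):
    # Wait for an EMR on EKS job run without holding a worker slot:
    # with deferrable=True the task defers to the triggerer, which polls the job status.
    wait_for_job = EmrContainerSensor(
        task_id="wait_for_emr_container_job",
        virtual_cluster_id="vc-1234567890abcdef",  # placeholder virtual cluster id
        job_id="jr-0987654321fedcba",              # placeholder job run id
        poll_interval=30,                          # seconds between status checks
        max_retries=20,                            # maximum number of status checks
        deferrable=True,                           # new param added by this PR
    )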

boring-cyborg bot added the area:providers and provider:amazon-aws (AWS/Amazon - related issues) labels Apr 28, 2023
@pankajastro marked this pull request as ready for review April 28, 2023 20:26
@syedahsn (Contributor) left a comment:

Looks good! Just a few minor comments.

Review comments on:
airflow/providers/amazon/aws/sensors/emr.py (outdated, resolved)
airflow/providers/amazon/aws/triggers/emr.py (outdated, resolved)
airflow/providers/amazon/aws/triggers/emr.py (resolved)
airflow/providers/amazon/aws/waiters/emr-containers.json (outdated, resolved)
@pankajastro marked this pull request as draft May 26, 2023 08:49
@pankajastro force-pushed the async_emr_container_sensor branch 2 times, most recently from 40c947f to 619ab51 on May 26, 2023 09:37
@pankajastro marked this pull request as ready for review May 26, 2023 12:32
@pankajastro force-pushed the async_emr_container_sensor branch 2 times, most recently from e25f107 to fd0db14 on June 5, 2023 10:00
async with self.hook.async_conn as client:
    waiter = self.hook.get_waiter("container_job_complete", deferrable=True, client=client)
    attempt = 0
    while attempt < self.max_attempts:
A Contributor commented on the snippet above:
Though I have seen it elsewhere, to me, putting "max_attempts" as a trigger parameter doesn't make sense.

The "right" way to set a time limit on a trigger is the deferral timeout. IMO, if an existing operator has max_attempts param, then we should just calculate the deferral timeout based on that number and use that.

This avoids the added complexity in the trigger and the extra signature param, and it avoids the odd behaviour where, if max attempts is used and a triggerer dies and the trigger is picked up again on another machine, the attempt count starts from zero again and the task gets killed by the deferral timeout anyway.

What do you think about this @pankajastro ?

@pankajastro (Member, author) replied:
This is a good point, thank you for raising it.

Honestly, I didn't think about the possibility that a trigger can restart while writing this. I'll fix it.

A Contributor replied:
This was a point that was brought up in a previous PR (by @dstandish :D), and the solution we went with was to essentially use both. Set a timeout on the operator level that is computed from the given parameters (as well as a 60 second buffer), but also use the number of attempts as a metric. I think it is beneficial to have the number of attempts used as a metric in the Triggers, but it is definitely a good idea to have the timeout set on the operator level in case of a Trigger restart, as mentioned above.

@pankajastro (Member, author) replied:
Added the timeout while deferring, to handle the case where "a triggerer dies and the trigger is picked up again on another machine".

Removed max_attempts from the trigger to reduce the code maintenance.
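
For context, a rough sketch of the approach described above, not the exact code merged in this PR: compute the deferral timeout on the operator side from the existing polling parameters plus a small buffer, and pass it to self.defer instead of carrying max_attempts into the trigger. The trigger class name and its arguments below are assumptions for illustration; self.defer(trigger=..., method_name=..., timeout=...) is the standard Airflow deferral call.

from datetime import timedelta

# Inside the sensor's execute(), when deferrable=True (illustrative sketch only):
timeout = timedelta(seconds=self.poll_interval * self.max_retries + 60)  # 60 s buffer
self.defer(
    trigger=EmrContainerTrigger(  # assumed trigger class name from triggers/emr.py
        virtual_cluster_id=self.virtual_cluster_id,
        job_id=self.job_id,
        aws_conn_id=self.aws_conn_id,
        poll_interval=self.poll_interval,
    ),
    method_name="execute_complete",
    timeout=timeout,  # bounds total wait time even if the triggerer restarts
)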

@pankajastro force-pushed the async_emr_container_sensor branch 5 times, most recently from 1244bd9 to b1edbf7 on June 8, 2023 10:17
@pankajastro force-pushed the async_emr_container_sensor branch 5 times, most recently from a56de48 to 1bcbf8e on June 19, 2023 10:15
@pankajastro reopened this Jun 19, 2023
@pankajastro merged commit f0b91ac into apache:main Jun 19, 2023
@pankajastro deleted the async_emr_container_sensor branch June 19, 2023 19:38