Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add deferrable mode to BeamRunPythonPipelineOperator #31471

Merged
merged 4 commits into from
Jul 13, 2023

Conversation

VladaZakharova
Copy link
Contributor

This PR adds the ability to start BeamRunPythonOperator asynchronously using deferrable mode. The documentation was updated with the description of how to work with deferrable parameter.

Additionally, the existing implementation was fixed:

  • for the BeamRunPythonOperator unused call of dataflow_hook.wait_for_done was removed, since the waiting process for the pipeline to be successfully finished was covered by the beam_hook.start_python_pipeline itself: this method creates the process to execute command in Apache Beam and waits for its end of execution in parallel with outputting logs. The same approach was used while implementing deferrable mode for the Operator, but using asynchronous calls.
  • version of Apache-beam package in system tests was updated

^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@VladaZakharova
Copy link
Contributor Author

Hi @potiuk !
Could you please take a look on this PR or add someone here to review changes?
Thanks!

@potiuk
Copy link
Member

potiuk commented Jun 8, 2023

Some docstrings need fixing as we already added D400 check for docstrings.

@VladaZakharova
Copy link
Contributor Author

@potiuk
Hi there :)
I have rebased and fixed static checks, could you please verify? Thanks!

@eladkal eladkal changed the title Add deferrable mode to BeamRunPythonPipelineOperator Add deferrable mode to BeamRunPythonPipelineOperator Jul 9, 2023
@potiuk
Copy link
Member

potiuk commented Jul 9, 2023

There is one more "default deferrable" to fix.

@VladaZakharova
Copy link
Contributor Author

@potiuk
Hi there! Could you please recheck this PR after fixing the "deferrable" parameter? Thanks a lot!

@VladaZakharova
Copy link
Contributor Author

@potiuk @pankajastro
Hi! Do we have anything else to fix here?

@pankajastro
Copy link
Member

@potiuk @pankajastro
Hi! Do we have anything else to fix here?

Hi just posted one small question otherwise looks good,

@VladaZakharova
Copy link
Contributor Author

@pankajastro
I have fixed this small mistake, could you please check? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants