Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extra operator links for EMR Serverless #34225

Merged
merged 25 commits into from
Feb 13, 2024

Conversation

dacort
Copy link
Contributor

@dacort dacort commented Sep 8, 2023

closes: #31620

  • Includes Dashboard UI, Spark stdout logs, S3 and CloudWatch consoles
  • Only shows links relevant to the job
  • Disables Dashboard UI and Spark stdout logs by default as the generated link is a pre-signed url
  • Passes in aws_conn_id from the operator hook so the proper credentials are used

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

- Includes Dashboard UI, S3 and CloudWatch consoles
- Only shows links relevant to the job
@boring-cyborg boring-cyborg bot added area:providers provider:amazon-aws AWS/Amazon - related issues labels Sep 8, 2023
airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
@eladkal
Copy link
Contributor

eladkal commented Oct 24, 2023

@dacort are you still working on it?

@dacort
Copy link
Contributor Author

dacort commented Oct 24, 2023

@eladkal Yes, let me check out the failing tests.

@dacort
Copy link
Contributor Author

dacort commented Oct 24, 2023

@eladkal Ah yes, that's right, I was trying to figure out a better way to pass the operator in, but I think I'll just pass in the aws_conn_id as a string instead.

@dacort dacort marked this pull request as ready for review October 25, 2023 05:00
Copy link
Contributor

@o-nikolas o-nikolas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this Damon, looking good! Left some comments

airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
airflow/providers/amazon/aws/links/emr.py Outdated Show resolved Hide resolved
Copy link

github-actions bot commented Jan 7, 2024

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 7, 2024
@o-nikolas
Copy link
Contributor

Hey @dacort, this one looks like it was close but is getting marked as stale. Do you still have time to finish it off?

@dacort
Copy link
Contributor Author

dacort commented Jan 8, 2024

Yes, let me try to resolve existing conflicts and get it over the finish line. :)

@dacort
Copy link
Contributor Author

dacort commented Jan 17, 2024

How does this work in deferrable mode?

persist_links is called before the self.defer call is triggered for EmrServerlessStartJobTrigger. I'm not too familiar with deferrable mode - would this allow the necessary link context to be populated before deferring?

I'll test this locally as well. Tested locally and seems to work fine - the links are populated and work while the job is running.

Copy link
Contributor

@syedahsn syedahsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! LGTM

@eladkal
Copy link
Contributor

eladkal commented Jan 19, 2024

@dacort @o-nikolas is anything else needed or can we merge this PR?

@o-nikolas
Copy link
Contributor

@dacort @o-nikolas is anything else needed or can we merge this PR?

@eladkal, the last remaining item is adding some unit testing for this code (thread here)

@dacort
Copy link
Contributor Author

dacort commented Jan 22, 2024

adding some unit testing

ack - will see if I can get that pushed up today!

@dacort
Copy link
Contributor Author

dacort commented Jan 22, 2024

@eladkal @o-nikolas Added some additional unit tests to cover the various cases and merge in latest main.

Several of the DB tests are failing, but seem unrelated to my changes.

@o-nikolas
Copy link
Contributor

Several of the DB tests are failing, but seem unrelated to my changes.

Main is broken right now, you'll at least need to wait for this PR to be merged and then rebase your branch.

@dacort
Copy link
Contributor Author

dacort commented Jan 23, 2024

Sweet, rebased and all checks have passed!

@o-nikolas
Copy link
Contributor

Only one conversation left open that needs resolution and then I think we're ready to merge!

@eladkal
Copy link
Contributor

eladkal commented Feb 10, 2024

@dacort can you rebase and resolve conflicts? I'll merge after

@eladkal
Copy link
Contributor

eladkal commented Feb 12, 2024

@dacort static checks are failing due to adding D401 validation

@dacort
Copy link
Contributor Author

dacort commented Feb 12, 2024

@eladkal Ah, thank you, had missed that failure reason. Will get that addressed today!

@Taragolis Taragolis merged commit b5b452b into apache:main Feb 13, 2024
58 checks passed
@dacort dacort deleted the feature/emr-serverless-extra-links branch February 13, 2024 13:55
sunank200 pushed a commit to astronomer/airflow that referenced this pull request Feb 21, 2024
* Add extra operator links for EMR Serverless

- Includes Dashboard UI, S3 and CloudWatch consoles
- Only shows links relevant to the job

* Fix imports and add context mock to tests

* Move TYPE_CHECK

* Remove unused variables

* Pass in connection ID string instead of operator

* Use mock.MagicMock

* Disable application UI logs by default

* Update doc lints

* Update airflow/providers/amazon/aws/links/emr.py

Co-authored-by: Andrey Anshin <Andrey.Anshin@taragol.is>

* Support dynamic task mapping

* Lint/static check fixes

* Update review comments

* Configure get_dashboard call for EMR Serverless to only retry once

* Whitespace

* Add unit tests for EMRS link generation

* Address D401 check

* Refactor get_serverless_dashboard_url into its own method, add link tests

* Fix lints

---------

Co-authored-by: Andrey Anshin <Andrey.Anshin@taragol.is>
abhishekbhakat pushed a commit to abhishekbhakat/my_airflow that referenced this pull request Mar 5, 2024
* Add extra operator links for EMR Serverless

- Includes Dashboard UI, S3 and CloudWatch consoles
- Only shows links relevant to the job

* Fix imports and add context mock to tests

* Move TYPE_CHECK

* Remove unused variables

* Pass in connection ID string instead of operator

* Use mock.MagicMock

* Disable application UI logs by default

* Update doc lints

* Update airflow/providers/amazon/aws/links/emr.py

Co-authored-by: Andrey Anshin <Andrey.Anshin@taragol.is>

* Support dynamic task mapping

* Lint/static check fixes

* Update review comments

* Configure get_dashboard call for EMR Serverless to only retry once

* Whitespace

* Add unit tests for EMRS link generation

* Address D401 check

* Refactor get_serverless_dashboard_url into its own method, add link tests

* Fix lints

---------

Co-authored-by: Andrey Anshin <Andrey.Anshin@taragol.is>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add extra links for EMR Serverless
5 participants