Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow impersonation_chain to be set on Google Cloud connection #33715

Merged
merged 1 commit into from
Aug 30, 2023

Conversation

melugoyal
Copy link
Contributor

@melugoyal melugoyal commented Aug 25, 2023

allow impersonation_chain to be configured as part of a Google Cloud connection. this will allow DAG authors to configure this once as part of their connection and have it be used by all tasks, rather than the current behavior of having to specify it when instantiating each operator. impersonation_chain set on operator instantiation takes precedence over the one configured on the connection.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Aug 25, 2023
@melugoyal melugoyal force-pushed the impersonation_chain_gcp_connection branch from 0403831 to ba87136 Compare August 25, 2023 04:08
@potiuk
Copy link
Member

potiuk commented Aug 25, 2023

I don't see a big problem with it (though it misses tests and they should be added).

I wonder if it won't have side effects. Currently Impersonation chain is really "per-operator" feature not "per-connection" one. Whichever operator supports impersonation chain, they pass it down to a relevant hook and then it is passed down to the hook that uses it. Indeed we started where impersonation chain was supported by just a handful of operators and services but apparently now - more of them have it and we could make it a connection feature indeed.

There are however some interesting back-compatibility problems. Users for example might expect that when they set it in the connection, it will be used for all services, but they might already have impersonation chain set in the operator (this is currently the only way they can do it.

So this "connection setting" might be misleading for past users and transition will be a bit painful because what the users will really have to do, in case they want to change it to a connection-driven impersonation chain - they will have to modify all the DAGs they currently have, otherwise this will be confusing because some of their dags might use the old way and some the new way.

So maybe - if we introduce we could come up with some ideas on how to make such migration easier. Quite for sure in order to add it, you will have to add a migration documentation - i.e. explain the users what they should do in case they need to convert to this new method, maybe even also some warnings and inforamation in logs that the "connection" impersonation is not being used and is overwritten. Not sure if that is that important, though.

I wonder what @VladaZakharova and @bkossakowska and maybe and other from the team of google operators think about it. Maybe also @hussein-awala who was implementing some Google connection improvements.

@melugoyal melugoyal force-pushed the impersonation_chain_gcp_connection branch from f59d182 to 4ab434b Compare August 28, 2023 15:37
@potiuk
Copy link
Member

potiuk commented Aug 28, 2023

You need to rebase to account for incompatible azure dependncy changes.

@melugoyal melugoyal force-pushed the impersonation_chain_gcp_connection branch from f1bac90 to 4750ab6 Compare August 28, 2023 18:47
@potiuk potiuk force-pushed the impersonation_chain_gcp_connection branch from 4750ab6 to d8fce0f Compare August 29, 2023 08:02
@melugoyal
Copy link
Contributor Author

@potiuk where would such a migration doc be added?

@potiuk
Copy link
Member

potiuk commented Aug 29, 2023

@potiuk
Copy link
Member

potiuk commented Aug 29, 2023

You can click "Suggest Changes on this page" to see where it comes from - you will get PR open in the right file when you use it

@melugoyal melugoyal force-pushed the impersonation_chain_gcp_connection branch from d8fce0f to aa11de4 Compare August 29, 2023 21:19
@melugoyal
Copy link
Contributor Author

@potiuk thanks, i was already modifying that file in this PR. i've added a note there about precedence

@potiuk
Copy link
Member

potiuk commented Aug 30, 2023

@potiuk thanks, i was already modifying that file in this PR. i've added a note there about precedence

Yeah. That's what I was about. The thing is that we are not only do a "docstring" information but we have "Howto" one where we describe more of "high-level" options you have without diving into specific parameters. The sentence you added was precisely what I was after, thank you!

@potiuk potiuk merged commit 075afe5 into apache:main Aug 30, 2023
@melugoyal melugoyal deleted the impersonation_chain_gcp_connection branch August 30, 2023 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:google Google (including GCP) related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants