-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pod identity support for namespaces and per-resource scoped auth #3187
Conversation
Codecov Report
@@ Coverage Diff @@
## main #3187 +/- ##
==========================================
+ Coverage 54.42% 54.46% +0.04%
==========================================
Files 1428 1446 +18
Lines 609210 616083 +6873
==========================================
+ Hits 331582 335576 +3994
- Misses 223364 225541 +2177
- Partials 54264 54966 +702
|
AZURE_TENANT_ID: "$AZURE_TENANT_ID" | ||
AZURE_CLIENT_ID: "$IDENTITY_CLIENT_ID" | ||
USE_POD_IDENTITY_AUTH: "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already use a boolean for USE_WORKLOAD_IDENTITY
for global secret and omit that for scoped credentials. Suggestions on using the same here for consistency and making forcing the Managed Identity to be a default in absence of ClientSecret?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts on not doing the above :
- Users using Workload Identity would have to update their scoped secrets to add a boolean
- Pod Identity is deprecated, so did not want to use it as default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with having multiple boolean configurations like this is that we end up with oddities - what should happen if both USER_POD_IDENTITY_AUTH
and USE_WORKLOAD_IDENTITY
are true at the same time?
I'd like us to consider using a single new configuration setting and deprecating USE_WORKLOAD_IDENTITY
.
What if we introduced IDENTITY_AUTH
with permitted values of pod
, workload
or scoped
, with scoped
as the default if not set.
For back compat, USE_WORKLOAD_IDENTITY: "true"
would change the default to workload
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with having multiple boolean configurations like this is that we end up with oddities - what should happen if both USER_POD_IDENTITY_AUTH and USE_WORKLOAD_IDENTITY are true at the same time?
AFAIK we don't actually support USE_POD_IDENTITY_AUTH
and USE_WORKLOAD_IDENTITY_AUTH
on the same secrets.
USE_POD_IDENTITY_AUTH
is allowed only in the scoped credential secret (namespace or per-resource). It's not allowed at the global secret level from what I can tell.
USE_WORKLOAD_IDENTITY
is only at the global secret level, it's not supported at the per-resource/per-namespace secret (@super-harsh correct me if I am wrong here). Instead, it's always inferred if the user omits AZURE_CLIENT_SECRET
that they want workload identity.
So there's no technical ambiguity here even if there may be some user-ambiguity. It would be better if things were exactly the same between the different secret formats probably, but I do think how we got here is somewhat reasonable because:
- The global secret used the "if no
AZURE_CLIENT_SECRET
, assume AAD Pod Identity" approach, since it previously supported AAD Pod Identity. When we added workload identity support we needed a way to differentiate between the existing supported AAD Pod Identity and Workload Identity. We possibly should've added an enum instead of a boolean here, but the choice of "evolve without breaking" seems right to me. - The per-namespace and per-resource secrets didn't yet exist, so didn't support AAD Pod Identity, so didn't need a boolean until now. Now, we want to add support for AAD Pod Identity in a nonbrekaing way, so we need to differentiate between workload identity (already supported, chosen by omitting the
AZURE_CLIENT_SECRET
) and this new (legacy) mode of AAD Pod Identity.
I think the direction we want to head (eventually) is that AAD Pod Identity is fully deprecated and we can drop all these flags and just have the "if client_secret omitted, workload identity" - though maybe we want an enum just to allow future evolution. I don't see how we get:
- Global and per-namespace/per-resource secrets are consistently shaped (support the same fields/flags).
- No breaking changes from what we have today.
Without doing something like what @super-harsh has done here (though agree maybe boolean worse than enum)
AZURE_TENANT_ID: "$AZURE_TENANT_ID" | ||
AZURE_CLIENT_ID: "$IDENTITY_CLIENT_ID" | ||
USE_POD_IDENTITY_AUTH: "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with having multiple boolean configurations like this is that we end up with oddities - what should happen if both USER_POD_IDENTITY_AUTH and USE_WORKLOAD_IDENTITY are true at the same time?
AFAIK we don't actually support USE_POD_IDENTITY_AUTH
and USE_WORKLOAD_IDENTITY_AUTH
on the same secrets.
USE_POD_IDENTITY_AUTH
is allowed only in the scoped credential secret (namespace or per-resource). It's not allowed at the global secret level from what I can tell.
USE_WORKLOAD_IDENTITY
is only at the global secret level, it's not supported at the per-resource/per-namespace secret (@super-harsh correct me if I am wrong here). Instead, it's always inferred if the user omits AZURE_CLIENT_SECRET
that they want workload identity.
So there's no technical ambiguity here even if there may be some user-ambiguity. It would be better if things were exactly the same between the different secret formats probably, but I do think how we got here is somewhat reasonable because:
- The global secret used the "if no
AZURE_CLIENT_SECRET
, assume AAD Pod Identity" approach, since it previously supported AAD Pod Identity. When we added workload identity support we needed a way to differentiate between the existing supported AAD Pod Identity and Workload Identity. We possibly should've added an enum instead of a boolean here, but the choice of "evolve without breaking" seems right to me. - The per-namespace and per-resource secrets didn't yet exist, so didn't support AAD Pod Identity, so didn't need a boolean until now. Now, we want to add support for AAD Pod Identity in a nonbrekaing way, so we need to differentiate between workload identity (already supported, chosen by omitting the
AZURE_CLIENT_SECRET
) and this new (legacy) mode of AAD Pod Identity.
I think the direction we want to head (eventually) is that AAD Pod Identity is fully deprecated and we can drop all these flags and just have the "if client_secret omitted, workload identity" - though maybe we want an enum just to allow future evolution. I don't see how we get:
- Global and per-namespace/per-resource secrets are consistently shaped (support the same fields/flags).
- No breaking changes from what we have today.
Without doing something like what @super-harsh has done here (though agree maybe boolean worse than enum)
|
||
// IdentityAuthMode enum is used to determine if we're using Pod Identity or Workload Identity | ||
//authentication for namespace and per-resource scoped credentials | ||
IdentityAuthMode = "IDENTITY_AUTH_MODE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if it would make more sense to call this:
AUTH_MODE
and have the values be workloadidentity
, podidentity
, or (possibly in the future) sp
or cert
? Calling it IDENTITY_AUTH_MODE
might limit us?
workloadIdentity IdentityAuthModeOption = "workload" | ||
|
||
// IdentityAuthMode enum is used to determine if we're using Pod Identity or Workload Identity | ||
//authentication for namespace and per-resource scoped credentials |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//authentication for namespace and per-resource scoped credentials | |
// authentication for namespace and per-resource scoped credentials |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not actually updated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nitpick on validation of configuration, but otherwise looks good. Thanks for making the changes, much appreciated.
helm upgrade --install --devel aso2 aso2/azure-service-operator \ | ||
--create-namespace \ | ||
--namespace=azureserviceoperator-system \ | ||
--set azureSubscriptionID=$AZURE_SUBSCRIPTION_ID \ | ||
--set aadPodIdentity.enable=true \ | ||
--set aadPodIdentity.azureManagedIdentityResourceId=${IDENTITY_RESOURCE_ID} \ | ||
--set azureClientID=${IDENTITY_CLIENT_ID} \ | ||
--set crdPattern='resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this helm command present below this point - should it have been removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have moved it above on line 455
func authModeOrDefault(mode string) AuthModeOption { | ||
if mode == string(podIdentity) { | ||
return podIdentity | ||
} | ||
return workloadIdentity | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This treats anything other than podIdentity
as workloadIdentity
- including foo
, bang
, and oh-oh
.
A misconfigured value should at least trigger a log warning, if not an actual error.
I'm also a fan of being case insensitive for configuration values, though @matthchr doesn't always agree.
func authModeOrDefault(mode string) AuthModeOption { | |
if mode == string(podIdentity) { | |
return podIdentity | |
} | |
return workloadIdentity | |
} | |
func authModeOrDefault(mode string) (AuthModeOption, error) { | |
if strings.EqualFold(mode, string(podIdentity)) { | |
return podIdentity, nil | |
} | |
if strings.EqualFold(mode, string(workloadIdentity)) { | |
return workloadIdentity, nil | |
} | |
return "", errors.Errorf("authorization mode %q not valid", mode) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also a fan of being case insensitive for configuration values, though @matthchr doesn't always agree.
I don't have a philosophical objection here one way or the other. I do think that @theunrepentantgeek is correct we should have a warning or error if the user gets it wrong. I think from a consistency perspective, being case-sensitive here might make sense given other fields (such as enums in the CRDs themselves) are also case-sensitive, and case-sensitivity seems to be "the standard" in the Kubernetes world (look at labels which are case-sensitive, or names which only allow lowercase and thus are case-sensitive).
My dislike of case-insensitivity comes primarily from REST, where attempting to be case-insensitive is awkward because JSON keys are stereotypically case-sensitive (so the expectation is not necessarily case-insensitivity) and REST entities are both set by the user and returned to the user. In the "returned to the user" scenario, case insensitivity gets awkward because you want to preserve what they've sent you, but internally you want to store it in a single canonical form. Accomplishing this ends up being:
- More work in your backend (need to remember their case)
- Confusing. Which of their cases do you remember? The latest PUT? Or the first PUT? Or something else?
Postel's law ends up not really applying because usually in REST you have to return exactly what you accept (the expectation is that what you PUT is what you GET, basically), so you can't actually be liberal in what you accept but conservative in what you return. They are one and the same. You have to pick a winner and I tend to favor picking the easier to implement (and simpler to explain) approach and just give an error in all other cases.
type AuthModeOption string | ||
|
||
const ( | ||
podIdentity AuthModeOption = "pod" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to call it podidentity
and workloadidentity
?
@@ -34,6 +34,17 @@ const ( | |||
FederatedTokenFilePath = "/var/run/secrets/tokens/azure-identity" | |||
) | |||
|
|||
type AuthModeOption string | |||
|
|||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these constants be made available to import like #3171?
@@ -300,3 +326,15 @@ func (c *credentialProvider) getSecret(ctx context.Context, namespace string, se | |||
func getSecretNameFromAnnotation(credentialFrom string, resourceNamespace string) types.NamespacedName { | |||
return types.NamespacedName{Namespace: resourceNamespace, Name: credentialFrom} | |||
} | |||
|
|||
func authModeOrDefault(mode string) (config.AuthModeOption, error) { | |||
if strings.EqualFold(mode, string(config.WorkloadIdentityAuthMode)) || mode == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think it's more consistent if we're case-sensitive (see my other comment), but approving as-is as not gonna block it on this.
Closes #3167
What this PR does / why we need it:
This PR adds support for allowing single-operator multi-tenant environments to authenticate with different Managed Identities through AAD Pod Identity by adding another secret variable(USE_POD_IDENTITY_AUTH) for scoped credentials to instruct the operator to use Pod Identity authentication.
If applicable: