Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TriggerAuthentication - spec.podIdentity.identityId as a pointer - breaking change in 2.12.0 #5109

Open
radekfojtik opened this issue Oct 20, 2023 · 27 comments
Labels
bug Something isn't working stale-bot-ignore All issues that should not be automatically closed by our stale bot

Comments

@radekfojtik
Copy link

Report

After upgrading to version 2.12.0, you need to recreate the TriggerAuthentication resource (if you are using PodIdentityProviderAzure \ PodIdentityProviderAzureWorkload with an unspecified identityId). Because the identityId field is a pointer. Otherwise you will get this error "IdentityID of PodIdentity should not be empty".

Before version 2.12.0 - if you omit the identityId field, your resource will look like this

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  annotations:
    meta.helm.sh/release-name: my-release-name
    meta.helm.sh/release-namespace: my-namespace
  creationTimestamp: xxx
  finalizers:
  - finalizer.keda.sh
  labels:
    app.kubernetes.io/managed-by: Helm
  name: my-name
  namespace: my-namespace
spec:
  podIdentity:
    identityId: ""
    provider: azure-workload

The identityId field is an empty string, although it has been omitted.

The solution is to recreate TriggerAuthentication. This is a breaking change that should at least be pointed out, or should be in version 3.*?

Expected Behavior

Only minor changes

Actual Behavior

Breaking change

Steps to Reproduce the Problem

  1. Use TriggerAuthentication - provider PodIdentityProviderAzure/PodIdentityProviderAzureWorkload and omit the identityId (optional)
  2. Update KEDA to version 2.12.0
  3. You will get this error "IdentityID of PodIdentity should not be empty"

Logs from KEDA operator

No response

KEDA Version

2.12.0

Kubernetes Version

1.26

Platform

Microsoft Azure

Scaler Details

No response

Anything else?

No response

@radekfojtik radekfojtik added the bug Something isn't working label Oct 20, 2023
@JorTurFer
Copy link
Member

@tomkerkhove @SpiritZhou

@tomkerkhove
Copy link
Member

Can you elaborate on how the trigger authentication resource was created please? Are you saying you are specifying "" or you are not and it was added by default? If the latter, in I presume you mean in v2.11?

@radekfojtik
Copy link
Author

Sure.

In v2.11

  • the identityId field is omitted (not specified) - see the dryRun output

image

  • it is added by default (maybe later by operator?) - the resource looks like this
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  annotations:
    meta.helm.sh/release-name: addresslookup-uat-jobrunner
    meta.helm.sh/release-namespace: acquisition-uat
  creationTimestamp: "2023-10-20T18:49:01Z"
  finalizers:
  - finalizer.keda.sh
  generation: 2
  labels:
    app.kubernetes.io/managed-by: Helm
  name: keda-operator-pod-identity-addresslookup-uat-jobrunner
  namespace: acquisition-uat
  resourceVersion: xxx
  uid: xxx
spec:
  podIdentity:
    identityId: ""
    provider: azure-workload

This means that after upgrading KEDA to v2.12 it is necessary to re-create TriggerAuthentication. Otherwise you will get this error "IdentityID of PodIdentity should not be empty".

@tomkerkhove
Copy link
Member

Hm I was not aware of that and it is unfortunate.

Here is what I think we should do:

  1. Revert the change we made
  2. Change existing code to no longer use "" but use nil by default
  3. Ship hotfix in to 2.12.1
  4. Re-apply original change

That way, end-users will no longer face this issue when we ship 2.13.0 and this validation can still be used. If end-users face this issue with 2.13.0 they need to migrate to 2.12.1 first before going to 2.13.0.

THoughts @zroubalik @JorTurFer @SpiritZhou?

@radekfojtik
Copy link
Author

radekfojtik commented Oct 24, 2023

@tomkerkhove I am not sure if this helps, but I could be wrong. Once the identityId is set to "" (v2.11), it will never be nil again until the TriggerAuthentication object is recreated.

@SpiritZhou
Copy link
Contributor

@tomkerkhove @JorTurFer In 2.11, the triggerauth object will be updated with an "" value when setting Finalizer in EnsureAuthenticationResourceFinalizer. I think the better way is to check if identityId is "" and then update it with nil value on reconcile. We don't need to revert the change, as the webhook will validate new requests and the nil value update will only affect the triggerauth objects during the KEDA upgrade.

@tomkerkhove
Copy link
Member

What do you think @JorTurFer?

@radekfojtik
Copy link
Author

I thought of a slightly different alternative (there would be no need for the operator to update the value to nil)

  1. Keep the validation only in webhook (remove from operator)
  2. Webhook - edit ValidateUpdate func - possibly validateSpec

image

This should ensure that TAs created before version 2.12 will not be validated, but newer ones will.

@radekfojtik
Copy link
Author

guys, here is fix for a related problem with identityId validation kedacore/charts#553

@SpiritZhou
Copy link
Contributor

I thought of a slightly different alternative (there would be no need for the operator to update the value to nil)

  1. Keep the validation only in webhook (remove from operator)
  2. Webhook - edit ValidateUpdate func - possibly validateSpec

image

This should ensure that TAs created before version 2.12 will not be validated, but newer ones will.

I think this is also a good solution.

@tomkerkhove tomkerkhove moved this from To Triage to In Progress in Roadmap - KEDA Core Oct 26, 2023
@zroubalik
Copy link
Member

guys, here is fix for a related problem with identityId validation kedacore/charts#553

nice, could you please give an explanation for this change?

@radekfojtik
Copy link
Author

guys, here is fix for a related problem with identityId validation kedacore/charts#553

nice, could you please give an explanation for this change?

If KEDA is deployed using helm, the configuration for the validation webhook is not created (TriggerAuthentication/ClusterTriggerAuthentication). So the identityId validation on the webhook side will not work.

This fix adds the configuration that is needed for TriggerAuthentication/ClusterTriggerAuthentication validation.

@zroubalik
Copy link
Member

zroubalik commented Oct 27, 2023

@radekfojtik cool, does it fully fix this issue then?

@radekfojtik
Copy link
Author

@radekfojtik cool, does it fully fix this issue then?

Unfortunately no, it only solves the related problem that the webhook configuration (for validation of TriggerAuthentication/ClusterTriggerAuthentication) was not added to helmchart.

This issue is about breaking change in v2.12 releated to this identityId validation. In short - if user has a TA defined where the provider is azure | azure-workload and the identityId is not specified (omitted) then scaling stops working (after the upgrade to v2.12) - keda operator error "IdentityID of PodIdentity should not be empty"

Two possible solutions are currently proposed

@JorTurFer
Copy link
Member

I think that we should remove validations from operator. IMO, if we are validating the item on create/update, we don't need to validate it again as on operator. We have introduced admission webhooks, lets use them.

I get the point that forcing users to use the admission is "ugly", but that's the goal of the admission webhooks, detecting issues during the admission. If users don't care about validations, they can remove the webhooks, but it also means that they do need to validate the things from their side, it's a trade off of their decision.
No webhooks? no validations

@radekfojtik
Copy link
Author

@JorTurFer please take a look 63dca49 (based on the branch release/v2.12)

@JorTurFer
Copy link
Member

@JorTurFer please take a look 63dca49 (based on the branch release/v2.12)

It looks good to me

@JorTurFer
Copy link
Member

But, please open the PR to main branch, to the the release/v2.12 branch, we will cherry pick it

@radekfojtik
Copy link
Author

radekfojtik commented Oct 31, 2023

But, please open the PR to main branch, to the the release/v2.12 branch, we will cherry pick it

@JorTurFer

here is the PR #5142

please take a look also kedacore/charts#553

Copy link

stale bot commented Jan 1, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Jan 1, 2024
Copy link

stale bot commented Jan 9, 2024

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Jan 9, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Ready To Ship in Roadmap - KEDA Core Jan 9, 2024
@zroubalik
Copy link
Member

@JorTurFer what shall we do about this?

@zroubalik zroubalik reopened this Jan 10, 2024
@github-project-automation github-project-automation bot moved this from Ready To Ship to Proposed in Roadmap - KEDA Core Jan 10, 2024
@stale stale bot removed the stale All issues that are marked as stale due to inactivity label Jan 10, 2024
@tusharsappal
Copy link

Any updates here when this will be tackled ?

Copy link

stale bot commented Apr 1, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Apr 1, 2024
@amirschw
Copy link
Contributor

amirschw commented Apr 2, 2024

Not stale

@stale stale bot removed the stale All issues that are marked as stale due to inactivity label Apr 2, 2024
@tomkerkhove tomkerkhove added the stale-bot-ignore All issues that should not be automatically closed by our stale bot label Apr 3, 2024
@aamarnath12
Copy link

We are still facing this issue in keda 2.14.3 and AKS 1.28

@radekfojtik
Copy link
Author

Guys @JorTurFer @zroubalik @tomkerkhove , why is this not merged yet? The fix has been prepared for a long time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stale-bot-ignore All issues that should not be automatically closed by our stale bot
Projects
Status: Proposed
Development

Successfully merging a pull request may close this issue.

8 participants