Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit Events on THC (mis)configuration #2111

Merged
merged 2 commits into from
Jun 6, 2023

Conversation

DamianSawicki
Copy link
Contributor

The purpose of the PR is to emit Events on Service/Ingress objects when the UHC health check for the Service has been successfully configured with the Transparent Health Checks (THC) feature or such a configuration has been attempted.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 10, 2023
@k8s-ci-robot k8s-ci-robot requested review from aojea and code-elinka May 10, 2023 17:26
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 10, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @DamianSawicki. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 10, 2023
@DamianSawicki
Copy link
Contributor Author

/assign kl52752

if sp.BackendConfig != nil && sp.BackendConfig.Spec.HealthCheck != nil {
return
}

THCEnabled, err := annotations.FromService(svc).ShouldEnableTHC()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this line at the beginning of the function. I'm not sure if this is buggy or just confusing. And if you are not doing early returns looks like you don't need those defer at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, without early returns defer is not really needed. I haven't removed it yet because if we are asked to introduce another flag, then defer will be useful and the flag will be checked between defer and reading the annotation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -54,6 +64,7 @@ type ServicePort struct {
L4RBSEnabled bool
L7ILBEnabled bool
THCEnabled bool
THCEvents THCEvents
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of THCEnabled and THCEvents can you do THCConfiguration and put it there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@DamianSawicki DamianSawicki force-pushed the event-THC-creation branch 2 times, most recently from 0598811 to 1248b94 Compare May 12, 2023 13:38
@code-elinka
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 12, 2023
@DamianSawicki
Copy link
Contributor Author

/test all

@DamianSawicki DamianSawicki requested a review from kl52752 May 15, 2023 09:31
@DamianSawicki DamianSawicki changed the title Emit Events on THC (mis)configuration [WIP] Emit Events on THC (mis)configuration May 19, 2023
@DamianSawicki DamianSawicki marked this pull request as ready for review May 19, 2023 14:27
@DamianSawicki
Copy link
Contributor Author

/assign aojea

@aojea
Copy link
Member

aojea commented Jun 1, 2023

I have some concerns about the supportability of this approach in the future because we have to modify several functions signatures just to cascade down and up the information, I can see two possible problems:

  • problem to scale for new features and we keep adding new parameters and return values, maybe it can be solved using an object that we can add new fields without modifying the methods?
  • risk of breaking the feature, this can be alleviated adding unit test that covers the feature end to end

This is basically using events to validate the Ingress object against a flag, I spent some time with @DamianSawicki trying to look for another approaches but we couldn't come up with good alternatives, maybe a more modular approach adding a new validation method and validate the Ingress and all the Services there before calling TranslateIngress() ? Since everything is from the cache I can't think this can be a performance problem, but I'm not very familiar with this controller

@DamianSawicki @swetharepakula @kl52752 from the functional pov this lgtm, I defer to you how to proceed, I'm fine with whatever you decide

@DamianSawicki
Copy link
Contributor Author

Thanks again @aojea for looking at it with me!

problem to scale for new features and we keep adding new parameters and return values, maybe it can be solved using an object that we can add new fields without modifying the methods?

You’re right, we need to be frugal with new parameters and return values. I’ve decreased the number of parameters of healthChecks.sync by using the type utils.THCConfiguration instead of the types of its two fields. After this change, the present PR does not increase the number of parameters of any function.

The PR does increase the number of return values of translator.(*Translator).TranslateIngress and a non-exported method called by TranslateIngress, namely (*Translator).getServicePort. The extra return value indicates a warning condition because I prefer not to put a warning in a slice of errors (because then we couldn’t simply test the emptiness of the slice). This solution should scale well: with more features, the warning return value can become more complex, but TranslateIngress should always return 1) a positive-scenario value, 2) an error value, 3) a warning value. If you prefer, I can put the warning information in the error slice or replace the error slice with a struct with separate fields for errors and warnings.

I spent some time with @DamianSawicki trying to look for another approaches but we couldn't come up with good alternatives, maybe a more modular approach adding a new validation method and validate the Ingress and all the Services there before calling TranslateIngress() ?

I had another look at the code and it seems that TranslateIngress and getServicePort already serve for validation purposes: getServicePort can return errors.ErrBadSvcType, errors.ErrSvcAppProtosParsing, errors.ErrSvcBackendConfig, or errors.ErrBackendConfigValidation. So I guess it makes sense to keep validation inside TranslateIngress.

risk of breaking the feature, this can be alleviated adding unit test that covers the feature end to end

The risk is not severe as the feature is only about giving feedback to the User.

Still, the PR has a unit test translator.TestSetThcOptInOnSvc for translator.setThcOptInOnSvc, which contains the core logic. There is also healthchecks.TestNotifyAboutTHC testing Event emission on Services. However, in our discussions with @aojea we focused on event emissions on Ingresses, and while the above-mentioned translator.TestSetThcOptInOnSvc tests the core logic, I do not see an easy way to inject a fake Recorder to controller.LoadBalancerController to really check Event emission in controller.(*LoadBalancerController).sync. What seems feasible is testing TranslateIngress, which is the step right at the top of controller.(*LoadBalancerController).sync in the call stack, so it would cover the forwarding setThcOptInOnSvc -> getServicePort and the aggregation getServicePort -> TranslateIngress.

@DamianSawicki DamianSawicki changed the title [WIP] Emit Events on THC (mis)configuration Emit Events on THC (mis)configuration Jun 2, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2023
@DamianSawicki
Copy link
Contributor Author

/unassign aojea
/assign swetharepakula

@k8s-ci-robot k8s-ci-robot assigned swetharepakula and unassigned aojea Jun 2, 2023
Copy link
Member

@swetharepakula swetharepakula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR LGTM. Mostly nits and suggestions

pkg/controller/translator/translator.go Outdated Show resolved Hide resolved
pkg/controller/translator/translator.go Outdated Show resolved Hide resolved
pkg/controller/translator/translator_test.go Outdated Show resolved Hide resolved
pkg/controller/translator/translator_test.go Outdated Show resolved Hide resolved
pkg/healthchecks/healthchecks.go Outdated Show resolved Hide resolved
pkg/healthchecks/healthchecks_test.go Outdated Show resolved Hide resolved
@DamianSawicki
Copy link
Contributor Author

DamianSawicki commented Jun 6, 2023

This PR LGTM. Mostly nits and suggestions

Thanks for reviewing @swetharepakula! Suggestions implemented, manual tests look fine.

Ready for merge @kl52752.

@kl52752
Copy link
Contributor

kl52752 commented Jun 6, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 6, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DamianSawicki, kl52752

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit bae2cdb into kubernetes:master Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants