Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set healthcheck from BackendConfig #1029

Merged
merged 4 commits into from
Feb 23, 2020
Merged

Conversation

bowei
Copy link
Member

@bowei bowei commented Feb 17, 2020

    Take healthcheck configuration overrides from the backendconfig

    This enables taking the healthcheck configuration from a backendconfig
    associated with the given Service.

    As part of this PR, we also clean up the override handling to be easier
    to understand where the settings come from.

    For a given ServicePort `sp` and its associated healthcheck `hc`:

    1. `sp` has some default settings depending on its type (e.g. IG, NEG, ILB).
    2. SyncServicePort() will override those settings if the Probe exists.
         Path, Host, TimeoutSec, CheckIntervalSec.
    3. We then sync with what exists in GCE:
        3.1 If the healthcheck is does not exist, it will be CREATED.
            3.1.1 If it has a backendconfig, then the following settings are used:
                    CheckIntervalSec, TimeoutSec, HealthyThreshold,
                    UnhealthyThreshold, Type, RequestPath
                  We have `Port` in the struct, but it does not seem to be a valid
                  field to use so it is currently ignored.
        3.2 Otherwise the healthcheck will need to be UPDATED. The criteria
            for update can be found in `calculateDiff()`:
            3.2.1 Services w/out backendconfig will only update when `Protocol`,
                  `PortSpecification` changes.
            3.2.2 Services w/ backendconfig will also consider any fields
                  covered by the backendconfig (see above)
            3.2.3 We merge in the settings from the existing healthcheck. As
                  users have (in the past) done edits to their HCs directly, we
                  do not want to break any previous behavior. Thus we merge in
                  all of the settings on HTTPHealthCheck as well as any fields on
                  HealthCheck itself, with the exception of: `Port`, `PortSpecification`
            3.2.2 We now consider the backendconfig to be a final override.
                  See list above for settings from the backendconfig.

    Several situations do not seem to be handled currently:

    * There is no way to change HC based on probe after it is created.
      This is because there is no way to tell if the difference is due
      to a user changing the setting directly in GCE or the HC should
      be updated.
    * Changing types to/from NEG, ILB will leak the old healthcheck.
      User healthchecksettings will not propagate to the new healthcheck.
    * backendconfig seems to be missing `Host`
TestSyncServicePort does the following:

1. Setup any preexisting objects (tc.setup) that should exist in the
   mocked GCE.
2. Run one sync given a ServicePort.
3. Attempt to re-run the sync with the same ServicePort, checking that
   no update will be needed.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 17, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bowei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 17, 2020
@bowei bowei changed the title Hc from bc Implement getting healthcheck from BackendConfig Feb 17, 2020
Copy link
Contributor

@rramkumar1 rramkumar1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple questions, will do a more in-depth review once this is rebased on the dependent PRs

if needToUpdate(existingHC, hc) {
err = h.update(existingHC, hc)
return existingHC.SelfLink, err
// First, merge in the configuration from the existing healthcheck to cover
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to ensure a smooth migration path to BackendConfig?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is behavior that we have to preserve (unfortunately)

}
return false
// TODO: why don't we check Port?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We omit quite a few fields when checking for update, which is intentional but not documented (nor reasoning)

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 17, 2020
@bowei
Copy link
Member Author

bowei commented Feb 17, 2020

/assign @rramkumar1 @freehan @spencerhance

@k8s-ci-robot
Copy link
Contributor

@bowei: GitHub didn't allow me to assign the following users: spencerhance.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @rramkumar1 @freehan @spencerhance

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bowei bowei changed the title Implement getting healthcheck from BackendConfig Set healthcheck from BackendConfig Feb 17, 2020
@bowei
Copy link
Member Author

bowei commented Feb 17, 2020

This should be ready for review.

@bowei bowei force-pushed the hc-from-bc branch 2 times, most recently from 2337b2f to 897edfa Compare February 17, 2020 21:36
return "", err
}

return h.getHealthCheckLink(hc.Name, hc.Version(), scope)
// TODO(bowei) -- we don't need to fetch the self-link here as it is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use NewHealthChecksResourceID in k8s_cloud_provider to generate this link.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do this in a follow-up PR

// GKE.
premergeHC := hc
hc = mergeUserSettings(existingHC, hc)
klog.V(3).Infof("mergeHealthcheck(%v,%v) = %v", existingHC, premergeHC, hc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%+v?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this kind of on purpose to make the log line shorter...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actuall, let me fix this.

func (h *HealthChecks) create(hc *HealthCheck) error {
func (h *HealthChecks) create(hc *HealthCheck, bchcc *backendconfigv1.HealthCheckConfig) error {
if bchcc != nil {
klog.V(2).Infof("Health check %q has backendconfig override (%+v)", hc.Name, bchcc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You log this same thing in line 165 as well. Maybe remove one of them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the same. This is on the create path, that instance is on the update path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might make sense to differentiate the log lines

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can distinguish from the source line in klog.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, seems reasonable to leave it as is then


if existing.HealthCheck.LogConfig != nil {
l := *existing.HealthCheck.LogConfig
hc.HealthCheck.LogConfig = &l
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an Alpha-only field? If that is the case, where do we configure this HC to be Alpha?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this handles all resource versions due to how version conversion via JSON drops fields when it gets converted.
This is proactive. I don't expect to see this field right now.

// Cannot specify both portSpecification and port field.
if hc.PortSpecification != "" {
hc.Port = 0
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this added here?

Copy link
Member Author

@bowei bowei Feb 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done on all paths that ended up at merge() previously.
I consolidated it here and remove the other instances.
Really, this should not exist, but people have been writing imprecise code, so I am keeping it for now.

if c.RequestPath != nil {
hc.RequestPath = *c.RequestPath
}
if c.Port != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by this? Shouldn't setting port be allowed for IG backends?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't make sense. Port is set from NodePort.
NEG should take port from the neg itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to hear about a use case though...

Copy link
Contributor

@spencerhance spencerhance left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of TODOs and this is a pretty large diff, but overall lgtm. Have you run any e2e tests against this?

}

// fieldDiffs encapsulate which fields are different between health checks.
type fieldDiffs struct {
Copy link
Contributor

@spencerhance spencerhance Feb 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be easier to read if in a new file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will send a PR to split the file.
The diff gets very hard to read when I did the split...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

@bowei
Copy link
Member Author

bowei commented Feb 18, 2020

Some of the TODOs can't be done right now (esp the ones around bugs).
I left them so we can file issues around them and track them long-term.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 20, 2020
This allows us to check for HTTPStatusCode == 404 directly without
guarding it with a nil check.
This enables taking the healthcheck configuration from a backendconfig
associated with the given Service.

As part of this PR, we also clean up the override handling to be easier
to understand where the settings come from.

For a given ServicePort `sp` and its associated healthcheck `hc`:

1. `sp` has some default settings depending on its type (e.g. IG, NEG, ILB).
2. SyncServicePort() will override those settings if the Probe exists.
     Path, Host, TimeoutSec, CheckIntervalSec.
3. We then sync with what exists in GCE:
    3.1 If the healthcheck is does not exist, it will be CREATED.
	3.1.1 If it has a backendconfig, then the following settings are used:
		CheckIntervalSec, TimeoutSec, HealthyThreshold,
		UnhealthyThreshold, Type, RequestPath
	      We have `Port` in the struct, but it does not seem to be a valid
	      field to use so it is currently ignored.
    3.2 Otherwise the healthcheck will need to be UPDATED. The criteria
	for update can be found in `calculateDiff()`:
	3.2.1 Services w/out backendconfig will only update when `Protocol`,
	      `PortSpecification` changes.
	3.2.2 Services w/ backendconfig will also consider any fields
	      covered by the backendconfig (see above)
	3.2.3 We merge in the settings from the existing healthcheck. As
	      users have (in the past) done edits to their HCs directly, we
	      do not want to break any previous behavior. Thus we merge in
	      all of the settings on HTTPHealthCheck as well as any fields on
	      HealthCheck itself, with the exception of: `Port`, `PortSpecification`
	3.2.2 We now consider the backendconfig to be a final override.
	      See list above for settings from the backendconfig.

Several situations do not seem to be handled currently:

* There is no way to change HC based on probe after it is created.
  This is because there is no way to tell if the difference is due
  to a user changing the setting directly in GCE or the HC should
  be updated.
* Changing types to/from NEG, ILB will leak the old healthcheck.
  User healthchecksettings will not propagate to the new healthcheck.
* backendconfig seems to be missing `Host`

merge
TestSyncServicePort does the following:

1. Setup any preexisting objects (tc.setup) that should exist in the
   mocked GCE.
2. Run one sync given a ServicePort.
3. Attempt to re-run the sync with the same ServicePort, checking that
   no update will be needed.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 22, 2020
@bowei
Copy link
Member Author

bowei commented Feb 23, 2020

I just tested this by manually running the e2e test and am getting a PR together for the e2e test.
Going to merge this for now...

@bowei bowei merged commit 9c53eaf into kubernetes:master Feb 23, 2020
@mark-church
Copy link

@bowei based off of how users are desiring to configure health checks for istio-ingressgateway, we'll need to support the host field as well.

@bowei bowei deleted the hc-from-bc branch April 21, 2020 18:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants