Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix panic on multiple ingress mess up upstream is primary or not #4506

Merged
merged 1 commit into from
Sep 7, 2019

Conversation

nic-6443
Copy link
Contributor

What this PR does / why we need it:
Using the following Ingress as input will cause the IngressController to crash because of nil pointer panic:

# Need at least six ingress(canary and noncanary half) for three location on same server and reference two service
ingress-a1:   location: /a  backend: svc-1
ingress-a2:  location: /a  backend: svc-2  canary

ingress-b1:   location: /b  backend: svc-1  canary
ingress-b2:  location: /b  backend: svc-2

ingress-c1:   location: /c  backend: svc-1
ingress-c2:  location: /c  backend: svc-2  canary

Related code snippet: controller.go mergeAlternativeBackends()

merged := false

for _, loc := range server.Locations {
    priUps := upstreams[loc.Backend]

    //if priUps == altUps,will return false
    if canMergeBackend(priUps, altUps) && loc.Path == path.Path {
        klog.V(2).Infof("matching backend %v found for alternative backend %v",
            priUps.Name, altUps.Name)

        merged = mergeAlternativeBackend(priUps, altUps)
    }
}

if !merged {
    klog.Warningf("unable to find real backend for alternative backend %v. Deleting.", altUps.Name)

    // May be delete alternative upstream  referenced by other ingress as primary upstream
    delete(upstreams, altUps.Name)
}

When IngressController process ingress-b1 will delete svc-1 upstream,then process ingress-c2 will panic in canMergeBackend(),because it's primary upstream already deleted.

func canMergeBackend(primary *ingress.Backend, alternative *ingress.Backend) bool {
        //primary is nil
	return alternative != nil && primary.Name != alternative.Name && primary.Name != defUpstreamName && !primary.NoServer
}

example crash log:

W0808 18:03:29.303362       7 controller.go:1218] unable to find real backend for alternative backend friday-app-88dc7d494c22e7bef8e2a4860bc526b3a7fa4c97-3-14004. Deleting.
E0808 18:03:29.307183       7 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/k8s.io/ingress-nginx/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/k8s.io/ingress-nginx/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/k8s.io/ingress-nginx/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/panic.go:522
/usr/local/go/src/runtime/panic.go:82
/usr/local/go/src/runtime/signal_unix.go:390

This PR will protect primary upstream not be delete, even if it as alternative upstream in other ingress.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 29, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @ProNic-QY. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Aug 29, 2019
@aledbf
Copy link
Member

aledbf commented Aug 29, 2019

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 29, 2019
@aledbf
Copy link
Member

aledbf commented Aug 29, 2019

@ProNic-QY please add an e2e test for this scenario

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 29, 2019
@nic-6443
Copy link
Contributor Author

@ProNic-QY please add an e2e test for this scenario

@aledbf Done. And add same protect logic in merge catch-all alternative backends .

@codecov-io
Copy link

codecov-io commented Aug 29, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@8740c1b). Click here to learn what that means.
The diff coverage is 81.25%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #4506   +/-   ##
=========================================
  Coverage          ?   60.13%           
=========================================
  Files             ?       89           
  Lines             ?     7476           
  Branches          ?        0           
=========================================
  Hits              ?     4496           
  Misses            ?     2509           
  Partials          ?      471
Impacted Files Coverage Δ
internal/ingress/controller/controller.go 52.73% <81.25%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8740c1b...435377f. Read the comment docs.

@nic-6443
Copy link
Contributor Author

/retest

@aledbf
Copy link
Member

aledbf commented Aug 29, 2019

@ProNic-QY run gofmt -s -w ./internal/ingress/controller/controller_test.go

@nic-6443
Copy link
Contributor Author

@ProNic-QY run gofmt -s -w ./internal/ingress/controller/controller_test.go

@aledbf Get it

@nic-6443
Copy link
Contributor Author

nic-6443 commented Sep 2, 2019

@aledbf @ElvinEfendi Hi,please review this pr.This crash problem detected in our production environment ,so I need fix it as soon as possible. I waiting for your feedback on this PR.Thanks.

@aledbf
Copy link
Member

aledbf commented Sep 7, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 7, 2019
@aledbf
Copy link
Member

aledbf commented Sep 7, 2019

@ProNic-QY thanks!

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aledbf, ProNic-QY

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 7, 2019
@k8s-ci-robot k8s-ci-robot merged commit 76e2a5d into kubernetes:master Sep 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants