Skip to content
This repository has been archived by the owner on May 6, 2022. It is now read-only.

Fix panic when retry duration is exceeded #2259

Merged
merged 1 commit into from
Aug 3, 2018
Merged

Fix panic when retry duration is exceeded #2259

merged 1 commit into from
Aug 3, 2018

Conversation

nilebox
Copy link
Contributor

@nilebox nilebox commented Aug 3, 2018

Fixes #2253

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 3, 2018
@duglin
Copy link
Contributor

duglin commented Aug 3, 2018

To be really safe, why not check for nil at https://github.com/kubernetes-incubator/service-catalog/blob/3d0602afb81d872521e527c3aae83e26676b184b/pkg/controller/controller_instance.go#L2392 as well? Or if not, at least add a comment to that func mandating that the param MUST NOT be nil.

@nilebox
Copy link
Contributor Author

nilebox commented Aug 3, 2018

why not check for nil

I don't think there's a need for that. Panic prints a stack trace, which is very useful for debugging. Also, you will never miss a panic since it kills the controller pod.

at least add a comment to that func mandating that the param MUST NOT be nil.

I don't think anyone will read this comment :) Also such comments become outdated quickly. In other languages I would use null-safety syntax features, but Golang doesn't have one :(

@duglin
Copy link
Contributor

duglin commented Aug 3, 2018

IMO programs should never panic.

@nilebox
Copy link
Contributor Author

nilebox commented Aug 3, 2018

@duglin exactly, panic is a sign of a bug. And this PR fixes the bug and removes a panic.

But in most cases it's better to panic in the unexpected situation than mask the error. Once the situation is known, you either make the value non-nil, or add extra check with a meaningful error message. In this particular case we could only add a message "some SC developer messed up and didn't provide a status condition", which is not useful for the end user, but complicates discovering such bugs and debugging for ourselves.

Happy to discuss if there are alternative arguments though.

@duglin
Copy link
Contributor

duglin commented Aug 3, 2018

To me its not about "masking the error", log it so if someone wants to dig into the code they can. But then allow the end-user to continue to make progress because a panic does them no good when all they want to do is continue to get their work done. They're not Kube/SC devs. I did offer 2 suggestions - its up to you if you want to do either.

@nilebox
Copy link
Contributor Author

nilebox commented Aug 3, 2018

The problem is that Golang is not a null-safe language. Almost anywhere you can run into nil value, including struct fields (since there are no proper constructors in Go). Adding nil checks in every case where you mandate the value to be non-nil to avoid panic would pollute the code with ugly

if a == nil || b == nil || c == nil {
  return nil, errors.New("some input arguments were nil")
}

at the beginning of every single function. It's not common to do that in Go; Go prefers simplicity over redundant safety, unless you are handling a valid case where null value is allowed.

Another way of preventing panic is a test coverage. This particular case was not easy to run into (that's why it wasn't discovered for months), but it's possible to write an integration test reproducing it. But once we ran into it, we immediately discovered it and easily found the cause. With a logged error I doubt we would ever discovered this bug - everything would work with missing information in status.

@jpeeler
Copy link

jpeeler commented Aug 3, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 3, 2018
@jboyd01
Copy link
Contributor

jboyd01 commented Aug 3, 2018

/lgtm

@jboyd01
Copy link
Contributor

jboyd01 commented Aug 3, 2018

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jboyd01

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 3, 2018
@nilebox
Copy link
Contributor Author

nilebox commented Aug 3, 2018

@jpeeler I'm adding an LGTM2 on your behalf, hope you don't mind.

@nilebox nilebox added the LGTM2 label Aug 3, 2018
@k8s-ci-robot k8s-ci-robot merged commit a4d7fde into kubernetes-retired:master Aug 3, 2018
@nilebox nilebox deleted the fix-duration-exceeded-panic branch August 4, 2018 06:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. LGTM1 LGTM2 size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic when async operation exceeds the ReconciliationRetryDuration
5 participants