OperatorSDK issue after restarting neon-cluster-operator? #1852

Closed
jefflill opened this issue Aug 19, 2023 · 3 comments
Labels: bug (Identifies a bug or other failure), cluster-operators (Related to one of our cluster operators), investigate (Needs further investigation), neon-kube (Related to our Kubernetes distribution)

Comments

@jefflill
Collaborator

jefflill commented Aug 19, 2023

It looks like the OperatorSDK may be having problems reestablishing webhooks after restarting the operator.

I restarted neon-cluster-operator after setting LOG_LEVEL=trace while trying to debug the performance issue. The API Server immediately showed fairly high CPU usage and appears unable to send webhook requests to the new neon-cluster-operator pod (the neon-acme OpenAPI errors from #1847 are intermixed as well):

{"ts":1692403829344.051,"caller":"openapi/controller.go:116","msg":"loading OpenAPI spec for \"v1alpha1.acme.neoncloud.io\" failed with: OpenAPI spec does not exist\n"}
{"ts":1692403829344.0842,"caller":"openapi/controller.go:129","msg":"OpenAPI AggregationController: action for item v1alpha1.acme.neoncloud.io: Rate Limited Requeue.\n","v":0}
{"ts":1692403836935.4392,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403836935.4954,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
{"ts":1692403846959.5413,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403846959.5977,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
{"ts":1692403856987.9785,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403856988.0146,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
{"ts":1692403867015.4243,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403867015.4587,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
{"ts":1692403877039.0762,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403877039.1125,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
{"ts":1692403887063.5889,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403887063.643,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
{"ts":1692403897103.4758,"caller":"mutating/dispatcher.go:180","msg":"Failed calling webhook, failing open deployment-policy.neonkube.io: failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n","v":0}
{"ts":1692403897103.534,"caller":"mutating/dispatcher.go:184","msg":"failed calling webhook \"deployment-policy.neonkube.io\": failed to call webhook: Post \"https://neon-cluster-operator.neon-system.svc:443/apps/v1/deployments/deploymentwebhook/mutate?timeout=5s\": dial tcp 10.253.74.44:443: connect: connection refused\n"}
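
To judge whether errors like these are only a transient restart window, it can help to summarize them per webhook. The following is a minimal sketch, not part of neon-kube or the OperatorSDK: `summarize_webhook_failures` is a hypothetical helper, and it assumes each API Server log line is a JSON object with a `ts` field in milliseconds and a `msg` field, as in the excerpt above.

```python
import json
import re
from collections import defaultdict

# Hypothetical helper: extracts the webhook name from messages such as
#   failed calling webhook "deployment-policy.neonkube.io": ...
WEBHOOK_RE = re.compile(r'failed calling webhook "([^"]+)"')


def summarize_webhook_failures(lines):
    """Summarize refused webhook calls found in JSON log lines.

    Returns {webhook_name: {"count", "first_ts", "last_ts", "span_s"}}.
    Assumes "ts" is a millisecond timestamp (an assumption based on the excerpt).
    """
    seen = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        msg = entry.get("msg", "")
        ts = entry.get("ts")
        if ts is None or "connection refused" not in msg:
            continue
        match = WEBHOOK_RE.search(msg)
        if match:
            seen[match.group(1)].append(float(ts))

    summary = {}
    for name, stamps in seen.items():
        first, last = min(stamps), max(stamps)
        summary[name] = {
            "count": len(stamps),
            "first_ts": first,
            "last_ts": last,
            "span_s": (last - first) / 1000.0,  # milliseconds to seconds
        }
    return summary
```

Run against the excerpt above, this would group all of the refused calls under `deployment-policy.neonkube.io`; the timestamps there show retries roughly every 10 seconds over about a minute. If the count stops growing once the new pod is listening, the errors were most likely just the restart gap.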
jefflill added the bug, neon-kube, and cluster-operators labels on Aug 19, 2023
jefflill changed the title from "OperatorSDK issue after restarting neon-cluster-operator" to "OperatorSDK issue after restarting neon-cluster-operator?" on Aug 19, 2023
jefflill added the investigate label on Aug 19, 2023
@marcusbooyah
Member

Was this a single node cluster? Did it not go away once the operator started up?
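
One way to check whether the errors clear once the operator is up is to poll the webhook endpoint until it accepts TCP connections. This is a rough sketch under stated assumptions: `wait_for_port` is a hypothetical helper, it has to run somewhere that can reach the service address (for example from a pod inside the cluster), and the host and port in the usage note are taken from the log excerpt in this issue.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=2.0):
    """Poll host:port until a TCP connection succeeds or timeout_s elapses.

    Returns True as soon as a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, `wait_for_port("neon-cluster-operator.neon-system.svc", 443)` returning `True` shortly after the restart would suggest the "connection refused" errors were only the gap while the new pod was starting. It only checks that the port accepts connections, not that the webhook itself responds correctly.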

@marcusbooyah
Member

I don't think this is an issue

@jefflill
Collaborator Author

Yeah, it was probably a single node cluster. This is an example of the sort of thing I've been seeing in logs that seemed a bit weird, so I'm creating issues.

...not sure it's a problem either.

2 participants