dbus hang when call godbus client in kubelet #221

Open
smileusd opened this issue Dec 29, 2020 · 5 comments

@smileusd

We run kubelet 1.15 and find that dbus hangs when many containers are created at once. Other container creations then wait for the lock to be released.

Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/vendor/github.com/godbus/dbus.(*Object).Call(0xc002129a40, 0x42d9356, 0x33, 0xc000d4e300, 0xc001545b40, 0x4, 0x4, 0x40fa18)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/godbus/dbus/object.go:27 +0xbb
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/vendor/github.com/coreos/go-systemd/dbus.(*Conn).startJob(0xc000d4e380, 0x0, 0x42d9356, 0x33, 0xc001545b40, 0x4, 0x4, 0x0, 0x0, 0x0)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/coreos/go-systemd/dbus/methods.go:48 +0xbd
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/vendor/github.com/coreos/go-systemd/dbus.(*Conn).StartTransientUnit(...)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/coreos/go-systemd/dbus/methods.go:138
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/vendor/github.com/opencontainers/runc/libcontainer/cgroups/systemd.UseSystemd(0xc000438d00)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/opencontainers/runc/libcontainer/cgroups/systemd/apply_systemd.go:109 +0x276
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet/cm.(*libcontainerAdapter).newManager(0xc000576c00, 0xc00189a4b0, 0x0, 0x1, 0x76a98a0, 0x38f2520, 0xc000438db0)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/cm/cgroup_manager_linux.go:156 +0x189
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet/cm.(*cgroupManagerImpl).Create(0xc000576c10, 0xc00183ad00, 0x0, 0x0)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/cm/cgroup_manager_linux.go:477 +0x1e2
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet/cm.(*podContainerManagerImpl).EnsureExists(0xc000d07180, 0xc00083f500, 0x0, 0x7)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/cm/pod_container_manager_linux.go:105 +0x222
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet.(*Kubelet).syncPod(0xc000a0c900, 0x0, 0xc00083f500, 0x2, 0xc000836380, 0x0, 0xc002b16240, 0xc001e12570)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1627 +0x28e4
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet.(*podWorkers).managePodLoop.func1(0xc00202ded8, 0xc0008a40e0, 0xc0020edec0, 0xc00283d6c0, 0x43302b)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:174 +0x2ab
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet.(*podWorkers).managePodLoop(0xc0008a40e0, 0xc0019d7f80)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:183 +0x13f
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/pkg/kubelet.(*podWorkers).UpdatePod.func1(0xc0008a40e0, 0xc0019d7f80)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:221 +0x62
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: created by k8s.io/kubernetes/pkg/kubelet.(*podWorkers).UpdatePod
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:219 +0x395
...
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: sync.(*Mutex).Lock(0x76c7ad8)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /usr/local/go/src/sync/mutex.go:134 +0x109
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: k8s.io/kubernetes/vendor/github.com/opencontainers/runc/libcontainer/cgroups/systemd.UseSystemd(0xc000438d00)
Dec 29 02:14:19 tess-node-s4j5n-tess94.stratus.rno.ebay.com kubelet[112554]: /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/opencontainers/runc/libcontainer/cgroups/systemd/apply_systemd.go:95 +0x59
@jhenstridge
Contributor

The lock mentioned in this stack trace doesn't seem to belong to godbus: it looks to be from a vendored copy of runc's libcontainer. What do you believe godbus is doing wrong?

Searching for some keywords from the trace turned up kubernetes/kubernetes#92855, which is marked fixed. So does the problem go away if you upgrade kubernetes?

@smileusd
Author

smileusd commented Feb 1, 2021

Thanks for the response, @jhenstridge. 92855 is not the same issue. We find that the client hangs and gets no response in the call at github.com/godbus/dbus/object.go:27; libcontainer uses this library to call dbus from kubelet. I am not sure whether dbus itself hangs or the client does. After we ran "kill 1", the system recovered.

@jhenstridge
Contributor

I suspect you're probably hitting the bus daemon's max_replies_per_connection limit. The godbus library doesn't do anything to limit the number of pending method calls it sends, so it is quite possible that it could exceed the daemon's limits if you had many goroutines performing slow method calls.

On a quick code search, it isn't clear to me that the daemon will synthesise error messages when this happens (it is a DoS protection, after all), which could lead to your code waiting forever. One way to test this would be to set max_replies_per_connection to some high number and see if you get different behaviour.

If this is the issue, I guess that leaves open the question of whether godbus should rate limit method calls on a connection. If other D-Bus client bindings do so, then I can see an argument for doing that. If not, then perhaps kubelet should do something to limit its parallelism here?
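Limiting parallelism on the caller side could look roughly like the sketch below: a buffered channel used as a counting semaphore around each method call. This is an illustration only, not code from godbus or kubelet; `limiter`, `newLimiter`, `do`, and `peakInFlight` are hypothetical names, and the sleeping closure stands in for a slow D-Bus call.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// limiter (hypothetical) caps how many calls may be in flight at once.
type limiter struct {
	slots chan struct{}
}

func newLimiter(limit int) *limiter {
	return &limiter{slots: make(chan struct{}, limit)}
}

// do blocks until a slot is free, runs call, then releases the slot.
func (l *limiter) do(call func() error) error {
	l.slots <- struct{}{}
	defer func() { <-l.slots }()
	return call()
}

// peakInFlight pushes n simulated slow calls through a limiter and
// returns the highest number that were running at the same time.
func peakInFlight(limit, n int) int64 {
	lim := newLimiter(limit)
	var inFlight, peak int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = lim.do(func() error {
				cur := atomic.AddInt64(&inFlight, 1)
				// Record a new peak if this call raised it.
				for {
					old := atomic.LoadInt64(&peak)
					if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
						break
					}
				}
				time.Sleep(time.Millisecond) // stand-in for a slow method call
				atomic.AddInt64(&inFlight, -1)
				return nil
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println(peakInFlight(4, 100) <= 4) // prints "true"
}
```

The limit would need to stay below whatever the bus daemon is configured to allow; the value 4 here is arbitrary.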

@jhenstridge
Contributor

On further inspection, it looks like you should get an org.freedesktop.DBus.Error.LimitsExceeded error response back in this case. Is there anything of interest in dbus-daemon's log file when you trigger this problem?

@guelfey
Member

guelfey commented Jan 6, 2022

Can you determine which version of godbus is actually used there? Also, k8s 1.15 is EOL as far as I know.
