This repository has been archived by the owner on Sep 4, 2021. It is now read-only.

why is docker.service not automatically restarting #682

Open
itajaja opened this issue Sep 21, 2016 · 5 comments
Comments

@itajaja

itajaja commented Sep 21, 2016

https://github.com/coreos/coreos-kubernetes/blob/master/multi-node/aws/pkg/config/templates/cloud-config-worker#L9

Here it seems like the docker service doesn't have a restart policy. I am sure I am missing something, and I apologize if this isn't the best place to discuss this, but I did experience some problems in my cluster when, for whatever reason, docker died and I had to restart it manually.

@spacepluk
Contributor

spacepluk commented Sep 21, 2016

It's probably because of this: systemd/systemd#1312

I'm using this workaround and it's been working fine so far: https://gist.github.com/spacepluk/a14f10cfed3756c0f1f079e73cdc6c9a

@itajaja
Author

itajaja commented Sep 21, 2016

@spacepluk why do you consider it a workaround? Are there reasons why that patch couldn't be pushed upstream?

(also, update your link to remove the /edit otherwise it 404s)

@spacepluk
Contributor

Oops, fixed the link.

I don't know the details to be honest, it looks like some change of behavior in systemd caused the issue. I believe @colhom is working on a proper solution for coreos-kubernetes.

@cgag
Contributor

cgag commented Sep 21, 2016

@itajaja we're discussing a similar issue here: #675. I think there's a good chance we'll end up doing what @spacepluk did for all the normal services (the discussion in #675 is around a oneshot service, and those are a little weird).

I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it. Maybe systemd stops activating it if it failed due to a dependency issue? I'll try to test that out later, but I agree we should probably just add restart logic regardless.
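For readers following along, here is a minimal sketch of what "restart logic" for docker could look like as a systemd drop-in. It is an illustration only and not necessarily what the linked gist contains: the file path and the 10 second delay are assumptions, while `Restart=` and `RestartSec=` are standard `[Service]` directives.

```ini
# Hypothetical drop-in path: /etc/systemd/system/docker.service.d/10-restart.conf
# (the file name and the delay value are assumptions chosen for illustration)
[Service]
# Restart the daemon whenever it exits, whether cleanly or with an error.
Restart=always
# Wait 10 seconds between restart attempts to avoid a tight restart loop.
RestartSec=10
```

With a drop-in like this, the daemon would come back on its own after a crash instead of waiting for the next connection to its socket to activate it.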

@itajaja
Author

itajaja commented Sep 21, 2016

> I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it.

You are right, good point.

mumoshu added a commit to mumoshu/coreos-kubernetes that referenced this issue Sep 29, 2016
On top of coreos#682 (comment), replaces a dependency from kubelet to the oneshot unit `decrypt-tls-assets` with an ExecStartPre in the kubelet service because systemd doesn't seem to restart failed `decrypt-tls-assets` services that way.
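The approach described in that commit message can be sketched roughly as follows. This is a hedged illustration, not the actual change from the commit: both executable paths are hypothetical placeholders and the restart settings are assumptions. The idea is that the decryption step runs as part of kubelet's own start sequence, so every restart of kubelet runs it again.

```ini
# kubelet.service, [Service] section only (sketch; both paths are hypothetical)
[Service]
# Run the decryption step before the main process on every start,
# instead of depending on a separate oneshot unit via Requires=/After=.
ExecStartPre=/opt/bin/decrypt-tls-assets
# Placeholder for the real kubelet command line.
ExecStart=/usr/bin/kubelet
# If ExecStartPre or the main process fails, systemd retries the whole
# start sequence after the delay, which re-runs the decryption step.
Restart=always
RestartSec=10
```

Because `Restart=` also applies when an `ExecStartPre=` command fails, a failed decryption step would be retried together with kubelet rather than leaving a failed oneshot dependency behind.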