why is docker.service not automatically restarting #682

itajaja · 2016-09-21T17:55:46Z

https://github.com/coreos/coreos-kubernetes/blob/master/multi-node/aws/pkg/config/templates/cloud-config-worker#L9

Here it seems the docker service doesn't have a restart policy. I am sure I am missing something, and I apologize if this isn't the best place to discuss this, but I did experience problems in my cluster when docker died, for whatever reason, and I had to restart it manually.

spacepluk · 2016-09-21T18:01:01Z

It's probably because of this: systemd/systemd#1312

I'm using this workaround and it's been working fine so far: https://gist.github.com/spacepluk/a14f10cfed3756c0f1f079e73cdc6c9a

itajaja · 2016-09-21T18:07:05Z

@spacepluk why do you consider it a workaround? Are there reasons why that patch couldn't be pushed upstream?

(also, update your link to remove the /edit otherwise it 404s)

spacepluk · 2016-09-21T18:15:26Z

Oops, fixed the link.

I don't know the details, to be honest. It looks like some change of behavior in systemd caused the issue. I believe @colhom is working on a proper solution for coreos-kubernetes.

cgag · 2016-09-21T18:23:17Z

@itajaja we're discussing a similar issue here: #675. I think there's a good chance we'll end up doing what @spacepluk did for all the normal services (the discussion in #675 is around a oneshot service, and those are a little weird).

I don't believe docker itself should need a restart policy because it's started on demand (socket activated). It should have been restarted by any dependency that tried to use it. Maybe systemd stops activating it if it failed due to a dependency issue? I'll try to test that out later, but I agree we should probably just add restart logic regardless.
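For illustration, "restart logic" for a normal service can be expressed as a systemd drop-in. This is only a minimal sketch and not the contents of the gist linked above: the drop-in path and file name are hypothetical, and the `RestartSec` value is an arbitrary choice. `Restart=` and `RestartSec=` are standard `[Service]` directives.

```ini
# Hypothetical path: /etc/systemd/system/docker.service.d/10-restart.conf
# Assumption: a drop-in is used so the vendor docker.service stays untouched.
[Service]
# Restart the daemon whenever it exits, whether cleanly or not.
Restart=always
# Wait before each restart attempt (value chosen arbitrarily for this sketch).
RestartSec=10
```

After adding such a file, `systemctl daemon-reload` makes systemd pick it up. With an explicit policy like this, the service no longer relies solely on socket activation to be brought back after a failure.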

itajaja · 2016-09-21T18:25:59Z

> I don't believe docker itself should need a restart policy because it's started on-demand (socket activated). It should have been restarted by any dependency that tried to use it.

You are right, good point.

On top of coreos#682 (comment), replaces a dependency from kubelet to the oneshot unit `decrypt-tls-assets` with an ExecStartPre in the kubelet service because systemd doesn't seem to restart failed `decrypt-tls-assets` services that way.
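As a rough sketch of the change described above, the decryption step can run as an `ExecStartPre` of the kubelet service instead of being a separate oneshot dependency. The drop-in path and the script location below are assumptions made for illustration, not taken from the referenced commit.

```ini
# Hypothetical path: /etc/systemd/system/kubelet.service.d/10-decrypt-tls.conf
[Service]
# Assumption: the decryption script lives at this path.
# It runs before the main kubelet process on every start attempt.
ExecStartPre=/opt/bin/decrypt-tls-assets
# If the pre-start step or kubelet itself fails, systemd retries the whole
# start sequence, so the decryption step is re-run as well.
Restart=always
RestartSec=10
```

The design point is that a unit's `Restart=` policy also covers failures of its `ExecStartPre=` commands, so the decryption is retried together with kubelet rather than left in a failed state.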

mumoshu mentioned this issue Sep 29, 2016

aws: workaround for systemd #1312 #697

Closed
