Support restarting Nomad without restarting nspawn containers #17

Closed
mateuszlewko opened this issue Oct 27, 2020 · 4 comments · Fixed by #18

Comments

@mateuszlewko
Contributor

It seems that restarting the Nomad service (for example, when upgrading Nomad or reloading its configuration) restarts jobs run by the nspawn driver. Docker jobs stay alive and are not restarted when Nomad is restarted.

I observed the following errors in logs:

2020-10-26T19:28:26.592+0100 [ERROR] client.alloc_runner.task_runner: error recovering task; cleaning up: alloc_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5 task= error="rpc error: code = Unknown desc = failed to decode driver config: EOF" task_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5//0af76d99
2020-10-26T19:28:26.592+0100 [WARN] client.alloc_runner.task_runner: error destroying unrecoverable task: alloc_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5 task= error="rpc error: code = Unknown desc = task not found for given id" task_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5//0af76d99

The failed jobs are then reallocated and run fine; however, it is undesirable that they are restarted at all.
Would it be hard to support that?

@JanMa
Owner

JanMa commented Oct 28, 2020

Hello @mateuszlewko,
normally it should be fine to restart Nomad; tasks started via this driver should keep running.
I suspect you are running into an issue which I fixed in 9a578ff. Can you please update the driver to the latest release, 0.4.0, and check whether it is still not working?

If not, this is definitely a bug and I'll fix it!
Kind regards,
Jan

@JanMa
Owner

JanMa commented Oct 28, 2020

I double-checked it and it seems the issue is also present in the latest version. I will make sure to fix this 👍

JanMa added a commit that referenced this issue Oct 29, 2020
When `RecoverTask` is called we initially tried to recover the
`TaskConfig` for a given task. This was blindly copied from the
nomad-driver-skeleton project and it turns out we make no use of it at
all.

Since this also caused Issue #17, we simply get rid of it. Recovering
tasks when a Nomad client is restarted now works again.
@JanMa JanMa closed this as completed in #18 Oct 29, 2020
@JanMa
Owner

JanMa commented Oct 29, 2020

@mateuszlewko I just published a new release 0.4.1 which contains a fix for this issue

@mateuszlewko
Contributor Author

Confirmed that it's working now. Thank you!
