Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use monotonic time for delays, etc. #912

Closed
wants to merge 1 commit into from

Conversation

flixr
Copy link

@flixr flixr commented Apr 3, 2017

Using the system clock for checking timeouts, delays, etc.. is not robust as the system time might jump (e.g. due to NTP, PTP or other time updates)

So make sure we use the monotonic clock to compute durations (where we don't care about the actual time anyway).

Uses monotonic.py from https://github.com/atdt/monotonic

@mnaberez
Copy link
Member

mnaberez commented Apr 3, 2017

Please see discussion in #468. I'd rather not pull in a whole third party lib to get this, but we can probably support monotonic using Python 3's built-in module if you want to submit a patch for that.

@mnaberez mnaberez closed this Apr 3, 2017
@flixr
Copy link
Author

flixr commented Apr 3, 2017

It actually uses time.monotonic if available
But using this it will also work for python versions < 3.3

@mnaberez
Copy link
Member

mnaberez commented Apr 3, 2017

Yeah, what I'm saying in #468 (comment) is that we can probably just import it from Python 3 if available, otherwise work as it does now. It's probably not worth bringing in a whole third party library, and the Python community seems to be moving towards Python 3 anyway.

@flixr
Copy link
Author

flixr commented Apr 3, 2017

Ok... not an option for me, as I need it for python2.7
Will roll my own patched package then...

@mnaberez
Copy link
Member

mnaberez commented Apr 3, 2017

I spent some time working on issues related to time changes (here, here, and here) without requiring monotonic time. If the time changes right in the middle of a process start or stop, delays can be affected (startsecs, stopwaitsecs). Otherwise, supervisord should generally work fine. If you find something more serious like the issues linked above, please report it.

@flixr
Copy link
Author

flixr commented Apr 3, 2017

I was already using 3.3.1 and had issues.

IMHO it is pretty serious if supervisor can't deal with system time changes when processes are started...
I'm using it on a device which does not have a battery and starts up processes at boot time in a specific order and NTP usually become available while some processes are still starting and then the time jumps.

@mnaberez
Copy link
Member

mnaberez commented Apr 3, 2017

I was already using 3.3.1 and had issues.

If you found problems other than startsecs and stopwaitsecs, I would appreciate if you could report the specifics of those issues so they can be looked at.

IMHO it is pretty serious if supervisor can't deal with system time changes when processes are started...

I see these two problems:

  • If a process enters the STARTING state, then the time changes before it enters the RUNNING state, the startsecs delay may be affected.
  • If a process enters the STOPPING state, then the time changes before it enters the STOPPED state, the stopwaitsecs delay may be affected.

Normally, the system time changes only by a second or two, and it's not likely to happen right in the middle of these events unless you have a high rate of processes starting/stopping. In most configurations I have seen, startsecs and stopsecs are large values like 30, so losing a second or two probably isn't a calamity. Larger changes like daylight savings are definitely an issue, and I think that's bad, but again should only be if it happens right in the middle of process start/stop.

I do think it's a good idea to use the built-in monotonic module on Python 3 if it's available and I would merge a patch for that. I'm not enthusiastic about bundling a third party library that uses a bunch of ctypes hacking to try and get monotonic time on Python 2. It seems risky and might cause more issues than it solves.

I'm using it on a device which does not have a battery and starts up processes at boot time in a specific order and NTP usually become available while some processes are still starting and then the time jumps.

supervisord doesn't actually have a way to start up processes in a specific order to my knowledge. It might be worth investigating a replacement for supervisord like systemd that has an actual dependency system, where your processes that depend on NTP can be defined to start after NTP.

@flixr
Copy link
Author

flixr commented Apr 3, 2017

As I said my device (Nvidia TK1 board) does not have a battery, so system time jumps by a couple of years as soon as NTP is available...
So at least in my specific case it's not only a couple of seconds... and in my case time can later also jump backwards when syncing time with the device using PTP.

What I noticed is that this happens during the start process and supervisorctl hence shows some process being up for a couple of thousand days.
One program also didn't start up, but I didn't investigate further and immediately started fixing it to use monotonic time...

To start things in order I'm using an event listener: https://github.com/flixr/ordered-startup-supervisord
And none of the programs depends on NTP or a correct system time, just on each other...

Also since Nvidia only provides drivers for Ubuntu 14.04, using systemd is very unfortunately not an option for me...

I can understand that you are reluctant to merge this with the ctypes implementation...
I'll try with this fix for now and maybe look at using python3 when I have more time.
Btw. would I need to base it on master then or would it work on 3.3 as well?
I'm not actually using this PR, but rather applied the changes to 3.3 as in my 3.3-time_monotonic branch...

@mnaberez
Copy link
Member

mnaberez commented Apr 3, 2017

What I noticed is that this happens during the start process and supervisorctl hence shows some process being up for a couple of thousand days.

Ah, that makes sense, yeah. The uptime is informational only shouldn't affect supervisord's operation but I can see that being very annoying.

Also since Nvidia only provides drivers for Ubuntu 14.04, using systemd is very unfortunately not an option for me...

Maybe you could delay supervisord from starting in your init script. If your system without battery starts up with some long-ago date, your init script can busy-loop until date shows after year 2017 or something similar. It's a hack, for sure, but might be easier than rolling your own package.

To start things in order I'm using an event listener: https://github.com/flixr/ordered-startup-supervisord

Oh, cool, I didn't know about that. It's not on the plugins list yet.

Btw. would I need to base it on master then or would it work on 3.3 as well?

The release versions are the 3.x branches and are Python 2 only. The master branch will become Supervisor 4 and will support Python 2.6+ and 3.2+. There are are a few python 3 issues remaining but it's getting closer. I would definitely take a patch to add monotonic support on Python 3 like discussed in #468 (comment).

@flixr
Copy link
Author

flixr commented Apr 3, 2017

Ah, that makes sense, yeah. The uptime is informational only shouldn't affect supervisord's operation but I can see that being very annoying.

Yes. But also at least one process also didn't come up and I suspected startsecs...
Since this is rather timing dependent and probably hard to reproduce in practice I didn't look further and decided to make sure this can't happen in the first place ;-)

Maybe you could delay supervisord from starting in your init script. If your system without battery starts up with some long-ago date, your init script can busy-loop until date shows after year 2017 or something similar. It's a hack, for sure, but might be easier than rolling your own package.

Thanks for the tips, but this is unfortunately also not an option as it has to work regardless if there is NTP (or PTP) available...

Oh, cool, I didn't know about that. It's not on the plugins list yet.

It's really simple and originally not by me... that is just my fork of https://github.com/jasoncorbett/ordered-startup-supervisord with some additional fixes and features...

The release versions are the 3.x branches and are Python 2 only. The master branch will become Supervisor 4 and will support Python 2.6+ and 3.2+. There are are a few python 3 issues remaining but it's getting closer. I would definitely take a patch to add monotonic support on Python 3 like discussed in #468 (comment).

Ok... will try that when I have some spare time and make a new PR... right now I just need to get this to work somehow ASAP

@flixr
Copy link
Author

flixr commented Apr 4, 2017

In case someone is interested in a Debian package for Ubuntu 14.04 (trusty) with the monotonic fixes:
https://launchpad.net/~flixr/+archive/ubuntu/supervisor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants