Ability to pass SIGUSR1 & SIGHUP to main process #195

Closed
javabean opened this issue Jul 30, 2016 · 11 comments

Comments

@javabean

Unless I am confused, there is currently no way to pass SIGUSR1 & SIGHUP to the main process (SIGUSR1 being used to put the service node in maintenance mode, and SIGHUP to reload ContainerPilot's configuration).

Could we please have an optional configuration setting to disable this and pass all signals to the main application? (In that case one would still be able to enable maintenance mode with a docker exec xxx kill -SIGUSR1 1.)

@tgross
Contributor

tgross commented Aug 2, 2016

Unless I am confused, there is currently no way to pass SIGUSR1 & SIGHUP to the main process

Right, not from outside the container. You'd need to use docker exec pkill to send those signals or include them in an onChange handler. The convention for SIGHUP is to do a config reload which you'd normally want to do in an onChange or task handler, but I know for example HAProxy uses it to dump status (which you'd hit w/ a sensor handler instead). Is there a specific use case you have in mind here that I'm not considering?

@tgross tgross added the proposal label Aug 2, 2016
@javabean
Author

javabean commented Aug 2, 2016

I think we are considering ContainerPilot (CP) from different points of view. My vision is that CP has infrastructure qualities and is not part of my application. It should therefore stay transparent. When sending a UNIX signal via Docker, I mean to interact with my application, not CP, hence the need to forward all signals from CP to the application.
(We can't presume not a single application on earth will need SIGUSR1 and SIGHUP for useful purposes!)

That said, putting the service node in maintenance mode can be frightfully needed in some situations (e.g. draining currently executing long requests before shutting down the container).

I can't think of any other solution than putting into CP's configuration which signal (if any) should mean "maintenance mode" and which should mean "configuration reload" (defaulting to the current values; an empty signal would disable the functionality).
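To make the idea above concrete, here is a minimal Go sketch of how such a mapping might be evaluated. Everything in it (`SignalConfig`, `MaintenanceSignal`, `ReloadSignal`, `Classify`, the `names` table) is hypothetical and is not an existing ContainerPilot option; it only illustrates "defaults as today, empty string disables, everything else is forwarded to the application".

```go
package main

import (
	"fmt"
	"syscall"
)

// Action is what a supervisor would do with an incoming signal.
type Action string

const (
	Maintenance Action = "maintenance" // toggle maintenance mode
	Reload      Action = "reload"      // reload the supervisor's configuration
	Forward     Action = "forward"     // pass the signal to the application
)

// SignalConfig is a hypothetical configuration fragment. An empty string
// disables the corresponding behavior.
type SignalConfig struct {
	MaintenanceSignal string
	ReloadSignal      string
}

// names maps the signal names used in this sketch to their numbers.
var names = map[string]syscall.Signal{
	"SIGHUP":  syscall.SIGHUP,
	"SIGUSR1": syscall.SIGUSR1,
	"SIGUSR2": syscall.SIGUSR2,
}

// DefaultConfig mirrors the behavior described in this thread:
// SIGUSR1 for maintenance mode and SIGHUP for a configuration reload.
func DefaultConfig() SignalConfig {
	return SignalConfig{MaintenanceSignal: "SIGUSR1", ReloadSignal: "SIGHUP"}
}

// Classify decides what to do with sig under this configuration.
// Unknown or empty names never match, so they leave the signal forwarded.
func (c SignalConfig) Classify(sig syscall.Signal) Action {
	if s, ok := names[c.MaintenanceSignal]; ok && s == sig {
		return Maintenance
	}
	if s, ok := names[c.ReloadSignal]; ok && s == sig {
		return Reload
	}
	return Forward
}

func main() {
	cfg := DefaultConfig()
	for _, name := range []string{"SIGUSR1", "SIGHUP", "SIGUSR2"} {
		fmt.Println(name, "=>", cfg.Classify(names[name]))
	}
}
```

With `SignalConfig{}` (both fields empty), every signal classifies as `Forward`, which is the fully transparent mode asked for here.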

I do not have a specific use case in mind, but I've seen such diverse behaviors in my professional life that I'm really looking to use flexible solutions. I know I'll have to send one of those signals one day, and I want to be ready when that's the case. CP fits my current needs, this is why I'm pushing for it! :-)

@tgross
Contributor

tgross commented Aug 2, 2016

The thing is that if you're sending that signal to do a reconfiguration then you already have either a) entered the container to replace a configuration file, or b) have just triggered one of the event hooks. Otherwise you're HUP'ing the application w/o any changes for it to pick up. So why not take advantage of those options?

I'm trying to avoid configuration sprawl here. "One more config flag" is a genuine usability problem and ContainerPilot already has a lot of configuration options so each new one we add has that much larger of a hurdle to overcome.

@javabean
Author

javabean commented Aug 2, 2016

Very true, let's try to keep CP's configuration as simple as possible and its options to a minimum.

If we consider that reloading CP's configuration (SIGHUP) is an idempotent operation, I would be more than happy if this signal were then simply transmitted on to the application (no additional CP config flag). This way, if the application decides to do something other than reload, it will be fine (in which case we can simply restart the container for CP configuration changes).

I have stronger feelings about SIGUSR1, which really has no standard behavior. Hijacking it for service maintenance mode is a noble cause, but it prevents us from easily sending it to the application. This makes me nervous.

@tgross
Contributor

tgross commented Aug 2, 2016

I would be more than happy if this signal is then simply transmitted back to the application

If the application has no SIGHUP handler, the signal's default action terminates it. So we really can't always re-transmit that either.
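A quick way to check this on a Unix-like system is the Go sketch below. It is only an illustration: the choice of `sleep 5` as a stand-in for "an application with no SIGHUP handler" and the function names are assumptions, not anything from ContainerPilot.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
)

// terminatedBy starts `sleep 5` (which installs no signal handlers),
// sends it sig, and reports whether that signal terminated the child.
func terminatedBy(sig syscall.Signal) (bool, error) {
	cmd := exec.Command("sleep", "5")
	if err := cmd.Start(); err != nil {
		return false, err
	}
	if err := cmd.Process.Signal(sig); err != nil {
		return false, err
	}
	err := cmd.Wait()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// On Unix, Sys() carries the wait status, including the fatal signal.
		if ws, ok := exitErr.Sys().(syscall.WaitStatus); ok {
			return ws.Signaled() && ws.Signal() == sig, nil
		}
	}
	return false, err
}

// hupTerminatesDefault reports whether SIGHUP kills a handler-less child.
func hupTerminatesDefault() bool {
	killed, err := terminatedBy(syscall.SIGHUP)
	return err == nil && killed
}

func main() {
	fmt.Println("SIGHUP terminated the child:", hupTerminatesDefault())
}
```

One caveat: a child inherits ignored signals across exec, so if the parent itself was started with SIGHUP ignored (for example under nohup), the child would survive.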

I have stronger feelings about SIGUSR1, which really has no standard behavior. Hijacking it for service maintenance mode is a noble cause, but it prevents us from easily sending it to the application.

Ok but think about the combinations here:

  • ContainerPilot accepts SIGUSR1 and stops its propagation (current behavior)
  • ContainerPilot ignores SIGUSR1 and stops its propagation
  • ContainerPilot accepts SIGUSR1 and re-transmits to the application
  • ContainerPilot ignores SIGUSR1 and re-transmits to the application

What do we then expect the behavior to be for all the other subprocesses, i.e. the 7 user-defined hooks? Should they also receive the SIGUSR1? In all 4 of the configuration combinations described above?
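To spell out how quickly this multiplies, here is a small Go sketch that enumerates the four combinations with and without delivery to hooks. The `Policy` type and its fields are hypothetical, and tying hook delivery to propagation is an assumption made only for this illustration; none of this is ContainerPilot code.

```go
package main

import "fmt"

// Policy is one of the four combinations listed above for a single signal.
type Policy struct {
	Handle    bool // the supervisor acts on the signal itself
	Propagate bool // the supervisor re-transmits the signal
}

// AllPolicies returns the combinations in the same order as the list above.
func AllPolicies() []Policy {
	return []Policy{
		{Handle: true, Propagate: false}, // current behavior
		{Handle: false, Propagate: false},
		{Handle: true, Propagate: true},
		{Handle: false, Propagate: true},
	}
}

// Recipients lists who ends up acting on the signal. The hooks flag models
// the open question of whether hook subprocesses also receive it; in this
// sketch they only can when the signal is propagated at all.
func (p Policy) Recipients(hooks bool) []string {
	var out []string
	if p.Handle {
		out = append(out, "supervisor")
	}
	if p.Propagate {
		out = append(out, "application")
		if hooks {
			out = append(out, "hooks")
		}
	}
	return out
}

func main() {
	for _, p := range AllPolicies() {
		for _, hooks := range []bool{false, true} {
			fmt.Printf("handle=%-5t propagate=%-5t hooks=%-5t -> %v\n",
				p.Handle, p.Propagate, hooks, p.Recipients(hooks))
		}
	}
}
```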

I've suggested workarounds using the user-defined hooks ("So why not take advantage of those options?", above). Was there a use case in transmitting these signals that prevents us from using the existing hooks as I've asked here?

@javabean
Author

javabean commented Aug 4, 2016

This is getting hairy… :-)

The main "problem" here is that we have a single channel of communication (UNIX signals) for 2 different purposes:

  • sending signals to the main application
  • controlling the "infrastructure" (CP)

If we have to extend CP's set of commands, I would suggest we use the already-existing HTTP server (used for telemetry) to send commands to CP.

In the meantime, as you previously explained, and unless we change configuration options, it should be documented that all signals are passed through except SIGHUP and SIGUSR1. Let's put this into task #197!

@misterbisson
Contributor

@javabean can you say more about why you think of ContainerPilot as an infrastructure component? We're developing it and I certainly think of it as part of the application. That question is probably part of the confusion here, but I'd love to learn more about it.

@javabean
Author

javabean commented Aug 4, 2016

@misterbisson very simple: I see CP as having the same kind of functions as Docker: orchestrating, managing and handling my applications, as opposed to serving user requests. The services offered by CP could be integrated into Docker without looking out of place.

IMHO the boundary between infrastructure and applications is in the code we write to bind them together (e.g. health checks, onChange handlers), which could be considered part of the application, part of the infrastructure, or both.

@tgross
Contributor

tgross commented Aug 8, 2016

@misterbisson @javabean I honestly think trying to ascertain philosophical purity for an application like ContainerPilot which is really designed to fill the cracks in application/infra behaviors is not terribly constructive.

The main "problem" here is that we have a single channel of communication (UNIX signals) for 2 different purposes

Let's start with the understanding that without reaching into the container process space (via docker exec), application containers can only handle signals at PID1. The intended level of abstraction for application containers is that the container can be treated as if it were a single application, even if it is not in practice. Mega-orchestrators like Marathon/Mesos add difficulty to this by having PID1 be /bin/sh, so this problem isn't unique to ContainerPilot.
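For reference, here is roughly what signal pass-through from PID1 to a single child would involve, as a minimal Go sketch. It is not ContainerPilot's implementation: `relay`, `demo`, and the use of `sleep 5` as the child are assumptions for illustration. In a real supervisor the channel would be fed by `signal.Notify`, and signals such as SIGCHLD (and SIGURG, which newer Go runtimes use internally) would probably need to be filtered out rather than forwarded.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// relay passes each signal from sigs to the child process, except signals
// with an entry in intercepted, which run the supervisor's own handler
// instead. It returns when sigs is closed.
func relay(sigs <-chan os.Signal, child *os.Process, intercepted map[os.Signal]func()) {
	for sig := range sigs {
		if handler, ok := intercepted[sig]; ok {
			handler() // consumed by the supervisor, not propagated
			continue
		}
		_ = child.Signal(sig) // best effort: the child may already have exited
	}
}

// demo feeds SIGUSR1 and then SIGTERM through relay. It reports whether
// SIGUSR1 was intercepted and whether SIGTERM reached and ended the child.
func demo() (intercepted, terminated bool, err error) {
	child := exec.Command("sleep", "5")
	if err = child.Start(); err != nil {
		return
	}
	sigs := make(chan os.Signal, 2)
	sigs <- syscall.SIGUSR1
	sigs <- syscall.SIGTERM
	close(sigs)

	relay(sigs, child.Process, map[os.Signal]func(){
		syscall.SIGUSR1: func() { intercepted = true },
	})

	// A child killed by a signal makes Wait return an *exec.ExitError.
	if exitErr, ok := child.Wait().(*exec.ExitError); ok {
		ws, _ := exitErr.Sys().(syscall.WaitStatus)
		terminated = ws.Signaled() && ws.Signal() == syscall.SIGTERM
	}
	return
}

func main() {
	intercepted, terminated, err := demo()
	fmt.Println("SIGUSR1 intercepted:", intercepted)
	fmt.Println("SIGTERM reached child:", terminated)
	fmt.Println("error:", err)
}
```

Even this small version has to decide, per signal, between handling and forwarding, and it only covers one child; the hook subprocesses are not addressed at all.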

So the only things under consideration really are (in order):

  • Can the desired behavior be handled by existing lifecycle hooks? You still haven't provided a specific use case @javabean so until we have one it's going to be hard to answer this.
  • If not, should we pass-thru signals to subprocesses even though this behavior is unusual/unexpected?
  • If so, what should the configuration look like?

If we have to extend CP's set of commands, I would suggest we use the already-existing HTTP server (used for telemetry) to send commands to CP.

Handling HTTP POST expands the ContainerPilot surface area considerably. How would we authenticate and authorize these requests? I would very much like to avoid this kind of expansion of scope.


it should be documented that all signals are passed through except SIGHUP and SIGUSR1.

That's not accurate. Unhandled signals are not passed-thru but instead hit ContainerPilot, which generally means killing the container as it does with any other process that receives an unhandled signal.

@tgross
Contributor

tgross commented Nov 17, 2016

#244 has been opened as a proposal to cover this issue as a 3.x enhancement.

@tgross tgross added v3.0.0 and removed v2 labels Nov 17, 2016
@tgross
Contributor

tgross commented Mar 23, 2017

I'm going to close this issue as #244 really supersedes this concept entirely. Refer to RFD86 and the v3 roadmap in #283 for details.
