Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove SIGTERM/SIGINT handling #72

Closed
wants to merge 1 commit into from

Conversation

cgwalters
Copy link

The previous code had bugs, see:
62fbff0
The version in RHEL doesn't have this patch, and we're hitting
shutdown bugs in RHEL CoreOS.

There's no actual good reason though to have a SIGTERM handler
at all; doing things like pthread_join() is pointless becuase
the kernel will take care of removing all resources.

In fact it's really an anti-pattern to do this, because if every
process has these types of things, it makes shutdowns much less
efficient as the process needs to be fully scheduled in rather
than just having the kernel take care of it.

The previous code had bugs, see:
62fbff0
The version in RHEL doesn't have this patch, and we're hitting
shutdown bugs in RHEL CoreOS.

There's no actual good reason though to have a `SIGTERM` handler
at all; doing things like `pthread_join()` is pointless becuase
the kernel will take care of removing all resources.

In fact it's really an anti-pattern to do this, because if every
process has these types of things, it makes shutdowns much less
efficient as the process needs to be fully scheduled in rather
than just having the kernel take care of it.
@cgwalters
Copy link
Author

@nhorman nhorman self-requested a review November 6, 2019 18:14
@nhorman
Copy link
Owner

nhorman commented Nov 6, 2019

I don't agree in principle with the notion of not having an application gracefully release its resources on shutdown.

Instead of removing that code, why not just open a bug with RHEL to backport the needed change to fix the problem? You're going to have to do that anyway to get any change made here today into RHEL

@nhorman nhorman closed this Nov 6, 2019
@cgwalters
Copy link
Author

I don't agree in principle with the notion of not having an application gracefully release its resources on shutdown.

Can you elaborate more?

I think this isn't just bikeshedding, it's gets to a really fundamental principle of systems design. See also e.g. https://en.wikipedia.org/wiki/Crash-only_software - that strongly influenced ostree. I spent a lot of time designing it so that it's totally safe to be interrupted at any point. It was only recently that I was forced to add a SIGTERM handler for a corner case, see ostreedev/ostree#1049

@cgwalters
Copy link
Author

You're going to have to do that anyway to get any change made here today into RHEL

Well...we could also change the systemd unit easily in RHCOS to use TimeoutStopSec=1 too.

@nhorman
Copy link
Owner

nhorman commented Nov 7, 2019

Sure, we could do that too, but thats not my point. My point is that this is already fixed here in the upstream project, and regardless of what you want the fix to be, it will need to be backported into RHEL. So why not take the path of least resistance and open a RHEL bz requesting that the existing fix be backported?

@cgwalters
Copy link
Author

@nhorman
Copy link
Owner

nhorman commented Nov 7, 2019

thank you, I'll take care of it ASAP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants