Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable mono runtime to handle SIGTERM like CoreCLR #82806 #82813

Closed
wants to merge 6 commits into from

Conversation

nealef
Copy link
Contributor

@nealef nealef commented Mar 1, 2023

Provide fix for #81093 - "Mono does not emit ProcessExit event on SIGTERM"

  • src/mono/mono/mini/mini-posix.c

    • Add signal handler for SIGTERM
  • src/mono/mono/mini/mini-windows.c

    • Add signal handler for SIGTERM
  • src/mono/mono/mini/mini-runtime.c

    • Add mono_sigterm_signal_handler to process SIGTERM that will set a global variable to be monitored by the GC finalizer thread
  • src/mono/mono/mini/mini-runtime.h

    • Define prototype for mono_sigterm_signal_handler()
  • src/mono/mono/metadata/gc.c

    • Monitor for sigterm and kick off the shutdown process when encountered by calling mono_runtime_try_shutdown().
    • Exit with either the user set exitcode (System.Environment.ExitCode) or SIGTERM + 128.

…on SIGTERM"

* src/mono/mono/mini/mini-posix.c
  - Add signal handler for SIGTERM

* src/mono/mono/mini/mini-windows.c
  - Add signal handler for SIGTERM

* src/mono/mono/mini/mini-runtime.c
  - Add mono_sigterm_signal_handler to process SIGTERM that will set a global variable
    to be monitored by the GC finalizer thread

* src/mono/mono/mini/mini-runtime.h
  - Define prototype for mono_sigterm_signal_handler()

* src/mono/mono/metadata/gc.c
  - Monitor for sigterm and kick off the shutdown process when encountered by calling mono_runtime_try_shutdown().
  - Exit with either the user set exitcode (System.Environment.ExitCode) or SIGTERM + 128.
@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Mar 1, 2023
@vargaz
Copy link
Contributor

vargaz commented Mar 1, 2023

Users expect the process to terminate when it's sent a SIGTERM, and adding a handler for that could cause the process to get stuck if the managed code gets stuck etc.

@nealef
Copy link
Contributor Author

nealef commented Mar 2, 2023

Users expect the process to terminate when it's sent a SIGTERM, and adding a handler for that could cause the process to get stuck if the managed code gets stuck etc.

What does coreCLR do in this situation? It fields the SIGTERM and invokes managed code. How does it prevent getting stuck?

@tmds
Copy link
Member

tmds commented Mar 13, 2023

What does coreCLR do in this situation? It fields the SIGTERM and invokes managed code. How does it prevent getting stuck?

It doesn't.

The user code may prevent the application from terminating, for example by blocking the ProcessExit event.

SIGTERM is a friendly request for an app to terminate.

If it doesn't terminate in a timely fashion, the user can send it SIGKILL.

That's what a control process, like systemd will do.

@nealef can you remove the following line, which skips the SigTermExitCode test on Mono:

[ActiveIssue("https://github.com/dotnet/runtime/issues/31656", TestRuntimes.Mono)]

@tmds
Copy link
Member

tmds commented Mar 14, 2023

The SigTermExitCode test is failing, because Environment.ExitCode changes aren't getting picked up.

I think the fix may be something like: call mono_environment_exitcode_set(128+SIGTERM) at the start of mono_sigterm_signal_handler, and call exit(mono_environment_exitcode_get() after mono_runtime_try_shutdown.

  - Set a default exit code before starting the SIGTERM processing
src/mono/mono/metadata/gc.c Outdated Show resolved Hide resolved
nealef added 2 commits March 14, 2023 04:29
  - Set a default exit code before setting the term_signaled variable that gets checked in gc
  - Simplify use of exit code now that a default is being set
@nealef
Copy link
Contributor Author

nealef commented Mar 14, 2023 via email

@tmds
Copy link
Member

tmds commented Mar 14, 2023

Thanks.

Let's see if the test passes for all cases with these changes.

@tmds
Copy link
Member

tmds commented Mar 14, 2023

The test is passing for all cases. So it has the right behavior now concerning the ExitCode.

I'm not familiar enough with mono to approve the implementation.

Copy link
Member

@lambdageek lambdageek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. couple minor nits.

src/mono/mono/mini/mini-runtime.c Outdated Show resolved Hide resolved
src/mono/mono/mini/mini-windows.c Outdated Show resolved Hide resolved
src/mono/mono/metadata/gc.c Outdated Show resolved Hide resolved
@lambdageek
Copy link
Member

Not sure what's up with (Build mono minijit Pri0 Runtime Tests Run windows x64 release) - might be related?

ping @lateralusX

  - Rename term_signaled to match mono style
  - Remove volatile attribute
  - Move testing of shutdown until after the sem wait

* src/mono/mono/mini/mini-runtime.c
  - Rename term_signaled to mono_term_signaled

* src/mono/mono/mini/mini-windows.c
  - Use the correct signal for handler
@@ -189,6 +189,7 @@ mono_runtime_install_handlers (void)
win32_seh_set_handler(SIGFPE, mono_sigfpe_signal_handler);
win32_seh_set_handler(SIGILL, mono_crashing_signal_handler);
win32_seh_set_handler(SIGSEGV, mono_sigsegv_signal_handler);
win32_seh_set_handler(SIGTERM, mono_sigterm_signal_handler);
Copy link
Member

@lateralusX lateralusX Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is going to work on Windows we also need to handle that signal in win32_seh_set_handler as well as react to the exception that will be generated and passed to seh_vectored_exception_handler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the case SIGTERM and the corresponding term_handler variable in exceptions_[x86|amd64] appears straightforward but how are they then used? I see there is exception handling for EXCEPTION_ILLEGAL_INSTRUCTION etc. where the handler is picked up and called, but how and where would this particular signal be fielded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lateralusX @lambdageek Looking for further guidance on the outstanding Windows' issue.

Copy link
Member

@lateralusX lateralusX Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On windows there is no direct mapping with "SIGTERM", the closest we have is the probably SetConsoleCtrlHandler and mapping different scenarios like done by:

This is how its handled by CoreCLR:

::SetConsoleCtrlHandler(DbgCtrlCHandler, TRUE/*add*/);

So for Mono Windows handling of "SIGTERM" it probably need to setup a console ctrl handler and then react on the event as part of that handler and can't be handled through existing vectorized exception handling logic.

@@ -892,6 +896,12 @@ finalizer_thread (gpointer unused)
}
wait = TRUE;

/* Just in case we've received a SIGTERM */
if (mono_term_signaled) {
Copy link
Member

@lateralusX lateralusX Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is updated from a different thread without any locking, maybe we would need a read memory barrier to make sure we see updated value, especially when wait == FALSE,. There might be one hidden in the mono_coop_sem_timedwait, looks like at least sem_trywait seems to issue memory barriers, and other implementations probably do as well (like WaitForSingleObjectEx), so maybe not needed in the end, but we depend on implementation details. I guess we can leave it as is for now, just wanted to make a comment/note around potential but adding a read barrier in that code path shouldn't be too dramatic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that at one point, this PR had mono_term_signaled as volatile.

Maybe that (or something else) is needed to ensure these lines don't reorder, causing the wrong exit code to be picked up.

This is how the exit code is set:

mono_environment_exitcode_set(128+SIGTERM);	/* Set default exit code */
mono_term_signaled = TRUE;

Copy link
Member

@lateralusX lateralusX Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always do acquire/release semantics on mono_term_signaled and we would make sure this is not reordered by compiler or CPU. Just using volatile on mono_term_signaled will only prevent compiler to reorder load/store but CPU is still free to do load/store reorder.

@nealef
Copy link
Contributor Author

nealef commented Mar 21, 2024

Closing this as a new PR was generated that includes Windows has been created: #100056

@nealef nealef closed this Mar 21, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-VM-meta-mono community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants