
Join worker threads on pool shutdown #3794

Merged: 3 commits into typelevel:series/3.x, Sep 26, 2023

Conversation

armanbilge
Member

This (should) fix an issue that @antoniojimeneznieto and I discovered in armanbilge/fs2-io_uring#78.

We were observing segfaults in the io_uring native calls when e.g. terminating an http4s Ember server via Ctrl-C, crashing the entire JVM.

The problem is that during shutdown of the WorkStealingThreadPool, after interrupting each worker thread it immediately proceeds to close that thread's poller, even though the thread may still be actively using it. The proposed solution is to join() each thread to await its termination before closing its resources. This should work in happy paths, but there are some caveats in pathological scenarios ... inline comments incoming.

build.sbt Outdated

```scala
Test / fork := true,
fork := true,
```
Member Author

For easily running the IOAppSpec example apps individually.

Comment on lines 704 to 713

```diff
       while (i < threadCount) {
-        workerThreads(i).interrupt()
+        val workerThread = workerThreads(i)
+        if (workerThread ne currentThread) {
+          workerThread.interrupt()
+          workerThread.join()
+          // wait to stop before closing pollers
+        }
         system.closePoller(pollers(i))
         i += 1
       }
```
Member Author

We might consider putting a timeout on this join, in case the thread is really stuck (e.g. a non-interruptible blocking call on the compute pool, a deadlock, or a spinloop). But if we do that, we should probably only close the poller if the thread is known to have terminated, and otherwise just leak it? Better than a segfault.
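Roughly, a hedged sketch of what that could look like (standalone, not the actual WorkStealingThreadPool code; the helper name and timeout parameter are illustrative):

```scala
// Hedged sketch, not the actual implementation: join each worker with a bounded
// timeout and only close its poller once the thread is known to have terminated;
// otherwise deliberately leak the poller rather than risk a use-after-free.
def shutdownSketch[Poller](
    workerThreads: Array[Thread],
    pollers: Array[Poller],
    closePoller: Poller => Unit,
    joinTimeoutMillis: Long // illustrative; could reuse the existing shutdown timeout
): Unit = {
  val currentThread = Thread.currentThread()
  var i = 0
  while (i < workerThreads.length) {
    val workerThread = workerThreads(i)
    if (workerThread ne currentThread) {
      workerThread.interrupt()
      workerThread.join(joinTimeoutMillis) // bounded wait, in case the worker is stuck
    }
    if ((workerThread eq currentThread) || !workerThread.isAlive())
      closePoller(pollers(i)) // safe: nothing can still be using this poller
    // else: leak the poller; better than a segfault
    i += 1
  }
}
```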

Contributor

Yeah, I think a final timeout would be good. About leaking vs. segfault, I'm not sure (see below).

Comment on lines 39 to +40

```diff
   val run: IO[Unit] =
-    IO(System.exit(0)).uncancelable
+    IO.blocking(System.exit(0)).uncancelable
```
Member Author

This is an example of a deadlock that came up in our test suite. Actually, System.exit(...) is technically blocking, because it blocks waiting for shutdown hooks to complete.

If we run it on the compute pool, then it gets stuck waiting for the runtime to shut down, which is stuck waiting for this thread to complete, ... deadlock.
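A tiny standalone reproduction of the same shape of deadlock, outside of cats-effect (the hook and thread here are illustrative, not the actual IOApp machinery):

```scala
// Hedged sketch of the deadlock shape: a shutdown hook waits for a thread that
// is itself blocked inside System.exit waiting for the shutdown hooks to finish.
object ExitDeadlockSketch {
  def main(args: Array[String]): Unit = {
    val worker = new Thread(() => System.exit(0), "worker") // exit blocks on hooks ...
    // ... and this hook blocks on the worker, which never leaves exit: deadlock
    Runtime.getRuntime.addShutdownHook(new Thread(() => worker.join()))
    worker.start()
    worker.join()
  }
}
```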

@armanbilge
Member Author

Can confirm this fixes the segfault. Here's the fixed version; to reproduce the error, remove the explicit CE dependency.

```scala
//> using repository https://s01.oss.sonatype.org/content/repositories/releases/
//> using repository https://s01.oss.sonatype.org/content/repositories/snapshots/
//> using dep org.typelevel::cats-effect::3.6-8a37117
//> using dep com.armanbilge::fs2-io_uring::0.3-4e332b9-SNAPSHOT
//> using dep org.http4s::http4s-ember-server::0.23.23

import cats.effect.*
import org.http4s.ember.server.EmberServerBuilder
import fs2.io.uring.net.UringSocketGroup
import fs2.io.uring.unsafe.UringSystem

object App extends IOApp.Simple:
  override protected def pollingSystem = UringSystem
  def run = EmberServerBuilder.default[IO].withSocketGroup(UringSocketGroup[IO]).build.useForever
```

@durban
Contributor

durban commented Sep 5, 2023

It's not entirely clear to me, what is segfaulting here. The JVM itself? Is that a JVM bug?

@armanbilge
Member Author

It's not entirely clear to me, what is segfaulting here. The JVM itself? Is that a JVM bug?

Ah sorry, should have clarified. In fs2-io_uring we are calling native code via JNI; that is what is segfaulting. It's not a JVM bug.

@durban
Contributor

durban commented Sep 5, 2023

Hm, okay. So, in my opinion, segfault: very bad, should be very much avoided. Leaking the poller: still bad, but probably less so. Waiting forever for a thread: also pretty bad. Overall, it seems to me that a generous timeout for a join and then leaking the poller (if not joined) would be the least bad solution. (I should actually read the fs2-io_uring code, because I have no idea what leaking a poller actually means. Is it some kind of io_uring data structure allocated by native code?)

However, I'm somewhat concerned that it's so easy to get a segfault here. (On the JVM we're generally well protected from these very bad things. Are we giving this up?) But it may be that this is a practically unavoidable consequence of using this Linux API... I should really read that code :-)

@armanbilge
Member Author

So, in my opinion, segfault: very bad, should be very much avoided. Leaking the poller: still bad, but probably less so. ... Overall, it seems to me that a generous timeout for a join and then leaking the poller (if not joined) would be the least bad solution.

Yep, agreed. We already have a timeout for shutdown (i.e. how long to wait for cancelation of the main fiber to complete). I wonder if we could re-use that, or whether it deserves its own configuration. Also, since we have to wait for multiple threads, should we keep track of the accumulated shutdown duration and give each thread less and less time to finish up?


I have no idea what leaking a poller actually means. Is it some kind of io_uring data structure allocated by native code

Yes, exactly. This includes data structures at both the user and kernel levels (e.g. every io_uring is backed by a file descriptor). Similarly for epoll and kqueue.

So besides leaving some memory uncleared and unfreed, it also means not closing an fd. /shrug


However, I'm somewhat concerned that it's so easy to get a segfault here. (On the JVM we're generally well protected from these very bad things. Are we giving this up?)

The JVM also calls native APIs, e.g. Selector uses epoll. So it's a fair question to ask: how is the JVM protecting us from this bad thing? The answer is simple: locks and synchronization. By synchronizing on the Selector and its methods, we get a guarantee that a Selector won't be closed while somebody is in the middle of using it, and we can check a boolean flag for whether it's open or closed before attempting the native call that would segfault if it's indeed closed.
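A hedged sketch of that pattern (illustrative, not the JDK's actual Selector internals): every use and close go through the same lock and check an open flag, so the native resource can never be released mid-call.

```scala
// Hedged sketch of the Selector-style protection, not the JDK's actual code:
// native calls and close() share one lock and check an `open` flag, so the
// underlying native resource is never freed while a call is in flight.
final class GuardedPoller(
    nativeHandle: Long,        // illustrative stand-in for the native resource
    pollNative: Long => Unit,  // illustrative native call
    closeNative: Long => Unit  // illustrative native cleanup
) {
  private[this] val lock = new Object
  private[this] var open = true

  def poll(): Unit = lock.synchronized {
    if (!open) throw new java.nio.channels.ClosedSelectorException
    pollNative(nativeHandle) // safe: close() cannot run concurrently with this
  }

  def close(): Unit = lock.synchronized {
    if (open) {
      open = false
      closeNative(nativeHandle)
    }
  }
}
```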

However, adding this sort of synchronization undoes all the optimizations we are going for with this I/O-integrated runtime 😁 so segfaults come with the territory.

@durban
Contributor

durban commented Sep 7, 2023

segfaults come with the territory

This sounds terrifying to me, but I'm not gonna take more of your time until I've actually looked at the code in question... :-)

@armanbilge
Member Author

As a concrete example:

You manually allocate some (off-heap) memory when creating a poller. Thus, closing the poller requires freeing that memory.

Then consider a situation where:

  • the thread that owns that poller/memory is writing to that memory e.g. submitting an I/O event
  • in parallel, that thread is being interrupted during runtime shutdown, the poller closed, and the memory freed.

Writing to freed memory can cause a segfault. This is pretty fundamental, and the only way to avoid it is to use some form of synchronization to guarantee that memory cannot be freed while it is in use.
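For illustration only, the same failure mode can be reproduced in isolation with off-heap memory (sun.misc.Unsafe here is just a stand-in for the native allocations a poller makes; a crash is not guaranteed, since it's undefined behavior):

```scala
// Hedged sketch: writing to memory after freeing it is undefined behavior and
// can segfault the whole JVM (or appear to work, or corrupt unrelated memory).
import sun.misc.Unsafe

object UseAfterFreeSketch {
  def main(args: Array[String]): Unit = {
    val field = classOf[Unsafe].getDeclaredField("theUnsafe")
    field.setAccessible(true)
    val unsafe = field.get(null).asInstanceOf[Unsafe]

    val addr = unsafe.allocateMemory(8L) // like allocating a poller's native buffers
    unsafe.putLong(addr, 42L)            // normal use by the owning thread
    unsafe.freeMemory(addr)              // "closing the poller"
    unsafe.putLong(addr, 43L)            // use-after-free: may crash the process
  }
}
```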

@durban
Contributor

durban commented Sep 20, 2023

Thank you for the explanation; I understand (more or less) now (I've also looked at the PollingSystem).

I think giving each thread less and less time only makes sense if you interrupt all of them first and then join them one by one.

Leaking a file descriptor is not ideal, but it's still less wrong than using freed memory or other segfault/undefined-behavior situations like that.

This is pretty fundamental and the only way to avoid it is to use some form of synchronization to guarantee that memory cannot be freed while it is in use.

Yeah. In fact, this PR does use some form of synchronization: Thread#join. (If I understand correctly, the Pollers are only used by one worker thread + the one doing the close.) Actually, not closing a poller while it is in use should be in the contract of PollingSystem, right?

About segfaults being terrible: what I was trying to say is that while this PR fixes this specific segfault, I'm concerned about other similar cases. (Although, now that I've looked at the API, I'm less concerned. The thread-locality of pollers should help.)

@armanbilge
Member Author

I think giving each thread less and less time only makes sense if you interrupt all of them first and then join them one by one.

That sounds right.
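Something along these lines, perhaps (a hedged sketch with illustrative names, not a real implementation): interrupt every worker up front, then join them one by one against a single shared deadline, so threads that finish quickly leave more of the budget for the stragglers.

```scala
// Hedged sketch: interrupt all workers first, then join each one against a
// shared deadline derived from a single overall shutdown budget.
def interruptThenJoinAll(workers: Array[Thread], totalBudgetMillis: Long): Unit = {
  val self = Thread.currentThread()
  workers.foreach(w => if (w ne self) w.interrupt())

  val deadline = System.nanoTime() + totalBudgetMillis * 1000000L
  workers.foreach { w =>
    if (w ne self) {
      val remainingMillis = (deadline - System.nanoTime()) / 1000000L
      if (remainingMillis > 0) w.join(remainingMillis)
      // if w is still alive at this point, its poller should be leaked, not closed
    }
  }
}
```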

Yeah. In fact, this PR does use some form of synchronization: Thread#join.

Exactly my point :)

Actually, not closing a poller while it is in use should be in the contract of PollingSystem right?

Yes, I suppose so. Or more generally: the PollingSystem API makes no promises of thread-safety; it's up to the caller to synchronize where necessary. This is one of those cases.

I'm concerned about other similar cases. (Although, now that I've looked at the API, I'm less concerned. The thread-locality of pollers should help.)

Fair enough, and yeah, we're just going to have to be careful. That's why for now we're only shipping SelectorSystem in CE core and incubating io_uring and stuff externally.

Thanks for taking a deeper look at these issues :)

@djspiewak
Member

Merge with head?

@djspiewak djspiewak merged commit e5f5880 into typelevel:series/3.x Sep 26, 2023
29 of 33 checks passed