Join worker threads on pool shutdown #3794
Conversation
build.sbt (outdated)

```diff
-Test / fork := true,
+fork := true,
```
For easily running the `IOAppSpec` example apps individually.
```diff
 while (i < threadCount) {
-  workerThreads(i).interrupt()
+  val workerThread = workerThreads(i)
+  if (workerThread ne currentThread) {
+    workerThread.interrupt()
+    workerThread.join()
+    // wait to stop before closing pollers
+  }
   system.closePoller(pollers(i))
   i += 1
 }
```
We might consider putting a timeout on this `join`, in case the thread is really stuck (e.g. a non-interruptible blocking call on the compute pool, a deadlock, or a spinloop). But if we do that, we should probably only close the poller if the thread is known to be terminated, and otherwise just leak it? Better than a segfault.
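A minimal sketch of what that could look like inside the shutdown loop (`shutdownTimeout` is a hypothetical config value, assumed positive, since `Thread#join(0)` waits forever):

```scala
// Sketch only: bound the join, then close the poller only if the worker
// actually terminated; otherwise deliberately leak it.
workerThread.interrupt()
workerThread.join(shutdownTimeout.toMillis) // returns once the timeout elapses, even if the thread is still alive
if (!workerThread.isAlive())
  system.closePoller(pollers(i))
// else: leak the poller; a leaked fd/buffer is better than a segfault
```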
Yeah, I think a final timeout would be good. About leaking vs. segfault, I'm not sure (see below).
```diff
 val run: IO[Unit] =
-  IO(System.exit(0)).uncancelable
+  IO.blocking(System.exit(0)).uncancelable
```
This is an example of a deadlock that came up in our test suite. Actually, `System.exit(...)` is technically blocking, because it blocks waiting for shutdown hooks to complete. If we run it on the compute pool, then it gets stuck waiting for the runtime to shut down, which is stuck waiting for this thread to complete ... deadlock.
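For reference, the underlying JVM behavior can be demonstrated without cats-effect at all. A self-contained sketch (`ExitDeadlock` is just an illustrative name):

```scala
object ExitDeadlock:
  def main(args: Array[String]): Unit =
    val caller = Thread.currentThread()
    // System.exit runs the registered shutdown hooks and blocks the calling
    // thread until they finish; a hook that joins the caller can never finish.
    Runtime.getRuntime.addShutdownHook(new Thread(() => caller.join()))
    System.exit(0) // never returns: deadlocked with its own shutdown hook
```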
Can confirm this fixes the segfault. Here's the fixed version; to repro the error, remove the explicit CE dependency.

```scala
//> using repository https://s01.oss.sonatype.org/content/repositories/releases/
//> using repository https://s01.oss.sonatype.org/content/repositories/snapshots/
//> using dep org.typelevel::cats-effect::3.6-8a37117
//> using dep com.armanbilge::fs2-io_uring::0.3-4e332b9-SNAPSHOT
//> using dep org.http4s::http4s-ember-server::0.23.23

import cats.effect.*
import org.http4s.ember.server.EmberServerBuilder
import fs2.io.uring.net.UringSocketGroup
import fs2.io.uring.unsafe.UringSystem

object App extends IOApp.Simple:
  override protected def pollingSystem = UringSystem
  def run = EmberServerBuilder.default[IO].withSocketGroup(UringSocketGroup[IO]).build.useForever
```
It's not entirely clear to me what is segfaulting here. The JVM itself? Is that a JVM bug?

Ah sorry, I should have clarified. In fs2-io_uring we are calling native code via JNI. That is what's segfaulting. It's not a JVM bug.
Hm, okay. So, in my opinion: a segfault is very bad and should be very much avoided. Leaking the poller is still bad, but probably less so. Waiting forever for a thread is also pretty bad. Overall, it seems to me that a generous timeout for the join, and then leaking the poller (if not joined), would be the least bad solution. (I should actually read the fs2-io_uring code, because I have no idea what leaking a poller actually means. Is it some kind of io_uring data structure allocated by native code?)

However, I'm somewhat concerned that it's so easy to get a segfault here. (On the JVM we're generally well protected from these very bad things. Are we giving this up?) But it may be that this is a practically unavoidable consequence of using this Linux API... I should really read that code :-)
Yep, agreed. We already have a timeout for shutdown (i.e. how long to wait for cancelation of the main fiber to complete). I wonder if we could re-use that, or whether it deserves its own configuration. Also, since we have to wait for multiple threads, should we keep track of the accumulated shutdown duration and give each thread less and less time to finish up?
Yes, exactly. This includes data structures at both the user and kernel levels (e.g. every io_uring is backed by a file descriptor). Similarly for epoll and kqueue. So besides leaving some memory uncleared and unfreed, it also means not closing an fd. /shrug
The JVM also calls native APIs itself, but it guards them with synchronization. However, adding this sort of synchronization undoes all the optimizations we are going for with this I/O-integrated runtime 😁 so segfaults come with the territory.
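As an aside, the JVM's newer off-heap API shows what that protection looks like: a liveness check on every access turns use-after-free into an exception instead of undefined behavior. A small sketch (assuming the JDK 22+ foreign memory API, which is not something this PR uses):

```scala
import java.lang.foreign.{Arena, ValueLayout}

val arena = Arena.ofConfined()
val segment = arena.allocate(1024)
arena.close()
// The liveness check costs a branch on every access, but it converts a
// use-after-free into a clean failure rather than a crash:
segment.set(ValueLayout.JAVA_LONG, 0L, 42L) // throws IllegalStateException
```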
This sounds terrifying to me, but I'm not gonna take more of your time until I've actually looked at the code in question... :-)
As a concrete example: you manually allocate some (off-heap) memory when creating a poller, so closing the poller requires freeing that memory. Then consider a situation where:

- one thread closes the poller and frees its memory, while
- another thread is still using the poller and writes to that (now freed) memory.

Writing to freed memory can cause a segfault. This is pretty fundamental, and the only way to avoid it is to use some form of synchronization to guarantee that memory cannot be freed while it is in use.
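To make the hazard concrete, here is an illustration-only sketch where `sun.misc.Unsafe` stands in for what the native code does with malloc/free (don't run this expecting a clean failure):

```scala
val field = classOf[sun.misc.Unsafe].getDeclaredField("theUnsafe")
field.setAccessible(true)
val unsafe = field.get(null).asInstanceOf[sun.misc.Unsafe]

val pollerState = unsafe.allocateMemory(1024) // the poller allocates off-heap state
unsafe.freeMemory(pollerState)                // shutdown thread "closes" the poller
unsafe.putLong(pollerState, 42L)              // a worker still writing: undefined behavior, possible JVM crash
```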
Thank you for the explanation, I understand (more or less) now (I've also looked at the PollingSystem). I think giving each thread less and less time only makes sense if you interrupt all of them first, then join them one by one. Leaking a file descriptor is not ideal, but it's still less wrong than using freed memory or other segfault/undefined-behavior things like that.
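Putting those pieces together, a sketch of a two-phase shutdown (the names `workerThreads`, `pollers`, `system`, and `threadCount` are from the PR; `shutdownTimeout` is an assumed config value; the PR's current-thread guard is omitted for brevity):

```scala
// One shared deadline bounds the total shutdown time.
val deadline = System.nanoTime() + shutdownTimeout.toNanos

// Phase 1: interrupt every worker up front, so they all start winding down.
var i = 0
while (i < threadCount) {
  workerThreads(i).interrupt()
  i += 1
}

// Phase 2: join each worker against the shared deadline, and only close the
// pollers of threads that actually terminated.
i = 0
while (i < threadCount) {
  val worker = workerThreads(i)
  val remainingMs = (deadline - System.nanoTime()) / 1000000L
  if (remainingMs > 0) worker.join(remainingMs) // Thread#join(0) would wait forever
  if (!worker.isAlive()) system.closePoller(pollers(i))
  // else: leak this poller rather than free memory a live thread may touch
  i += 1
}
```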
Yeah. In fact, this PR does use some form of synchronization: joining each worker thread before closing its poller. About segfaults being terrible: what I was trying to say is that while this PR fixes this specific segfault, I'm concerned about other similar cases. (Although, now that I've looked at the API, I'm less concerned. The thread-locality of pollers should help.)
That sounds right.
Exactly my point :)
Yes, I suppose so. Or more generally: the
Fair enough, and yeah, we're just going to have to be careful. That's why for now we're only shipping …

Thanks for taking a deeper look at these issues :)
Merge with head?
This (should) fix an issue that @antoniojimeneznieto and I discovered in armanbilge/fs2-io_uring#78.
We were observing segfaults in the io_uring native calls when e.g. terminating an http4s Ember server via Ctrl-C, crashing the entire JVM.
The problem is that during shutdown of the `WorkStealingThreadPool`, after interrupting each worker thread it immediately proceeds to close the poller for that thread, even though the thread may still be actively using it. The proposed solution is to `join()` each thread to await its termination before closing its resources. This should work in happy paths, but there are some caveats in pathological scenarios ... inline comments incoming.