-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std.posix.accept does not return when interrupted. #19445
Comments
TL;DR: I think this is a duplicate of #6193. As I understand the current situation, almost all of the Zig POSIXy syscall wrappers retry on The cost of this approach is that blocking syscalls can become uninterruptible, so Zig programs will have to do something more drastic like close the blocked-on file descriptor (or kill the containing process and let the OS do the cleanup) when they want a blocked thread to give up on an operation. Or use non-blocking APIs. (Or perhaps Zig could support a timeout on blocking syscalls?) For your use case, does closing the file descriptor work? Zig could support a flag to say a syscall should be interruptible, but that gets complicated quickly. Its either an extra parameter on every syscall (and every wrapper), or its a global/thread flag that would apply to all syscall invocations (but that would break any invocations not ready for an interruption). In the short-term, as a work-around I believe you can link against glibc (or musl) to get the C Library's interruptible syscall semantics. Personally, I'm skeptical of Zig trying to repair broken POSIX semantics like this. So I suspect eventually Zig will be changed to return EINTR to callers (at least in the POSIXy APIs, but maybe not in the higher-level APIs?), but EINTR is a complicated and annoying behavior, so it seems worth a try. See also #2425. |
Abstractions are never perfect, it's a tradeoff between generality and simplicity. However, in this case the abstraction is opt-in. That being said, I don't see much downside to adding an |
FWIW, Python wrappers also retry on |
I actually tried closing the stream, but that turns out to be worse. The underlying system call returns with BADF which causes an "unreachable" panic. As a temporary work around I'm calling std.posix.poll (which also loops on INTR, but does supply a timeout which allow me an escape from the loop) and only do the accept if there is a connection waiting. I concur this issue is covered by #6193 |
I think the "EBADF causes unreachable" issue is a separate bug. See #6389. Generally patches motivated by real code, as in your case, are acceptable motivations to relax Zig's errno handling assumptions. The EBADF can either be returned or could be mapped to Unexpected. (E.g., see #3880.) The PEP 475 link is a solid description of the motivations for hiding and handling interruptions. I'd argue something like that belongs in Zig's OS-agnostic infrastructure, and the "posix wrapper" layer shouldn't quite get that involved with dispatching signals, but it really depends on what level of fidelity Zig wants to take for this layer. |
In this case this is exposing a bug in the program which is what we want.
Once you close a file descriptor, that descriptor is no longer valid and attempting to invoke a system call with it is the equivalent to undefined behavior at the OS level. That file descriptor could for example have already been re-used by opening a new file and now you're passing a different file to whatever system call you're invoking that you thought! This means the app now has undefined behavior that's hidden behind a race condition, one of the worst kind of bugs an program can have, and why in this case Zig does whatever it can to help you catch it and discourage its use. I'm not sure what your code looks like, but, if you're wanting a listen socket to pop, you can call shutdown on it. Or if you want to continue using a signal, you can call P.S. it's very difficult to use a signal to handle when to close your listen socket without introducing a race condition. It's very similar to multithreading. You would need to implement some some of synchronization mechanism between the code that uses the listen socket and the signal handler. For example, before you use the listen socket you could set a global "lock_listen_socket" to true and make sure that your signal handler doesn't close it if it's being used, then if the signal handler does execute, add more global state to let the non-signal handling code it's time to close the socket if the signal handler ran while it was being used. |
duplicate of #6193 |
Zig Version
0.12.0-dev.3352+95cb93944
aarch64 GNU/Linux
Steps to Reproduce and Observed Behavior
I have some code that uses std.net.server.accept to accept connections. If I send the process a SIGTERM (which I handle with my own sigaction) the accept does not return and the process remains blocked. This makes it difficult to gracefully clean things up.
Expected Behavior
the accept function should return an "Interrupted" error.
Other Information
The actual problematic behavior is in the accept function in /std/posix.zig. That function calls the system accept function in a loop whose whole purpose seem to be to cause this behavior.
If this behavior is desired in some cases, perhaps a flag could be added to toggle it on or off.
The text was updated successfully, but these errors were encountered: