-
Notifications
You must be signed in to change notification settings - Fork 450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make resource failure when cc parallel is enabled #962
Comments
To fix this error, we have to clear We can set the The error in the linked issue happens, because cc set It's a bit more expensive given that 2 more syscalls is issued but should be ok. Ideally I want to find a way to avoid setting Using |
I just realized that setting and clearing I've discovered So I'm crossing my fingers that newer version of Otherwise, I could submit a patch to jobserver-rs to always use the new named pipe, which does not trigger this bug, while passing the fds for backwards compatibility. E.g. |
Closing this in favor of jobserver issue |
Reopen this since I think it will be faster to apply the quick fix to cc, updating jobserver to use fifo will take quite some time. Even when jobserver is updated to fifo, it's still possible for make to fail if the jobserver is created by an older version of |
It's unfortunate that we don't have anyway to clone underlying file descriptor so that the new fd has an independent state from the original to let us set |
At least on Linux you can open /dev/fd/$FD to open a separate FD pointing to the same underlying file. |
@Amanieu If you are ok with it, I can add this as workaround on Linux and then fallback to setting |
Note that since gnu make 4.3 they did switch to nonblocking pipes. Though due to an optimization in the linux kernel blocking pipe reads should actually perform better. Edit: Pipes, not sockets |
Thanks, but according to @nadenf make 4.4.1 still doesn't work, which is strange Makes me suspect if I got this wrong. |
I verified in |
Hitting this issue after updating to |
cc 1.0.87 has released, which shall fix this issue. |
Hello @NobodyXu , unfortunately I still have the issue with cc 1.0.88 for both MacOS and iOS, the build progresses a bit more but eventually fails for the same reason. It only works when using |
@n-eq which crate are you building? I'd like to check its |
I'm working on a project I can't share, but it's using openssl and zstd. I'll try to wrap up a minimal reproducible example asap |
Thanks! If it is openssl, then I'm pretty sure it's fixed, since others also confirmed it. For zstd, it only uses cc and doesn't use make at all, so I don't think that's the cause either. @n-eq I think there're some other crates in your dependency graphs that also depends on cc and has a build.rs Recommend to do a |
Hello @NobodyXu |
Yes, 1.0.87 also has the bug, it is fixed in 1.0.88 |
Hello @NobodyXu, here's the result of
|
Hmmm @n-eq can you show me the build failure output? Should contain the crate which failed. |
Sure, here's the full error log below (building for Detailserror: failed to run custom build command for Caused by: *** OpenSSL has been successfully configured *** *** If you encounter a problem while building, please open an *** *** perl configdata.pm --dump *** *** (If you are new to OpenSSL, you might want to consult the *** running cd "/Users/neq/Dev/private/target/debug/build/openssl-sys-e386d53d1da68088/out/openssl-build/build/src" && "make" "depend" --- stderr Error building OpenSSL: |
Thanks, I'm suspecting the Can you try #974 and see if there's any line in the Otherwise, I might just change the implementation to spawn a thread for jobserver and avoid setting non-blocking. |
Erratum: It works when switching to |
It's probably due to the resolution order of The dir containing latest |
@NobodyXu Yes I did that... but as it turns out the |
This sounds like worth opening a separate issue for this. |
Don't mind the GHA runner issue. I think it's better to focus on why the
build fails with make < 4 versions
|
Thanks, confirmed that I can reproduce this on my M2 |
Here is how to override default
|
I patched my build using the commit from #974 and got the following output which is the same error as before and it doesn't show the message you asked about:
|
Just piling up, this also happened on macOS when we upgraded Firefox to use cc 1.0.88. |
Working on a solution to not set BTW, I wonder does it happen on Linux? Or is it just macOS? What about freeBSD which IIRC macOS is based on? |
@NobodyXu no it doesn't happen on Linux. In my case it's MacOS/iOS targets |
That's really, really strange, because we use exactly the same code for all unix. |
I think it might not be It uses Openssl does call So it is really strange at the first place that it gets affected at the first place. |
I also tried adding logging to jobtoken client initialisation code, but nothing gets printed. |
I figured I should add this information: within the Firefox build, we actually have make invoke cargo (and we make make pass its pipes to cargo). And it's that outer make that prints the |
At that point, why keep the custom jobserver implementation? Wasn't the main reason to avoid using a thread in the first place? Wouldn't it make sense to just go back to using the jobserver crate? |
Yes it would definitely be sensible. However I discovered that, at least in openssl's case, its build.rs does not invoke compile_objects or any jobserver related code.
Combined with this, I start to suspect if this is a macOS bug or something, because cc-rs never touches other processes' fd and try to set From the info I gathered so far, this bug now only happens on macOS, not Linux. This seems to suggest something in macOS is at fault to me. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Confirmed on manually that |
Is it really a bug, though, when the expected behavior is under-defined? (like, really, I couldn't find anything in the POSIX spec that would qualify this behavior as a bug, but maybe I didn't dig deep enough) |
This comment has been minimized.
This comment has been minimized.
Confirmed by @the8472 , which seems to be a semantics of This is really strange and this is the first time I found this out, so @glandium you are right, thank you for correcting me. |
BTW, your script has the same output on Linux as the one you pasted. |
I've opened #985 to fix this, verified locally by building Can someone double-check it on their system/CI to see if it fixed the issue please? |
Since cc >=1.0.80, we set the jobserver inherited to
O_NONBLOCK
, make probably isn't designed for this, which is why it failed.I will open a new ticket in cc and fix it.
Originally posted by @NobodyXu in rust-lang/cargo#13476 (comment)
The text was updated successfully, but these errors were encountered: