-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nondeterminism from codegen-units #128675
Comments
Arch Linux is currently also having issues with the rustc and rustdoc binaries being unreproducible: |
This produces binaries that are better optimized and deterministic. This may become obsolete when rust-lang/rust#128675 is resolved.
CGU partitioning is very deliberately designed to be deterministic. |
@bmwiedemann It's unclear to me exactly what you are trying to report. In llvm/llvm-project#72206 you just generally refer to " different binaries in every build, even when trying to run builds as similar as possible." But in pop-os/launcher#230 you are pointing out that changing the build environment but not the build settings changes the build. That has a fair chance of being a different rustc bug than the issue you filed on LLVM. So I do not think it is accurate to say this started in 1.73. @kpcyrd What are you trying to report? All you linked is a diff from a system that is completely unfamiliar, with no enlightening context. I can't tell whether you're just leaving a "seeing this too" comment (which is entirely unnecessary), or if you're trying to provide additional data for debugging, or if you're trying to report a third variation. |
Based on the response to @bmwiedemann's original report, binaries are reproducible based on codegen units if you use the same number of codegen units. Thus, closing as already-completed. @kpcyrd "I'm having trouble making these binaries reproducible" doesn't distinguish your report from any other extant A-reproducibility issue on the tracker, so please open a new issue with an actual report. Build logs are not an issue report. Even if they were, each distribution uses a wildly different format so it is not possible for any lessons to be learned and applied from distro to distro. The day SUSE and Arch and Debian and everyone else standardize on using exactly the same tooling with exactly the same configuration is nowhere in sight. |
@bmwiedemann If I have made a mistake in identifying what you are reporting and you wish to reopen this, then there is one key improvement that is required as a baseline: Please start by linking to the exact Rust source and build instructions that you are working from, including the patches you are applying, or where they would be if they existed. Make sure they can be accessed without authentication, instead of behind a JIRA tracker that requires logging in to read. You see, in order for us to determine whether the issue is in rustc's source code, we have to be able to read your patches. At least two distributions have patched rustc's source and then reported issues caused or aggravated by the patches. This is quite unfortunate, as it reduces our confidence that updating our source code will actually fix issues in distributed rustcs, as distros may prefer to simply continue patching out the fixed code instead, leaving the problem behind. |
On the openSUSE rust side, atm we have exactly one tiny downstream patch "to ignore GCC incompatible flag" The https://github.com/bmwiedemann/openSUSE/blob/master/packages/p/pop-launcher/pop-launcher.spec is also not heavily patched and it did vary between builds on 1-core-VMs and 4-core-VMs, which stopped with That contradicts "CGU partitioning is very deliberately designed to be deterministic", so I still think this bug is valid. |
Then, can you precisely describe the variance that you are seeing, with fresh builds using rust 1.80 and the current rust nightly, and what is different between which build, and if the result is deterministic for a given rustc between VMs that have the same amount of cores? So that we at least have eliminated that much possibility? Can you replicate this variance without using both the Just build system and cargo? We have no special knowledge of Just and it is possible it is doing things that disturb this fact. Can you replicate the variance with a simpler program that does not require |
Judging by how the LLVM issue has played out, it seems that some of the variance that is being described in the LLVM issue is a result of the source code that you are packaging, which has included nondeterminism in its build: llvm/llvm-project#72206 (comment) This has little to do with the Rust toolchain. If you call |
The degree to which this remains a concern is probably a complete duplicate with #126976 |
Looking at Instagram/LibCST#1213 it seems there is still work left to do in rust or llvm. That was with rust 1.81.0 |
I found that the issue in pop-launcher and python-libcst is not with codegen-units, but these become reproducible with Edit: they even build reproducibly without any changes now with rust-1.81.0 Edit2: they still vary, but only sometimes. |
While working on reproducible builds for openSUSE (sponsored by the NLnet NGI0 fund), I found that
compiling various rust applications with the default of codegen-units=16, it produced nondeterministic variations in resulting binaries. e.g.
pop-os/launcher#230
IMHO, the default settings should be made to behave deterministically. Either make codegen-units=1 the default or ensure that codegen-units=16 can produce deterministic results.
In the unlikely event that this is not possible, it should clearly be documented how to achieve reproducible builds, not only in
https://doc.rust-lang.org/rustc/codegen-options/index.html#codegen-units
This is about rust1.79 but started much earlier. At least with 1.73
The text was updated successfully, but these errors were encountered: