Workaround a miscompilation in clang 12 (XCode 13) #2803
Force-pushed from e0212ca to b0182cc
Codecov Report
@@            Coverage Diff            @@
##           master    #2803    +/-   ##
==========================================
- Coverage   92.33%   92.32%   -0.01%
==========================================
  Files         568      568
  Lines       63091    63098      +7
  Branches     6179     6178      -1
==========================================
+ Hits        58255    58256      +1
- Misses       4804     4810      +6
  Partials       32       32
… fixes the problem.
Force-pushed from b0182cc to cb078d8
There should be a comment about this, as there is for the SHA-2 BMI2 versions. Basically, we just compile the same code twice, once with BMI2 enabled, and that is sufficient for GCC/Clang to do the right thing.
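To illustrate the "compile the same code twice" idea, here is a minimal sketch. It assumes a GCC/Clang function-level target attribute instead of a separate source file with its own compiler flags, so it is not necessarily how this project wires it up. All `sketch_*` and `SKETCH_*` names are hypothetical. The shared body is written once; the BMI2 wrapper merely permits the compiler to select BMI2 instructions (RORX for the rotate, for example) if it considers them profitable.

```cpp
#include <cstdint>

// Hypothetical helper macros: enable the BMI2 copy only on x86 with GCC/Clang.
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
  #define SKETCH_HAS_BMI2_BUILD 1
  #define SKETCH_FUNC_BMI2 __attribute__((target("bmi2")))
#else
  #define SKETCH_HAS_BMI2_BUILD 0
  #define SKETCH_FUNC_BMI2
#endif

#if defined(__GNUC__) || defined(__clang__)
  #define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
  #define SKETCH_ALWAYS_INLINE inline
#endif

// Shared body, written once. Forcing inlining means each wrapper below
// gets its own copy, compiled under that wrapper's instruction set.
static SKETCH_ALWAYS_INLINE std::uint64_t sketch_mix_body(std::uint64_t x, std::uint64_t y) {
  const std::uint64_t rotated = (x << 13) | (x >> (64 - 13));  // rotate left by 13
  return rotated ^ y;
}

// Copy 1: baseline instruction set.
std::uint64_t sketch_mix_generic(std::uint64_t x, std::uint64_t y) {
  return sketch_mix_body(x, y);
}

#if SKETCH_HAS_BMI2_BUILD
// Copy 2: identical source, but the compiler may use BMI2 instructions here.
SKETCH_FUNC_BMI2 std::uint64_t sketch_mix_bmi2(std::uint64_t x, std::uint64_t y) {
  return sketch_mix_body(x, y);
}
#endif

// Runtime dispatch: only call the BMI2 copy if the CPU reports support.
std::uint64_t sketch_mix(std::uint64_t x, std::uint64_t y) {
#if SKETCH_HAS_BMI2_BUILD
  if (__builtin_cpu_supports("bmi2")) {
    return sketch_mix_bmi2(x, y);
  }
#endif
  return sketch_mix_generic(x, y);
}
```

Both copies must return identical results for every input; only the generated machine code may differ. That is why the two source files can look exactly the same.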
First: thanks for tracking down this issue as well as reporting it upstream to LLVM. Compiler bugs like this are never fun. I am not so happy about adding this complicated mess, but unfortunately I cannot find a cleaner way outside of the very big hammer of
In the context of the LLVM ticket, they found that Nevertheless, do we need this backported for 2.18.2 as well? I suppose people on the latest macOS building Botan via Conan would be grateful... Other clang 12 users as well, of course.
@reneme Yes, we should backport this. Can you create a PR for this? I'd like to include this in 2.18.2 if possible.
This works around the issue investigated in #2802. For optimization levels -O2 and higher, clang 12 (and XCode 13) miscompile the SHA-3 implementation. Anecdotal evidence suggests that this happens only on macOS.

The workaround pulls the initial XOR operations in SHA3_round() into an extra function and blocks inlining for the affected compilers. Without the __attribute__((noinline)), the issue would persist despite the refactoring. Because the attribute is applied only for the affected compilers, I hope that other compilers won't see a performance penalty due to this workaround.

We cannot rule out that other platforms suffer from the same (presumed) clang bug, so I suggest applying the workaround based solely on the clang version and not the platform.

Side note: To me, the implementation in sha3_bmi2.cpp looks exactly equal to the standard implementation. Am I missing something?
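For readers who want to see the shape of the workaround described above, here is a minimal sketch, not the actual patch. The names `SKETCH_MAYBE_INLINE`, `SketchColumnParity` and `sketch_xor_columns` are hypothetical. The version check is an assumption: it treats upstream clang 12 and Apple's clang 13 (the version number the Xcode 13 toolchain is assumed to report) as affected, regardless of platform.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: upstream clang 12 and Apple clang 13 (Xcode 13) are affected.
// The check is keyed on the compiler version only, not on the platform.
#if defined(__clang__) && \
    ((defined(__apple_build_version__) && __clang_major__ == 13) || \
     (!defined(__apple_build_version__) && __clang_major__ == 12))
  #define SKETCH_MAYBE_INLINE __attribute__((noinline))
#else
  #define SKETCH_MAYBE_INLINE inline
#endif

// Five column parities of the 5x5 Keccak state.
struct SketchColumnParity {
  std::uint64_t c[5];
};

// The initial XORs of a round, pulled out into their own function.
// On affected compilers this function is never inlined into the caller;
// everywhere else it remains an ordinary inline function.
static SKETCH_MAYBE_INLINE SketchColumnParity sketch_xor_columns(const std::uint64_t A[25]) {
  assert(A != nullptr);
  SketchColumnParity p;
  for (int x = 0; x < 5; ++x) {
    p.c[x] = A[x] ^ A[x + 5] ^ A[x + 10] ^ A[x + 15] ^ A[x + 20];
  }
  return p;
}
```

A round function would call `sketch_xor_columns(A)` first and continue with the remaining steps using the returned parities. Compilers outside the version check see an ordinary inline function and remain free to inline it.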