InstCombine incorrectly transforms store i64 -> store double #44497
Citation needed?
I think the only target LLVM supports that has loads that canonicalize floats is 32-bit x86 (using non-SSE operations). How we want to deal with that case is sort of complicated. On the one hand, we don't want to forbid hoisting loads with float type. On the other hand, it's impossible for a function with a double return type to return an SNaN in the standard x86 calling convention, so we'd need some special rule for that. I'd prefer to say that we have to preserve SNaN patterns across non-arithmetic operations, and make the x86 backend deal with whatever complexity results from that.
I'm fairly sure I've seen InstCombine rewrites that take advantage of NaN bits being "don't care". That's why I implemented these semantics in Alive2.
This seems like a bad bug, and I've seen something like it trigger before in GCC (funnily enough, GCC miscompiled clang). It seems like this transform of i64 -> double shouldn't occur if the target canonicalizes NaNs. I don't think we need a particularly complex solution for this.
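To make the failure mode concrete, here is a hypothetical source-level sketch (not the test case from this report) of an i64 value being round-tripped through a double, as the rewritten load/store pair would do:

```cpp
// Hypothetical illustration (not the original test case): round-tripping an
// i64 bit pattern through a double, as the rewritten load/store pair would.
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    std::uint64_t in = 0x7FF0000000000001ULL;  // a signaling-NaN bit pattern
    double d;
    std::uint64_t out;

    std::memcpy(&d, &in, sizeof d);     // what the "load double" does
    std::memcpy(&out, &d, sizeof out);  // what the "store double" does

    // On targets whose FP moves are bit-preserving (e.g. SSE2), out == in.
    // If the copy is lowered to an x87 fld/fstp of a double, the FPU may
    // quiet the NaN, producing 0x7FF8000000000001 instead.
    std::printf("0x%016llx -> 0x%016llx\n",
                (unsigned long long)in, (unsigned long long)out);
    return 0;
}
```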
I guess there are two models you could reason about. The model I was thinking about works something like the following:
I guess there's an alternative model, like you're proposing:
Really, the semantic results of either rule are pretty similar. It's just a question of when the NaN pattern is fixed: when an instruction produces the value, or when a store/bitcast forces it to be fixed. It's probably worth noting that any rule that means a "load" is allowed to raise an FP exception is going to make strict FP support a lot more complicated.
I agree with your summary, Eli. I would just add that in semantics 1), if an operation produces a NaN when the operands are not NaN, it produces an unspecified NaN bit pattern. The main reason I prefer to consider NaN to have an unspecified bit pattern is that it makes it easier to support chips where a load of an integer is not the same as a load of a float. If this is not an issue, then we should document it and change Alive2 to match the semantics, and then remove any optimization that doesn't respect those semantics. Bottom line: is the optimization present in this bug report important, or can it be removed?
This particular optimization probably isn't that valuable on its own; I mean, we want to avoid bitcasts where we can, but there are other ways of addressing this particular situation on most targets. I have three concerns with making NaN values indeterminate in registers:
Ok, give me some time and I'll compile a list of optimizations that are broken if we assume that specific NaN patterns get propagated. |
Ok, I've implemented the semantics we discussed here in Alive2 for testing. This keeps all values in bit-vectors all the time, except when there's a float operation. Operands are converted from a bit-vector into float type, the operation is performed, and the result converted back to a bit-vector pattern. The result is that there is only 1 regression in the LLVM test suite. On the other hand, only 2 tests get fixed. The new test failure is this:
So the -1 bit pattern disappears. I'm not sure there's a way to distinguish a bit pattern that happens to represent a NaN from a float NaN (which we interpret as any bit pattern that represents NaN). Anyway, assuming we reach a conclusion on what to do with this test, the question is then about the backends. Are all backends OK with LLVM assuming that reading/writing data through float registers doesn't change the bits?
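For concreteness, here is a minimal sketch of that evaluation model (assumed code, not Alive2's actual implementation): values are carried as raw bits everywhere, and only a floating-point operation gives them float meaning.

```cpp
// Sketch of the "bit-vectors everywhere" model (assumed code, not Alive2):
// only an FP operation converts bits -> float -> bits; loads, stores and
// bitcasts are plain bit copies and can never alter a NaN payload.
#include <bit>
#include <cstdint>
#include <cstdio>

using Bits = std::uint64_t;  // a value with no inherent float identity

Bits fadd(Bits a, Bits b) {
    double fa = std::bit_cast<double>(a);  // bits -> float
    double fb = std::bit_cast<double>(b);
    return std::bit_cast<Bits>(fa + fb);   // the FP op, then float -> bits
}

Bits load(const Bits *p)    { return *p; }  // bit-exact
void store(Bits *p, Bits v) { *p = v; }     // bit-exact

int main() {
    Bits snan = 0x7FF0000000000001ULL;  // a signaling-NaN pattern
    Bits mem;
    store(&mem, snan);
    std::printf("round trip preserved: %d\n", load(&mem) == snan);  // always 1
    // Only fadd may change the pattern (e.g. the hardware quiets/canonicalizes
    // NaNs when it actually performs arithmetic).
    std::printf("after fadd: 0x%016llx\n", (unsigned long long)fadd(snan, snan));
    return 0;
}
```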
Probably under any model where floating-point "operations" are special, bitcast shouldn't count as a floating-point operation; otherwise, we lose the equivalence between bitcast and store+load. Again, I think the only in-tree target with non-bit-preserving load/store operations is 32-bit x86 (using x87 operations to load float/double values; oddly, long double load/store are bit-preserving).
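As a small illustration of that equivalence (hypothetical code, not from the test suite), a direct bit reinterpretation and one that goes through memory are expected to yield identical bits:

```cpp
// Hypothetical example: "bitcast" vs. "store + load" of the same bits.
// If bitcast were allowed to canonicalize NaNs, these two routes could
// diverge, breaking the equivalence mentioned above.
#include <bit>
#include <cassert>
#include <cstdint>
#include <cstring>

int main() {
    std::uint64_t snan_bits = 0x7FF0000000000001ULL;  // signaling-NaN pattern

    // Route 1: direct reinterpretation, no memory traffic ("bitcast").
    double via_bitcast = std::bit_cast<double>(snan_bits);

    // Route 2: the same reinterpretation through memory ("store + load").
    double via_memory;
    std::memcpy(&via_memory, &snan_bits, sizeof via_memory);

    // Compare the raw bits (comparing NaNs with == would always be false).
    assert(std::bit_cast<std::uint64_t>(via_bitcast) ==
           std::bit_cast<std::uint64_t>(via_memory));
    return 0;
}
```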
There's a related test failure in Transforms/InstCombine/minmax-fold.ll:
In the source there's only a bitcast between integers, while in the target there's a bitcast of int -> float -> int. If the bit pattern represents a NaN, then this round trip may change the bits (e.g., canonicalize the NaN).
In Rust, we are also struggling with what exactly LLVM's NaN semantics are: rust-lang/rust#73328. It would be good to get more precise documentation of those semantics -- though it seems that's still a work in progress, if I understand the discussion here correctly?
What was the semantics you implemented first, i.e., how is it different from the adjusted one you described later? Is it what Eli calls the "alternative model"? So, float/double LLVM variables actually have a different range of possible values than a memory location, and conversion happens on load/store/bitcast?
Note that this means that these operations are non-deterministic! So, duplicating them would be an illegal transformation, similar to "freeze". Is that something LLVM is treating properly? (In the "alternative model", likewise stores/bitcasts would be non-deterministic and must not be duplicated.)
I see 2 models:
Each has pros and cons, as usual. The first one is required if people care about such processors. The second one allows some optimizations like the ones shown in this bug report. |
(The store would be the problem with duplication here, not the load.) The second one seems to disallow, e.g., turning "float f = x+x; if ((int)f == (int)f)" into "if ((int)(x+x) == (int)(x+x))", as that would duplicate the non-determinism. This particular transformation likely makes little sense, but there might be other conditions under which recomputing a result could be beneficial.
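A sketch of that concern (hypothetical code; it observes the bits via an invented `bits_of` helper rather than the `(int)` casts quoted above, since converting a NaN to int is itself undefined behavior in C):

```cpp
// Hypothetical illustration: why duplicating a NaN-producing computation is
// suspect if each evaluation may pick its NaN bit pattern independently.
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <limits>

// Invented helper: observe the raw bits of a float.
static std::uint32_t bits_of(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

int main() {
    float x = std::numeric_limits<float>::infinity();

    // Original: the NaN is produced once, so its bit pattern (whatever the
    // hardware picked) is fixed and the comparison is always true.
    float f = x - x;  // inf - inf = NaN
    bool original = (bits_of(f) == bits_of(f));

    // Rewritten: the computation is duplicated. Under a model where each
    // NaN-producing operation yields an unspecified payload, the two
    // evaluations need not agree, so this is no longer guaranteed.
    bool rewritten = (bits_of(x - x) == bits_of(x - x));

    std::printf("%d %d\n", original, rewritten);
    return 0;
}
```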
This issue is causing soundness problems for Rust today: rust-lang/rust#114479 (comment). That's an example with no NaNs and no unsafe, which results in a segfault. |
@llvm/issue-subscribers-backend-x86 Author: Nuno Lopes (nunoplopes)
| | |
| --- | --- |
| Bugzilla Link | [45152](https://llvm.org/bz45152) |
| Version | trunk |
| OS | All |
| CC | @DMG862,@efriedma-quic,@ecnelises,@aqjune,@LebedevRI,@jfbastien,@RKSimon,@nikic,@RalfJung,@programmerjake,@regehr,@rotateright,@yuanfang-chen |
Extended Description
The unit test "test/Transforms/InstCombine/bitcast-phi-uselistorder.ll" shows an incorrect transformation from load+store i64 into load/store double. These are not equivalent because NaN values can be canonicalized by the CPU, so the store double can write a different bit pattern than the store i64.
Alive2's counterexample:
Marking this as an X86 backend issue, as I believe the consensus is that load/store operations at the IR level are value-preserving (that is, they do not count as "floating-point operations"), and the fact that this is not true for x87 needs to be mitigated there (together with the whole host of other problems with a similar root cause).
I think the M68k backend will probably have similar issues when they start adding hardware floating-point support, since IIRC float/double loads/stores also convert from/to an internal 80-bit format. Bug for adding FP support: #61744. @llvm/issue-subscribers-backend-m68k
The manual at https://cache.nxp.com/docs/en/reference-manual/M68000PM.pdf says the following on page 3-26:
That is, setting the precision control on the m68k FPU actually makes results comply with the appropriate IEEE 754 format despite the registers' 80-bit capacity, unlike the x87 FPU.
EDIT 2: Quoth the manual regarding the FMOVE instruction:
And later on it says "Refer to 1.6.5 Not-A-Numbers" about its SNAN behavior. So it quiets signaling NaNs? EDIT 3: This is confusing:
I'm assuming it's only talking about quiet NaNs, but who knows...
Assuming qemu is correct (which is not necessarily the case), compiling and running the following program with m68k-linux-gnu-g++ makes me think that m68k actually preserves sNaN bits when using fmoved, which is a nice surprise:

```cpp
#include <stdint.h>
#include <stdio.h>

union U {
  double f;
  uint64_t i;
};

// noinline keeps the round trip from being folded away at compile time.
double g(const uint64_t &v) __attribute__((noinline));

// Reinterprets the integer bits as a double (in g), returns that double, and
// reinterprets it back to bits, so any canonicalization by the FPU along the
// way would show up in the output.
uint64_t f(uint64_t v) {
  return (U){.f = g(v)}.i;
}

double g(const uint64_t &v) {
  return (U){.i = v}.f;
}

int main() {
  uint64_t v = 0xFFFF0000abcdef01;  // a NaN bit pattern with a distinctive payload
  printf("0x%llx 0x%llx\n", (unsigned long long)f(v), (unsigned long long)v);
}
```

it prints:
disassembly of
@nunoplopes I think it's wrong for a load/store to alter the value; Rust relies on that not happening. I also don't see anything in the LangRef docs that would say it's allowed for load/store to alter the value.