repr(simd) is unsound #44367
Comments
Two thoughts I've had in the past how to fix this:
Given that |
It's worth noting that, when talking about LLVM features (which Rust target features currently map directly to), this also affects floats (which are stable). I.e., on x86 with current safe Rust, if you compile one crate with --target-feature=+soft-float and one without, you have an issue. This can also be solved as Alex mentions, though. |
@alexcrichton I would prefer to start with a hard error, and if we need it, add a way to opt in to the shim generation (*).
The only thing that concerns me about the hard error is that we will probably need to emit it during monomorphization. I think that this is not acceptable, and we should only do this if it is either temporary or there is no other way. @eddyb pointed out that stable Rust currently has no monomorphization-time errors, so this solution might block stabilization.
@alexcrichton you mentioned that this would mean that SIMD types must then be banned from FFI because we don't know the calling convention of the caller. Could you elaborate on this? As I see it, FFI is already unsafe, so it would be up to the user to make sure that the callee is using the appropriate calling convention.
(*) I haven't thought this through, but I imagine getting a hard error for a particular call site, and then wanting to opt in to the shim generation for that particular call site only. Anyway, we don't need to think this all the way through now. |
I'd personally be totally OK with a hard error, but yes, I think we'd have to do it during monomorphization. It's true that we don't have many monomorphization errors today, but I don't think we have absolutely zero, and I'd personally also think that we should at least get to a workable state and try it out to evaluate before possible stabilization. I, personally again, would most likely be fine stabilizing with monomorphization errors.
Oh right yeah! Right now we've got a lint in the compiler for "this type is unsafe in FFI", and for example it lints about bare structs that are not @parched I don't believe we're considering a |
@alexcrichton I've recently reviewed all non-ICE errors from |
While
The above code actually crashes in LLVM on the playground, because that's x86_64 where SSE2 is always present. However, on a 32 bit x86 target, |
How about we make the target spec define the ABI regardless of what extra features are enabled or disabled by attributes or the commandline. That would avoid the need for shims and monomorphization errors. So for example on x86_64 which has 128-bit vectors by default:
|
@rkruppe Thanks for that example. Two small comments:
I personally wouldn't like to have to re-open this topic in the future when people start filing bugs due to random segfaults because some crate in the middle of their dependency graph decided that it was a good idea to add |
@parched some Ideally, if I have an SSE dynamic library that exposes some functions on its ABI for SSE...AVX2, I would like to be able to add some new AVX3/4 functions to its interface, recompile, and produce a library that is ABI compatible with the old one, so that all my old clients can continue to work as is by linking to the new library, but newer code is able to call the new AVX3/4 functions. That is, adding those new AVX3/4 functions should not break the ABI of my dynamic library as long as my global target is still |
@gnzlbg yes 512-bit and 1024-bit vectors would have to be treated the same way but I don't believe adding more would be an issue.
For that case you would just have to make your new functions |
@parched I think I misunderstood your comment then.
Do you think that
@alexcrichton @BurntSushi I've slept on this a bit, and I think the following is a common idiom that we need to enable:
#[target_feature = "sse"]
fn foo(v: f32x8) -> f32x8 {
// f32x8 has SSE ABI here
let u = if std::host_feature(AVX) {
foo_avx(v) // mismatched ABI: hard error (argument)
// mismatched ABI: hard error (return type)
} else {
/* SSE code */
}
/* do something with u */
u
}
#[target_feature = "avx"]
fn foo_avx(arg: f32x8) -> f32x8 { ... }
Here we have some mismatching ABIs. I am still fine with making these mismatching ABIs hard errors as long as there is an opt-in way to make this idiom work. What do you think about using
#[target_feature = "sse"]
fn foo(v: f32x8) -> f32x8 {
// f32x8 has SSE ABI here
let u = if std::host_feature(AVX) {
// foo_avx(v) // ERROR: mismatched ABIs (2x arg and ret type)
// foo_avx(v as f32x8) // ERROR: mismatched ABIs (1x ret type)
foo_avx(v as f32x8) as f32x8 // OK
} else {
/* SSE code */
}
/* do something with u */
u
}
That is, an
Do you think we can extend this to make function pointers work?:
#[target_feature = "+sse"] fn foo(f32x8) -> f32x8;
static mut foo_ptr: fn(f32x8) -> f32x8 = foo;
unsafe {
// foo_ptr = foo_avx; // ERROR: mismatched ABI
foo_ptr = foo_avx as fn(f32x8) -> f32x8; // OK
}
// assert_eq!(foo_ptr, foo_avx); // ERROR: mismatched ABIs
assert_eq!(foo_ptr, foo_avx as fn(f32x8) -> f32x8); // OK
I was thinking that in this case, I think that pursuing this would require us to track the ABI of I think that if we can lift these errors to type-checking:
Thoughts? EDIT: even if we never stabilize EDIT2: That is, this issue would be resolved by making the original example fail with a type error, and adding the |
For the record, that's not true, one just needs a target that doesn't have SSE enabled by default (or defaults to soft-float), such as the (tier 2) I do agree that we should find a proper solution right now, especially since the "cheap fixes" that I'm aware of (monomorphization-time error, or strongarm the ABIs into being compatible by explicitly passing problematic types on the stack) permit code that probably wouldn't work unmodified under a more principled solution. Unfortunately I don't have the time to dive into solutions right now, so I can't contribute anything but nagging at the moment |
@gnzlbg specifically you think that passing arguments like I'm also not sure we can ever get function pointers to "truly work" unless we declare "one true ABI" for these types, otherwise we have no idea what the actual abi of the function pointer is. (this bit about function pointers is pushing me quite a bit into the camp of "just declare everything unsafe and document why") |
Of course it happens, see this SO question, but users get warnings and undefined behavior pretty quickly and learn to work around this (that is, "don't do that", pass a An important point is that this can only happen when you have ABI incompatible vector types. That is, if you are using from SSE to SSE4.2, then you never run into these issues, because they are introduced by AVX which is relatively recent, and by AVX512 which is very rare (EDIT: on ARM you only have NEON so this does not happen, and the new SVE completely works around this issue).
Why do we need one true ABI for these types? For example:
#[target_feature = "+sse"]
fn foo() {
let a: fn(f32x8) -> f32x8; // has type fn(f32x8["sse"]) -> f32x8["sse"]
}
#[target_feature = "+avx"]
fn bar() {
let a: fn(f32x8) -> f32x8; // has type fn(f32x8["avx"]) -> f32x8["avx"]
}
static a: fn(f32x8) -> f32x8; // has type fn(f32x8["CRATE"]) -> f32x8["CRATE"]
// where CRATE is replaced with whatever feature the crate is compiled with
That is, two function pointers, compiled on different crates, or functions, with different features, would just be different types and generate a type error. |
Another workaround in C is to do something like this: First we need a way to merge two 128-bit registers into a 256-bit register (or a "no op" in SSE):
#[target_feature = "+sse"]
fn merge_sse(x: (f32x4, f32x4)) -> f32x8; // no op?
#[target_feature = "+avx"]
fn merge_avx(x: (f32x4, f32x4)) -> f32x8;
// ^^^^ copy 2x128bit registers to 1x256register
then we need its inverse, that is, a function that takes a 256-bit value (or two in SSE) and returns two 128-bit registers:
#[target_feature = "+sse"]
fn split_sse(f32x8) -> (f32x4, f32x4); // no op?
#[target_feature = "+avx"]
fn split_avx(f32x8) -> (f32x4, f32x4);
// ^^^^ copy the parts of a 256bit register into 2x128bit registers
then we add some macros to communicate
macro_rules! from_sse_to_avx { ($x:expr) => (merge_avx(split_sse($x))) }
macro_rules! from_avx_to_sse { ($x:expr) => (merge_sse(split_avx($x))) }
macro_rules! from_sse_to_avx_and_back {
($f:expr, $x:expr) => (from_avx_to_sse!($f(from_sse_to_avx!($x))))
}
and then we can safely write the code above as:
#[target_feature = "sse"]
fn foo(v: f32x8) -> f32x8 {
// f32x8 has SSE ABI here
let u = if std::host_feature(AVX) {
// foo_avx(v) // mismatched ABI: hard error (argument)
from_sse_to_avx_and_back!(foo_avx, v) // OK
} else {
/* SSE code */
}
/* do something with u */
u
}
#[target_feature = "avx"]
fn foo_avx(arg: f32x8) -> f32x8 { ... } |
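For what it's worth, the split/merge round trip described above can be sketched on stable Rust without any SIMD types. This is only an illustration under assumptions: `F32x4`, `F32x8`, `split`, `merge` and `double_via_halves` are hypothetical names, plain arrays stand in for the vector types, and nothing here touches real registers or target features. It only shows that handing a 256-bit value across a boundary as two 128-bit halves is lossless.

```rust
// Hypothetical stand-ins for the thread's f32x4 / f32x8 vector types.
// Plain arrays are used so the sketch builds on stable Rust.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4(pub [f32; 4]);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x8(pub [f32; 8]);

/// Split a 256-bit value into its low and high 128-bit halves.
pub fn split(v: F32x8) -> (F32x4, F32x4) {
    let mut lo = [0.0f32; 4];
    let mut hi = [0.0f32; 4];
    lo.copy_from_slice(&v.0[..4]);
    hi.copy_from_slice(&v.0[4..]);
    (F32x4(lo), F32x4(hi))
}

/// Merge two 128-bit halves back into one 256-bit value.
pub fn merge(halves: (F32x4, F32x4)) -> F32x8 {
    let mut out = [0.0f32; 8];
    out[..4].copy_from_slice(&(halves.0).0);
    out[4..].copy_from_slice(&(halves.1).0);
    F32x8(out)
}

/// Stand-in for `foo_avx`: only 128-bit halves cross the call boundary;
/// the full-width value exists only inside the function body.
pub fn double_via_halves(halves: (F32x4, F32x4)) -> (F32x4, F32x4) {
    let mut v = merge(halves);
    for x in v.0.iter_mut() {
        *x *= 2.0;
    }
    split(v)
}
```

A caller would write `merge(double_via_halves(split(v)))`, which is the shape the `from_sse_to_avx_and_back!` macro above expands to.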
It could, but if you did that you wouldn't be able to call that function from another without |
@gnzlbg everything you're saying seems plausible? You're thinking that passing types like |
@alexcrichton the
What exactly do you propose to call/make After exploring all these options, I'd like to propose a path forward.
@eddyb said above that stable Rust has zero monomorphization time errors. I don't think that introducing one will cut it for stabilization. To produce a type-checking error, we need to "somehow" propagate the Once we are there users that want to convert between incompatible ABIs can do so with this idiom, which can also be extended to make function pointers work. We could provide procedural macros in a crate that do this automatically, and if that becomes a pain point in practice we could re-evaluate adding language support for that (e.g. something along the lines of the (*) If I understand this correctly, we would need to require that all |
@gnzlbg what I mean is that yes, I personally think that at this point it's not worth trying to push this back into typechecking. That sounds like quite a lot of work for not necessarily a lot of gain. Additionally I'd be worried that it'd expose hidden costs and/or complexities that are very difficult to get right. For example if we had:
static mut FOO: fn(u8x32) = default;
#[target_feature = "+avx2"]
unsafe fn bar() {
FOO = foo;
}
#[target_feature = "+avx2"]
unsafe fn foo(a: u8x32) {
}
unsafe fn default(a: u8x32) {
}
fn main() {
bar();
FOO(Default::default());
}
How do we rationalize that? Does the Note that this doesn't even start to touch on trait methods. For example how do we also rationalize These just seem like really difficult questions to answer and I'm not personally sold on there being any real benefit to going to all this effort to statically verify these things. All of SIMD is unsafe anyway now, so trying to add static guarantees on top of something that we've already declared as fundamentally unsafe would be nice but to me doesn't seem necessary. |
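One way to sidestep the function-pointer question in the example above is to keep vector types out of the pointer's signature altogether. The following is a rough sketch, not how rustc handles it: `SumFn`, `sum_default`, `sum_avx2`, `select_sum` and `sum_bytes` are made-up names, and the 32 bytes are passed behind a reference so the pointer's ABI should not depend on which target features either side was compiled with.

```rust
/// Pointer type whose signature contains no vector type passed by value:
/// the 32 bytes travel behind a reference.
type SumFn = unsafe fn(&[u8; 32]) -> u32;

/// Baseline version; uses no extra target features.
unsafe fn sum_default(a: &[u8; 32]) -> u32 {
    a.iter().map(|&b| u32::from(b)).sum()
}

/// AVX2 version; the compiler is free to vectorise this body with AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(a: &[u8; 32]) -> u32 {
    a.iter().map(|&b| u32::from(b)).sum()
}

/// Pick an implementation once, based on what the running CPU supports.
pub fn select_sum() -> SumFn {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return sum_avx2;
        }
    }
    sum_default
}

/// Safe wrapper around the selected pointer.
pub fn sum_bytes(a: &[u8; 32]) -> u32 {
    let f = select_sum();
    // SAFETY: `select_sum` only returns `sum_avx2` after confirming AVX2
    // support at run time; `sum_default` has no extra requirements.
    unsafe { f(a) }
}
```

The cost is that the caller has to spill the value to memory first, which is the trade-off the rest of the thread weighs.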
I'll try to explain what I am proposing better because I think we are misunderstanding each other. First, we need to declare the
#[repr(simd)]
struct u8x32(u8, u8, ...);
but what I am proposing is not to make
#[repr(simd)]
struct<ABI> u8x32(u8, u8, ...);
When the user uses a type
CRATE_ABI = /* from target features of the crate */;
Now we proceed with the example. First, the user writes:
static mut FOO: fn(u8x32) = default; // OK, compiles
fn default(a: u8x32) { }
That compiles and type checks, because implicitly, the code looks like this:
static mut FOO: fn(u8x32<CRATE_ABI>) = default; // OK, compiles
fn default(a: u8x32<CRATE_ABI>) { }
So the types unify just fine. Note that Now let's get a bit messier. The user writes:
#[target_feature = "+avx2"] unsafe fn foo(a: u8x32) {}
#[target_feature = "+avx2"]
unsafe fn bar() {
FOO = foo;
}
but what this does is the following:
#[target_feature = "+avx2"] unsafe fn foo(a: u8x32<AVX2_ABI>) {}
#[target_feature = "+avx2"]
unsafe fn bar() {
FOO = foo; // OK or Error?
}
So is this code OK or is it an error? From the information provided, we cannot say. It depends on what the So at this point we are ready to move to trait methods. What should this do?
fn main() {
FOO(Default::default());
Well, the same thing it does for any other type. It is just type-checking at work. If it can unify the type parameters then everything is OK, and otherwise, it does not compile. Obviously we need to nail the ABI types so that code only compiles when it is safe, and breaks otherwise.
I hope this has become clear, but just to be crystal clear: we never produce shims, either the ABI matches, or it doesn't. The users can manually write the shims if they need to by using this idiom.
I hope this has become clear.
I think I am misunderstanding what you mean here. Right now, Also, how are you exactly proposing to make |
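To make the "ABI as a type parameter" idea a bit more concrete, here is a toy model in today's Rust. It is only a sketch of the proposal's flavour, not of any real compiler feature: `SseAbi`, `Avx2Abi`, `U8x32<Abi>`, `with_abi`, `takes_avx2` and `caller_with_sse_abi` are all hypothetical, the ABI tag is written out by hand instead of being inferred from target features, and the "vector" is a plain array.

```rust
use std::marker::PhantomData;

/// Hypothetical ABI markers; the proposal would derive these from the
/// target features in effect, here they are spelled out manually.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SseAbi;
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Avx2Abi;

/// A 32-lane byte vector tagged with the ABI it would be passed with.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U8x32<Abi> {
    lanes: [u8; 32],
    _abi: PhantomData<Abi>,
}

impl<Abi> U8x32<Abi> {
    pub fn splat(x: u8) -> Self {
        U8x32 { lanes: [x; 32], _abi: PhantomData }
    }

    /// Explicit, user-written conversion between ABIs. No shim is
    /// generated by the compiler; the user asks for the conversion.
    pub fn with_abi<Other>(self) -> U8x32<Other> {
        U8x32 { lanes: self.lanes, _abi: PhantomData }
    }

    pub fn lanes(&self) -> [u8; 32] {
        self.lanes
    }
}

/// Stands in for a function compiled with AVX2 enabled.
pub fn takes_avx2(v: U8x32<Avx2Abi>) -> u32 {
    v.lanes.iter().map(|&b| u32::from(b)).sum()
}

/// Stands in for a caller compiled with only SSE enabled.
pub fn caller_with_sse_abi(v: U8x32<SseAbi>) -> u32 {
    // takes_avx2(v) // would not type-check: the ABI parameters differ
    takes_avx2(v.with_abi::<Avx2Abi>())
}
```

The commented-out call is the part that matters: the mismatch shows up as an ordinary type error, and the explicit `with_abi` call plays the role of the manual conversion idiom.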
@gnzlbg So if I understand correctly, you would extend the type system in a way that would be visible to users in type mismatches? Would this also apply to float types? |
To solve the -sse issue that would need to apply to floats as well.
|
Okay, thanks. My impression after only very little thought is that I agree with @alexcrichton that a proper automatic solution to the ABI problems seems pretty complicated. If functions tagged with Letting unsafe code do its thing would mean that a strict solution such as the one @gnzlbg proposed can't be introduced later. However, it would still be possible to start generating shims based on the caller's and callee's set of target features. |
I want to add, though: ABI mismatches are potentially very subtle and annoying bugs, so there should be a warning at least for the cases that can be easily detected statically. |
@gnzlbg I think that makes sense, yeah, but I can't really comment on whether I'd think it's feasible or not. It sounds pretty complicated and subject to subtly nasty interactions; my worry would be that we'd spend all our effort chasing a long tail of bugs to make this airtight. Is this really a common enough idiom to warrant the need to provide a static error instead of discovering this at runtime?
I haven't thought too hard about this, admittedly. If we expose APIs like |
@alexcrichton Why would Edit: On second thought, since the functions that use And of course, there's the issue that |
FYI the shims are required to make
Here, since |
SIMD code tends to use |
Won't |
@glaebhoerl Whether
Adding shims adds more code, which increases compile time; the question is by how much? Most applications I know of have only a tiny fraction of explicit SIMD code, so this might not even be measurable. Also, all applications working properly today probably won't need shims (otherwise they wouldn't be working correctly), so I wouldn't worry about this initially beyond checking that compile times don't explode for the crates currently using SIMD.
Whether execution times will be affected is very hard to tell. Applications have little explicit SIMD code, but that code is typically in a hot spot. As a thought experiment, we can consider the worst case: Beyond the worst case things get better though: if an application is just executing a single SIMD instruction in its hot loop, e.g., taking two arguments, and the shims aren't removed, then it might go from 1 cycle to 4 cycles. And if it is doing something more complex then the cost of the shims quickly becomes irrelevant. In all of these cases:
If binary size/execution speed/compile time turns out to be a problem for debug builds I'd say let's worry about that when we get there. |
I've been thinking about this again recently with an eye towards hoping to push SIMD over the finish line towards stabilization. Historically I've been a proponent of adding "shims" to solve this problem at compile time. These shims would cause any ABI mismatch to get resolved by transferring arguments through memory instead of registers.
As I think more and more about the shims, however, I'm coming round to the conclusion that they're overly difficult (and I'm not sure if even possible) to implement. Dealing with function pointers especially is where I feel like things get super tricky.
Along those lines I've been reconsidering another implementation strategy, which is to always pass arguments via memory instead of by value. In other words, let's say you write:
fn foo(a: u8x32) { ... }
Today we'd generate something along the lines of (LLVM-wise)
define void @foo(<32 x i8>) {
...
}
whereas instead what I think we should generate is:
define void @foo(<32 x i8>*) { ; note the *
...
}
Or in other words, SIMD values are unconditionally passed through memory between all functions. This would, I think, be much easier to implement and also jive much more nicely with the implementation of everything else in rustc today.
I've historically been opposed to this approach, thinking that it would be bad for performance, but when thinking about it I actually don't think there's going to be that much impact. In general I'm under the impression that SIMD code is primarily about optimizing hot loops, and in these sorts of situations if you have a literal function call that's already killing performance anyway. In that sense we're already inlining everything enough to remove the layer of indirection by storing values on the stack.
If that's true, I actually don't think that if we leave a AFAIK the main trickiness around this would be that Rust functions would pass all the vector types via memory, but we'd need a way to pass them by value to various intrinsic functions in LLVM.
In general though, what do others think about an always-memory approach? |
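As a source-level analogy of the always-memory idea, something like the following can be written by hand today. This is a sketch under assumptions, not what the compiler change itself does: `add_scalar`, `add_avx` and `add_f32x8` are hypothetical names, and the vectors only ever cross function boundaries behind references, with the AVX register type confined to the body of the feature-gated function. The intrinsics used (`_mm256_loadu_ps`, `_mm256_add_ps`, `_mm256_storeu_ps`) are the ones from `std::arch::x86_64`.

```rust
/// Portable fallback: plain element-wise addition.
fn add_scalar(a: &[f32; 8], b: &[f32; 8], out: &mut [f32; 8]) {
    for i in 0..8 {
        out[i] = a[i] + b[i];
    }
}

/// AVX version. Inputs and output cross the call boundary through memory;
/// the 256-bit register type only appears inside the body.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn add_avx(a: &[f32; 8], b: &[f32; 8], out: &mut [f32; 8]) {
    use std::arch::x86_64::{_mm256_add_ps, _mm256_loadu_ps, _mm256_storeu_ps};
    // SAFETY: each reference points at eight valid f32 values, and the
    // unaligned load/store intrinsics have no alignment requirement.
    unsafe {
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(va, vb));
    }
}

/// Caller compiled without AVX: dispatches at run time.
pub fn add_f32x8(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    let mut out = [0.0f32; 8];
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was just confirmed at run time.
            unsafe { add_avx(a, b, &mut out) };
            return out;
        }
    }
    add_scalar(a, b, &mut out);
    out
}
```

Because no vector value is passed or returned by value, caller and callee cannot disagree about which registers hold it, whatever features each was compiled with. Whether the extra loads and stores matter in practice depends on inlining, as discussed above.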
The intrinsics don't have the I think this approach is the easiest out of all possible ones, since all you need to change is: rust/src/librustc_trans/abi.rs Lines 872 to 875 in 247835a
There's two ways to do it:
|
This commit changes the ABI of SIMD types in the "Rust" ABI to unconditionally be passed via pointers instead of being passed as immediates. This should fix a longstanding issue, rust-lang#44367, where SIMD-using programs ended up showing very odd behavior at runtime because the ABI between functions was mismatched.
As a bit of a recap, this is sort of an LLVM bug and sort of an LLVM feature (today's behavior). LLVM will generate code for a function solely looking at the function it's generating, including calls to other functions. Let's then say you've got something that looks like:
```llvm
define void @foo() { ; no target features enabled
  call void @bar(<4 x i64> zeroinitializer)
  ret void
}
define void @bar(<4 x i64>) #0 { ; enables the AVX feature
  ...
}
```
LLVM will codegen the call to `bar` *without* using AVX registers because `foo` doesn't have access to these registers. Instead it's generated with emulation that uses two 128-bit registers. The `bar` function, on the other hand, will expect its argument in an AVX register (as it has AVX enabled). This means we've got a codegen problem!
Comments on rust-lang#44367 have some more contextual information, but the crux of the issue is that if we want SIMD to work in general we'll need to ensure that whenever a function calls another the ABI of the arguments being passed is in agreement.
One possible solution to this would be to insert "shim functions": whenever a `target_feature` mismatch is detected the compiler inserts a shim function, you pass arguments via memory to the shim, and then the shim loads the values and calls the target function (where the shim and the target have the same target features enabled). This unfortunately is quite nontrivial to implement in rustc today (especially when accounting for function pointers and such).
This commit takes a different solution, *always* passing SIMD arguments through memory instead of passing as immediates. This strategy solves the problem at the LLVM layer because the ABI between two functions never uses SIMD registers. This also shouldn't be a hit to performance because SIMD performance is thought to often rely on inlining anyway, where a `call` instruction, even if using SIMD registers, would be disastrous to performance regardless. LLVM should then be more than capable of fixing all our memory usage to use registers instead after enough inlining has been performed.
Note that there are a few caveats to this commit though:
* The "platform intrinsic" ABI is omitted from "always pass via memory". This ABI is used to define intrinsics like `simd_shuffle4` where LLVM and rustc need to have the arguments as an immediate.
* Additionally this commit does *not* fix the `extern` ("C") ABI. This means that the bug in rust-lang#44367 can still happen when using non-Rust-ABI functions. My hope is that before stabilization we can ban and/or warn about SIMD types in these functions (as AFAIK there's not much motivation to belong there anyway), but I'll leave that for a later commit and if this is merged I'll file a follow-up issue.
All in all this... Closes rust-lang#44367
rustc: Fix (again) simd vectors by-val in ABI
The issue of passing around SIMD types as values between functions has seen [quite a lot] of [discussion], and although we thought [we fixed it][quite a lot] it [wasn't]! This PR is a change to rustc to, again, try to fix this issue.
The fundamental problem here remains the same: if a SIMD vector argument is passed by-value in LLVM's function type, then if the caller and callee disagree on target features a miscompile happens. We solve this by never passing SIMD vectors by-value, but LLVM will still thwart us with its argument promotion pass to promote by-ref SIMD arguments to by-val SIMD arguments.
This commit is an attempt to thwart LLVM thwarting us. We, just before codegen, will take yet another look at the LLVM module and demote any by-value SIMD arguments we see. This is a very manual attempt by us to ensure the codegen for a module keeps working, and it unfortunately is likely producing suboptimal code, even in release mode. The saving grace for this, in theory, is that if SIMD types are passed by-value across a boundary in release mode it's pretty unlikely to be performance sensitive (as it's already doing a load/store, and otherwise perf-sensitive bits should be inlined).
The implementation here is basically a big wad of C++. It was largely copied from LLVM's own argument promotion pass, only doing the reverse. In local testing this...
Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059
[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
The following should be discussed as part of an RFC for supporting portable vector types (`repr(simd)`) but the current behavior is unsound (playground):
Basically, those two objects of type `f32x8` have a different layout, so `foo` and `bar` have a different ABI / calling convention. This can be introduced without `target_feature`, by compiling two crates with different `--target-cpu`s and linking them, but `target_feature` was used here for simplicity.