Aligning std::simd and Rust on Arm v7 Neon float behavior #439

workingjubilee · 2024-09-12T03:10:18Z

This is going to be a bit grisly: the Arm v7 Neon unit flushes subnormals to zero, and Rust has defined its float semantics so that flushing subnormals is not a valid behavior. If we want std::simd to align with scalar ops here, we will unfortunately have to kinda chuck the vector ops for non-integer operations.
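For readers unfamiliar with the term, here is a minimal scalar sketch of what "flushing subnormals" means. The `flush_to_zero` helper is hypothetical (it is not part of std or std::simd) and only emulates the hardware behavior in software; the asserts in `main` assume a host that does not itself flush.

```rust
/// Hypothetical helper (not part of std or std::simd): emulates in software
/// what a flush-to-zero unit does to a single value.
fn flush_to_zero(x: f32) -> f32 {
    if x.is_subnormal() {
        // Keep the sign, drop the magnitude.
        0.0f32.copysign(x)
    } else {
        x
    }
}

fn main() {
    // Half of the smallest positive normal f32 is a subnormal.
    let tiny = f32::MIN_POSITIVE / 2.0;
    assert!(tiny.is_subnormal());
    // Under IEEE 754 semantics it stays a distinct nonzero value...
    assert!(tiny > 0.0);
    // ...while a flushing unit would treat it as zero.
    assert_eq!(flush_to_zero(tiny), 0.0);
    // Normal values are unaffected.
    assert_eq!(flush_to_zero(1.5), 1.5);
    println!("ieee: {:e}, flushed: {:e}", tiny, flush_to_zero(tiny));
}
```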

Meta

rustc --version --verbose:

rustc 1.83.0-nightly (0ee7cb5e3 2024-09-10)
binary: rustc
commit-hash: 0ee7cb5e3633502d9a90a85c3c367eccd59a0aba
commit-date: 2024-09-10
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.0


RalfJung · 2024-09-14T07:23:01Z

This seems like basically the same issue as rust-lang/rust#129880, but might be worth tracking in this repo as well I guess?

I guess stdarch is also affected, but arguably there it is okay to expose the underlying hardware behavior... that is, assuming we don't get unsoundness due to llvm/llvm-project#89885.

workingjubilee · 2024-09-14T17:11:07Z

@RalfJung It has particular considerations for our API design, yes.

DemiMarie · 2024-11-10T01:59:51Z

I don’t think it makes sense to expect vector operations to have defined subnormal behavior. There is too much hardware where perfect IEEE conformance is either impossible or requires software support code. Making flushing subnormals to zero permissible behavior is the only approach that allows for predictable runtime performance and predictable lowering to target-specific assembly.

RalfJung · 2024-11-10T08:44:14Z

Unfortunately LLVM is unsound on hardware that flushes subnormals.

> predictable runtime performance and predictable lowering

And completely unpredictable runtime behavior. Great.

workingjubilee · 2024-11-10T10:26:46Z

@DemiMarie easily done, all it needs is a small fix in LLVM IR and SelectionDAG: llvm/llvm-project#30633

calebzulawski · 2024-11-10T16:10:08Z

Is it unpredictable because of reordering? Other than allowing ftz, I don't see what can be accomplished that doesn't make std::simd useless on armv7 or ppc.

RalfJung · 2024-11-10T16:13:41Z

It is unpredictable in the sense of giving different results on different targets, and (depending on what semantics LLVM implements once they properly support NEON on 32-bit ARM, which currently they do not) different optimization levels and different ways of writing the same code.
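To make the "different results" point concrete, here is a small scalar sketch. It is an emulation under stated assumptions, not a model of what LLVM or any particular target actually does: the `ftz` function and both `halve_then_double_*` names are hypothetical, and the flushing variant simply flushes every input and intermediate result.

```rust
/// Hypothetical model of a flushing unit: subnormal values become zero.
fn ftz(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// IEEE 754 evaluation, which is what Rust's scalar semantics specify.
fn halve_then_double_ieee(x: f32) -> f32 {
    (x * 0.5) * 2.0
}

/// Same expression with every input and intermediate passed through the
/// flushing model.
fn halve_then_double_flushing(x: f32) -> f32 {
    ftz(ftz(ftz(x) * 0.5) * 2.0)
}

fn main() {
    let x = f32::MIN_POSITIVE;
    // The intermediate x * 0.5 is subnormal, so the two models disagree.
    println!("ieee:     {:e}", halve_then_double_ieee(x));
    println!("flushing: {:e}", halve_then_double_flushing(x));
}
```

If some of these operations were evaluated with IEEE semantics (for example at compile time) and others with flushing semantics, the result would depend on which ones, which is one way the same source could behave differently across targets or optimization levels.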

calebzulawski · 2024-11-10T16:23:51Z

Considering these are old targets I'm not expecting a huge push to fix the backends, but would simply disallowing certain optimizations be sufficient? We do note in the std::simd docs that ftz will happen on some targets. We could e.g. expose a cfg value if necessary.
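As a rough illustration of how code might consume such a cfg value, here is a sketch. No such cfg exists today; `vector_floats_may_flush` is a hypothetical stand-in, and keying it on `target_arch = "arm"` alone is an assumption made purely for illustration.

```rust
/// Hypothetical predicate standing in for whatever cfg value might be
/// exposed. The target condition is an assumption for illustration only.
const fn vector_floats_may_flush() -> bool {
    cfg!(target_arch = "arm")
}

/// Picks a code path based on the hypothetical predicate.
fn describe() -> &'static str {
    if vector_floats_may_flush() {
        "vector float ops may flush subnormals on this target"
    } else {
        "vector float ops are assumed to keep subnormals on this target"
    }
}

fn main() {
    println!("{}", describe());
}
```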

RalfJung · 2024-11-10T16:26:44Z

I mean we could try to disable the scalar evolution pass and hope that this suffices. But that's far from a robust solution, so it's not really aligned with Rust's values IMO.

RalfJung · 2024-11-10T16:29:18Z

Anyway I think portable-simd has a lot of things to resolve before this becomes a pressing question. Right now, not even the core::arch operations are stable on ARM32.

DemiMarie · 2024-11-10T20:12:20Z

> Unfortunately LLVM is unsound on hardware that flushes subnormals.
>
> > predictable runtime performance and predictable lowering
>
> And completely unpredictable runtime behavior. Great.

This can be worked around by implementing the relevant intrinsics using LLVM inline assembly instead.

RalfJung · 2024-11-10T20:20:03Z

That would not achieve the "predictable runtime performance" part of your goals, as the optimizer would have to treat this like a black box.

And behavior would still be unpredictable in the sense of differing across architectures. So IMO it would also be reasonable to say that portable-simd is simply not supported on 32-bit ARM, and only provide core::arch primitives where people are hopefully aware of the semantic pitfalls.

But anyway as I said, we're likely years away from this being a high-priority question. First all of the rest of the portable-simd API needs to be worked out...

DemiMarie · 2024-11-10T20:25:05Z

> That would not achieve the "predictable runtime performance" part of your goals, as the optimizer would have to treat this like a black box.

Is the optimizer actually able to usefully reason about SIMD intrinsics anyway? The optimizer can (IIUC) be informed that the operations don’t access memory and can be elided if their result is not needed. My understanding is that SIMD programmers typically use the compiler as a glorified register allocator and so don’t particularly care about other optimizations. Is this accurate?

RalfJung · 2024-11-10T20:31:57Z

The simd_* intrinsics, which are used for everything in portable-simd, are fully understood by LLVM and can be optimized like scalar operations. I don't know how much that matters in practice, but const-folding does seem like a useful optimization even for SIMD.
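A scalar sketch of why const-folding interacts with flushing: a value computed by the compiler follows IEEE 754, while the same expression computed at run time follows whatever the hardware does. This uses plain `f32` rather than the `simd_*` intrinsics, and the names `FOLDED` and `computed_at_runtime` are hypothetical; `std::hint::black_box` is only a best-effort barrier against folding.

```rust
use std::hint::black_box;

/// Evaluated by the compiler, which follows IEEE 754: a nonzero subnormal.
const FOLDED: f32 = f32::MIN_POSITIVE / 2.0;

/// Evaluated at run time; black_box discourages the optimizer from folding it.
fn computed_at_runtime() -> f32 {
    black_box(f32::MIN_POSITIVE) / black_box(2.0f32)
}

fn main() {
    let runtime = computed_at_runtime();
    // On a non-flushing host these agree; on a flushing unit the run-time
    // value would be expected to be zero instead.
    println!(
        "folded: {:e}, runtime: {:e}, agree: {}",
        FOLDED,
        runtime,
        FOLDED == runtime
    );
}
```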

DemiMarie · 2024-11-10T20:35:44Z

I think it would be better to have SIMD that cannot be constant-folded than to not have SIMD at all.

workingjubilee added the C-bug Category: Bug label Sep 12, 2024

Tropix126 mentioned this issue Nov 9, 2024: Add armv7a-vex-v5 tier three target rust-lang/rust#131530 (Open)

rust-lang locked as too heated and limited conversation to collaborators Nov 11, 2024
