Commit
Auto merge of #90821 - scottmcm:new-slice-reverse, r=Mark-Simulacrum
MIRI says `reverse` is UB, so replace it with something LLVM can vectorize

For small types with padding, the current implementation is UB because it does integer operations on uninit values.

```
error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
   --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/num/mod.rs:836:5
    |
836 | /     uint_impl! { u32, u32, i32, 32, 4294967295, 8, "0x10000b3", "0xb301", "0x12345678",
837 | |     "0x78563412", "0x1e6a2c48", "[0x78, 0x56, 0x34, 0x12]", "[0x12, 0x34, 0x56, 0x78]", "", "" }
    | |________________________________________________________________________________________________^ using uninitialized data, but this operation requires initialized memory
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: inside `core::num::<impl u32>::rotate_left` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/num/uint_macros.rs:211:13
    = note: inside `core::slice::<impl [Foo]>::reverse` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/mod.rs:701:58
```

<https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=340739f22ca5b457e1da6f361768edc6>

But LLVM has gotten smarter since I wrote the previous implementation in 2017, so this PR removes all the manual magic and just writes it in such a way that LLVM will vectorize. This code is much simpler and has very little `unsafe`, and is actually faster to boot!

If you're curious to see the codegen: <https://rust.godbolt.org/z/Pcn13Y9E3>

Before:

```
running 7 tests
test slice::reverse_simd_f64x4 ... bench:  17,940 ns/iter (+/- 481) = 58448 MB/s
test slice::reverse_u128       ... bench:  17,758 ns/iter (+/- 205) = 59048 MB/s
test slice::reverse_u16        ... bench: 158,234 ns/iter (+/- 6,876) = 6626 MB/s
test slice::reverse_u32        ... bench:  62,047 ns/iter (+/- 1,117) = 16899 MB/s
test slice::reverse_u64        ... bench:  31,582 ns/iter (+/- 552) = 33201 MB/s
test slice::reverse_u8         ... bench:  81,253 ns/iter (+/- 1,510) = 12905 MB/s
test slice::reverse_u8x3       ... bench: 270,615 ns/iter (+/- 11,463) = 3874 MB/s
```

After:

```
running 7 tests
test slice::reverse_simd_f64x4 ... bench:  17,731 ns/iter (+/- 306) = 59137 MB/s
test slice::reverse_u128       ... bench:  17,919 ns/iter (+/- 239) = 58517 MB/s
test slice::reverse_u16        ... bench:  43,160 ns/iter (+/- 607) = 24295 MB/s
test slice::reverse_u32        ... bench:  21,065 ns/iter (+/- 371) = 49778 MB/s
test slice::reverse_u64        ... bench:  21,118 ns/iter (+/- 482) = 49653 MB/s
test slice::reverse_u8         ... bench:  76,878 ns/iter (+/- 1,688) = 13639 MB/s
test slice::reverse_u8x3       ... bench: 264,723 ns/iter (+/- 5,544) = 3961 MB/s
```

Those are the existing benches, <https://github.com/rust-lang/rust/blob/14a2fd640e0df9ee8cc1e04280b0c3aff93c42da/library/alloc/benches/slice.rs#L322-L346>
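For readers who want a feel for the approach described above, here is a minimal sketch of a reverse written as a plain element-wise swap loop over two equal-length halves. It is an illustration only, not the code from this PR: the names `reverse_sketch` and `Padded` are hypothetical, and this version uses safe `split_at_mut` rather than whatever the actual change does. Whether LLVM vectorizes it depends on the element type and target, so treat that as something to check in the codegen rather than a guarantee.

```rust
use std::mem;

/// Hypothetical padded type for illustration: with `repr(C)` it has
/// 3 bytes of data and 1 trailing padding byte (size 4, align 2).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Padded {
    a: u16,
    b: u8,
}

/// Illustrative sketch, not the standard library's implementation.
/// Reverses `s` in place by swapping mirrored elements of the two halves.
fn reverse_sketch<T>(s: &mut [T]) {
    let len = s.len();
    let half = len / 2;
    // Split so that `back` holds exactly the last `half` elements.
    let (front, back) = s.split_at_mut(len - half);
    // Drop the middle element (odd lengths) so both halves have length `half`.
    let front = &mut front[..half];
    for i in 0..half {
        // Typed swap of whole elements; no integer arithmetic on raw bytes,
        // so padding bytes are never read as integers.
        mem::swap(&mut front[i], &mut back[half - 1 - i]);
    }
}
```

Calling `reverse_sketch(&mut v)` on a `Vec<Padded>` or an array of integers reverses it in place. Because each swap moves a whole `T`, the loop never performs integer operations such as `rotate_left` on bytes that may be uninitialized padding, which is the class of problem the Miri report above points at.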