Use MIR's `Offset` for pointer `add` too #110837

scottmcm · 2023-04-26T06:11:40Z

~~Status: draft while waiting for #110822 to land, since this is built atop that.~~
~~r? @ghost~~

Canonical Rust code has mostly moved to add/sub on pointers, which take usize, instead of offset which takes isize. (And, relatedly, when sub_ptr was added it turned out it replaced every single in-tree use of offset_from, because usize is just so much more useful than isize in Rust.)

Unfortunately, intrinsics::offset could only accept *const and isize, so there's a huge amount of type conversions back and forth being done. They're identity conversions in the backend, but still end up producing quite a lot of unhelpful MIR.

This PR changes intrinsics::offset to accept *const and *mut along with isize and usize. Conveniently, the backends and CTFE already handle this, since MIR's BinOp::Offset already supports all four combinations.

To demonstrate the difference, I added some mir-opt/pre-codegen/ tests around slice indexing. Here's the difference to [T]::get_mut, since it uses <*mut _>::add internally:

@@ -79,30 +70,21 @@ fn slice_get_mut_usize(_1: &mut [u32], _2: usize) -> Option<&mut u32> {
         StorageLive(_12);                // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
         StorageLive(_9);                 // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
         _9 = _8 as *mut u32 (PtrToPtr);  // scope 11 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        StorageLive(_13);                // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        _13 = _2 as isize (IntToInt);    // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        StorageLive(_14);                // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        StorageLive(_15);                // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        _15 = _9 as *const u32 (Pointer(MutToConstPointer)); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        _14 = Offset(move _15, _13);     // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        StorageDead(_15);                // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        _7 = move _14 as *mut u32 (PtrToPtr); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        StorageDead(_14);                // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
-        StorageDead(_13);                // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
+        _7 = Offset(_9, _2);             // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
         StorageDead(_9);                 // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
         StorageDead(_12);                // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
         StorageDead(_11);                // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL

1c1c8e4#diff-a841b6a4538657add3f39bc895744331453d0625e7aace128b1f604f0b63c8fdR80

scottmcm · 2023-04-26T08:07:10Z

@bors try @rust-timer queue

bors · 2023-04-26T08:07:21Z

⌛ Trying commit ba11a57d8732f24440bc0cea73d9349ff1bfe1f2 with merge f18e18871f6876c0ae22f4f4939e847cb0a3eaf4...

bors · 2023-04-26T09:48:12Z

☀️ Try build successful - checks-actions
Build commit: f18e18871f6876c0ae22f4f4939e847cb0a3eaf4 (f18e18871f6876c0ae22f4f4939e847cb0a3eaf4)

scottmcm · 2023-04-26T18:05:46Z

@bors try @rust-timer queue

bors · 2023-04-26T18:05:56Z

⌛ Trying commit 0864da33e625dd85a381598a6ec9779b8cc5f0a8 with merge c677fe4f1af9780ecb80c7d1c453d9de4051aadd...

bors · 2023-04-26T19:46:57Z

☀️ Try build successful - checks-actions
Build commit: c677fe4f1af9780ecb80c7d1c453d9de4051aadd (c677fe4f1af9780ecb80c7d1c453d9de4051aadd)

rust-timer · 2023-04-26T21:06:29Z

Finished benchmarking commit (c677fe4f1af9780ecb80c7d1c453d9de4051aadd): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.3%, 2.0%]	5
Regressions ❌ (secondary)	0.3%	[0.2%, 0.4%]	7
Improvements ✅ (primary)	-0.7%	[-1.1%, -0.3%]	10
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.3%	[-1.1%, 2.0%]	15

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.7%	[1.0%, 4.6%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.2%	[-6.5%, -0.3%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.3%	[-6.5%, 4.6%]	6

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.2%	[2.2%, 2.2%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.2%	[2.2%, 2.2%]	1

compiler/rustc_codegen_ssa/src/mir/rvalue.rs

compiler-errors

Changes look fine to me other than one question, and then I am not sure if this perf regression is actually worth caring about or not.

compiler-errors · 2023-04-27T05:56:19Z

I thought about it more. I think the emitted code is overall simpler and the codegen changes are yeah, more of a side-effect of the usage of the Offset instruction and subtle interactions with the inliner.

@bors r+

bors · 2023-04-27T05:56:21Z

📌 Commit 1c1c8e442add0f46905a57a25a6cba52b8b0c54d has been approved by compiler-errors

It is now in the queue for this repository.

compiler/rustc_codegen_ssa/src/mir/rvalue.rs

WaffleLapkin · 2023-04-27T12:00:38Z

compiler/rustc_hir_analysis/src/check/intrinsic.rs

@@ -215,7 +215,8 @@ pub fn check_intrinsic_type(tcx: TyCtxt<'_>, it: &hir::ForeignItem<'_>) {

            sym::type_name => (1, Vec::new(), tcx.mk_static_str()),
            sym::type_id => (1, Vec::new(), tcx.types.u64),
-            sym::offset | sym::arith_offset => (
+            sym::offset => (2, vec![param(0), param(1)], param(0)),
+            sym::arith_offset => (


Is arith_offset not changed because we don't have a version of it in MIR?

Makes me wander if we should try intrinsicing other offset like functions like {byte,wrapping,wrapping_byte}_{add,sub,offset}

@WaffleLapkin The weird thing about byte_add is that it works on fat pointers. So it ends up quite a bit more complex than offset. I'm not sure if that's worth doing as an intrinsic or not. For the data part of it, cast-then-offset-then-cast of course works fine.

As for sub, maybe we should start by changing it from wrapping_neg to unchecked_neg, especially after #112238...

Hm, but wouldn't byte_add be just self.data += x? Doesn't sound like it's too hard for an intrinsic...

@WaffleLapkin Is perhaps your email client catching up on some ancient messages? Or are looking at old stuff again?

@scottmcm I cleared up old notifications that I were putting off for a long time 😄

scottmcm · 2023-04-28T05:45:08Z

Just made the requested comment change.

@bors r=compiler-errors

bors · 2023-04-28T05:45:09Z

📌 Commit e1da77c has been approved by compiler-errors

It is now in the queue for this repository.

bors · 2023-04-28T09:27:03Z

⌛ Testing commit e1da77c with merge 43a7802...

bors · 2023-04-28T12:30:26Z

☀️ Test successful - checks-actions
Approved by: compiler-errors
Pushing 43a7802 to master...

rust-timer · 2023-04-28T14:32:23Z

Finished benchmarking commit (43a7802): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.2%	[0.9%, 1.4%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.8%	[-1.4%, -0.5%]	14
Improvements ✅ (secondary)	-1.3%	[-2.0%, -0.5%]	11
All ❌✅ (primary)	-0.6%	[-1.4%, 1.4%]	16

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	6.5%	[4.3%, 8.6%]	2
Regressions ❌ (secondary)	1.1%	[1.1%, 1.1%]	1
Improvements ✅ (primary)	-3.5%	[-7.7%, -0.4%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.2%	[-7.7%, 8.6%]	6

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.2%	[2.2%, 2.2%]	1
Improvements ✅ (primary)	-0.8%	[-0.9%, -0.8%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.8%	[-0.9%, -0.8%]	3

scottmcm · 2023-04-28T16:18:53Z

Ah, those perf results make more sense to me. A few improvements to check/debug, net improvement for opt despite some regressions, and a small improvement to binary size (mostly in debug).

nnethercote · 2023-04-28T21:59:02Z

Wins exceed losses.

@rustbot label: +perf-regression-triaged

RalfJung · 2024-09-11T07:54:31Z

library/core/src/intrinsics.rs

+    #[must_use = "returns a new pointer rather than modifying its argument"]
+    #[rustc_const_stable(feature = "const_ptr_offset", since = "1.61.0")]
+    #[rustc_nounwind]
+    pub fn offset<Ptr, Delta>(dst: Ptr, offset: Delta) -> Ptr;


When changing intrinsics, please make sure to always Cc the opsem and miri teams! These are language primitives, a bunch of places need to typically be adjusted to make everything fit again. In this case, this PR made the Miri logic all wrong since it will still always assume the offset to be an isize -- thus missing UB, which is a critical bug in Miri.

Sorry about that, Ralf! I'll keep it in mind next time.

When the validator was explicitly allowing both isize and usize as argument types, I'd assumed it was handling the usize in the "obvious" way, but clearly I should have checked.

RalfJung · 2024-09-11T07:55:38Z

Conveniently, the backends and CTFE already handle this,

CTFE handles this, but wrong -- it will always transmute the argument to isize, which doesn't properly detect overflow for the usize case.

…ottmcm ptr::add/sub: do not claim equivalence with `offset(c as isize)` In rust-lang#110837, the `offset` intrinsic got changed to also allow a `usize` offset parameter. The intention is that this will do an unsigned multiplication with the size, and we have UB if that overflows -- and we also have UB if the result is larger than `usize::MAX`, i.e., if a subsequent cast to `isize` would wrap. ~~The LLVM backend sets some attributes accordingly.~~ This updates the docs for `add`/`sub` to match that intent, in preparation for adjusting codegen to exploit this UB. We use this opportunity to clarify what the exact requirements are: we compute the offset using mathematical multiplication (so it's no problem to have an `isize * usize` multiplication, we just multiply integers), and the result must fit in an `isize`. Cc `@rust-lang/opsem` `@nikic` rust-lang#130239 updates Miri to detect this UB. `sub` still has some cases of UB not reflected in the underlying intrinsic semantics (and Miri does not catch): when we subtract `usize::MAX`, then after casting to `isize` that's just `-1` so we end up adding one unit without noticing any UB, but actually the offset we gave does not fit in an `isize`. Miri will currently still not complain for such cases: ```rust fn main() { let x = &[0i32; 2]; let x = x.as_ptr(); // This should be UB, we are subtracting way too much. unsafe { x.sub(usize::MAX).read() }; } ``` However, the LLVM IR we generate here also is UB-free. This is "just" library UB but not language UB. Cc `@saethlin;` might be worth adding precondition checks against overflow on `offset`/`add`/`sub`? Fixes rust-lang#130211

Rollup merge of rust-lang#130229 - RalfJung:ptr-offset-unsigned, r=scottmcm ptr::add/sub: do not claim equivalence with `offset(c as isize)` In rust-lang#110837, the `offset` intrinsic got changed to also allow a `usize` offset parameter. The intention is that this will do an unsigned multiplication with the size, and we have UB if that overflows -- and we also have UB if the result is larger than `usize::MAX`, i.e., if a subsequent cast to `isize` would wrap. ~~The LLVM backend sets some attributes accordingly.~~ This updates the docs for `add`/`sub` to match that intent, in preparation for adjusting codegen to exploit this UB. We use this opportunity to clarify what the exact requirements are: we compute the offset using mathematical multiplication (so it's no problem to have an `isize * usize` multiplication, we just multiply integers), and the result must fit in an `isize`. Cc `@rust-lang/opsem` `@nikic` rust-lang#130239 updates Miri to detect this UB. `sub` still has some cases of UB not reflected in the underlying intrinsic semantics (and Miri does not catch): when we subtract `usize::MAX`, then after casting to `isize` that's just `-1` so we end up adding one unit without noticing any UB, but actually the offset we gave does not fit in an `isize`. Miri will currently still not complain for such cases: ```rust fn main() { let x = &[0i32; 2]; let x = x.as_ptr(); // This should be UB, we are subtracting way too much. unsafe { x.sub(usize::MAX).read() }; } ``` However, the LLVM IR we generate here also is UB-free. This is "just" library UB but not language UB. Cc `@saethlin;` might be worth adding precondition checks against overflow on `offset`/`add`/`sub`? Fixes rust-lang#130211

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 26, 2023

scottmcm force-pushed the offset-for-add branch from 31935ae to 0e9c35e Compare April 26, 2023 06:15