Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr) #121668

erikdesjardins · 2024-02-27T05:18:58Z

This allows types like Result<usize, std::io::Error> (and integers of differing sign, e.g. Result<u64, i64>) to be passed in a pair of registers instead of through memory, like Result<u64, u64> or Result<Box<T>, Box<U>> are today.

Fixes #97540.

r? @ghost

erikdesjardins · 2024-02-27T05:19:17Z

tests/codegen/common_prim_int_ptr.rs

+// CHECK-LABEL: @insert_int
+#[no_mangle]
+pub fn insert_int(x: usize) -> Result<usize, Box<()>> {
+    // CHECK: start:
+    // CHECK-NEXT: inttoptr i{{[0-9]+}} %x to ptr
+    // CHECK-NEXT: insertvalue
+    // CHECK-NEXT: ret { i{{[0-9]+}}, ptr }
+    Ok(x)
+}


I am surprised that this just works, inserting inttoptr/ptrtoint as necessary with only a layout change and no changes to codegen. It feels like something in codegen is a bit too permissive about adding casts whenever it needs to...

This is optimized IR, so it's likely that rustc did something much more boring but LLVM optimized it to this.

Add -C no-prepopulate-passes in the compile-flags if you want to look at what cg_llvm is doing directly.

For example, you get an inttoptr like that after LLVM optimizes a store+load: https://rust.godbolt.org/z/qMPWr7Tzz

(Rust would actually prefer this example to be a GEP off null, like #121242, I think. Not that this PR would need to do that.)

How did I forget that...

Yeah, the IR we generate just spills to an alloca.

erikdesjardins · 2024-02-27T05:20:29Z

tests/codegen/try_question_mark_nop.rs

 // CHECK-LABEL: @result_nop_match_32
 #[no_mangle]
 pub fn result_nop_match_32(x: Result<i32, u32>) -> Result<i32, u32> {
-    // CHECK: start
-    // CHECK-NEXT: ret i64 %0
+    // CHECK: start:
+    // CHECK-NEXT: insertvalue { i32, i32 }
+    // CHECK-NEXT: insertvalue { i32, i32 }
+    // CHECK-NEXT: ret { i32, i32 }


Although this is no longer just a ret instruction, this is the fewest possible instructions for scalar pair layout, since you need to pack the two separate arguments into a single return value.

For real, non-noop code, the scalar pair layout is arguably slightly better, since passing the components in separate arguments means that consuming code can just use the second argument directly, instead of having to shift it down from the top 32 bits of an i64 reg.

It was kinda janky in the first place for this test to rely on the fact that integers with differing signedness don't get scalar pair layout; Result<u32, u32> has always gotten the same scalar pair layout (since 1.27) that Result<i32, u32> now gets with this PR.

Yup, agreed it's fine to change this test like this. The important part is that it's not branching, not icmping, not selecting, etc.

Noratrieb · 2024-02-27T05:53:08Z

@bors try @rust-timer queue

Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr) This allows types like `Result<usize, std::io::Error>` (and integers of differing sign, e.g. `Result<u64, i64>`) to be passed in a pair of registers instead of through memory, like `Result<u64, u64>` or `Result<Box<T>, Box<U>>` are today. Fixes rust-lang#97540. r? `@ghost`

bors · 2024-02-27T05:54:19Z

⌛ Trying commit 4dabbcb with merge 620bc57...

bors · 2024-02-27T07:24:19Z

☀️ Try build successful - checks-actions
Build commit: 620bc57 (620bc57fd23a71e77bfab2c9db4cb12dff02e970)

rust-timer · 2024-02-27T20:57:50Z

Finished benchmarking commit (620bc57): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.4%, 0.8%]	5
Regressions ❌ (secondary)	0.4%	[0.2%, 1.2%]	27
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.8%	[-0.8%, -0.8%]	1
All ❌✅ (primary)	0.7%	[0.4%, 0.8%]	5

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.8%	[-3.8%, -3.8%]	1
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.0%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.2%]	1
All ❌✅ (primary)	0.0%	[0.0%, 0.0%]	4

Bootstrap: 649.3s -> 650.378s (0.17%)
Artifact size: 311.08 MiB -> 311.04 MiB (-0.01%)

erikdesjardins · 2024-02-27T23:06:46Z

Hmm, nearly all of the changed benchmarks seem to be noisy, so it's hard to tell if it's real or not...

Noratrieb · 2024-02-28T06:07:14Z

looks like noise and it being perf neutral to me

erikdesjardins · 2024-02-28T22:56:07Z

In that case...

r? compiler

davidtwco

This seems reasonable to me but I'm not familiar enough with this code to anticipate all the consequences of the change.

davidtwco · 2024-03-04T11:52:04Z

r? compiler

nnethercote · 2024-03-05T02:51:37Z

I'm not a good reviewer for this, let's try r? @scottmcm and cc @cuviper for good luck.

bors · 2024-03-06T10:03:19Z

💔 Test failed - checks-actions

It seems that LLVM 17 doesn't fully optimize out unwrap_unchecked. We can just loosen the check lines to account for this, since we don't really care about the exact instructions, we just want to make sure that inttoptr/ptrtoint aren't used for Box.

erikdesjardins · 2024-03-07T00:39:48Z

The job x86_64-gnu-llvm-17 failed!

It seems LLVM 17 doesn't optimize out the unwrap_unchecked. Loosened the check lines since we don't care about the precise instruction sequence.

erikdesjardins · 2024-03-12T21:50:13Z

Ping @scottmcm, I made some changes to the codegen tests so they also pass on LLVM 17. Let me know if you'd prefer to revert the last commit and use min-llvm-version: 18 instead.

tests/codegen/common_prim_int_ptr.rs

scottmcm

Hmm, looks like I accidentally commented on the commit, which has github a bit confused. Let's try this instead...

(And do @rustbot ready once you're happy with it, please.)

tests/codegen/common_prim_int_ptr.rs

Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>

erikdesjardins · 2024-03-13T05:35:06Z

Yeah, something like that works.

@rustbot ready

scottmcm · 2024-03-13T15:19:35Z

Oh, right, scalar pair, so clearly more than one parameter 🤦

Test updates look good:
@bors r=scottmcm,oli-obk rollup=never

bors · 2024-03-13T15:19:37Z

📌 Commit 9f55200 has been approved by scottmcm,oli-obk

It is now in the queue for this repository.

bors · 2024-03-13T15:25:39Z

⌛ Testing commit 9f55200 with merge 3cbb932...

bors · 2024-03-13T17:32:09Z

☀️ Test successful - checks-actions
Approved by: scottmcm,oli-obk
Pushing 3cbb932 to master...

rust-timer · 2024-03-13T18:56:04Z

Finished benchmarking commit (3cbb932): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.1%	[1.1%, 1.1%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.5%	[-3.5%, -3.5%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.0%	[-3.0%, -3.0%]	1
All ❌✅ (primary)	-	-	0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.2%]	5
Regressions ❌ (secondary)	1.0%	[1.0%, 1.3%]	28
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.2%]	1
All ❌✅ (primary)	0.0%	[0.0%, 0.2%]	5

Bootstrap: 675.199s -> 675.071s (-0.02%)
Artifact size: 310.80 MiB -> 310.73 MiB (-0.02%)

erikdesjardins · 2024-03-13T21:53:15Z

Quoth nnethercote: Are the binary size increases expected?

This change looks basically the same as #121665 (comment), where all of the regressions are due to a uniform ~4k increase in the stdlib "runtime" that's included in every binary.

I'm looking into whether we can make some improvements to the stdlib backtrace code as a whole, either by cha—wait a minute.

Didn't helloworld-tiny start at 305,488 bytes last time too? It sure did--and broadly the same (+/- a few hundred bytes) for all the other 1% regressions.

So it was noise all along?

(I'm still gonna look into some backtrace size improvements though, since all the big functions in helloworld are backtrace related. Edit: opened #122462 as a first step)

allow using scalarpair with a common prim of ptr/ptr-sized-int

4dabbcb

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 27, 2024

erikdesjardins commented Feb 27, 2024

View reviewed changes

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2024

This comment has been minimized.

Sign in to view

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 27, 2024

simplify common prim computation

f574c65

erikdesjardins marked this pull request as ready for review February 28, 2024 22:56

rustbot assigned davidtwco Feb 28, 2024

This comment has been minimized.

Sign in to view

fix test failure due to differing u64 alignment on different targets

8e40b17

davidtwco reviewed Mar 4, 2024

View reviewed changes

rustbot assigned nnethercote and unassigned davidtwco Mar 4, 2024

rustbot assigned scottmcm and unassigned nnethercote Mar 5, 2024

This comment has been minimized.

Sign in to view

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 6, 2024

scottmcm reviewed Mar 13, 2024

View reviewed changes

tests/codegen/common_prim_int_ptr.rs Outdated Show resolved Hide resolved

scottmcm reviewed Mar 13, 2024

View reviewed changes

tests/codegen/common_prim_int_ptr.rs Outdated Show resolved Hide resolved

scottmcm requested changes Mar 13, 2024

View reviewed changes

tests/codegen/common_prim_int_ptr.rs Outdated Show resolved Hide resolved

tests/codegen/common_prim_int_ptr.rs Outdated Show resolved Hide resolved

tests/codegen/common_prim_int_ptr.rs Show resolved Hide resolved

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 13, 2024

refine common_prim test

9f55200

Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 13, 2024

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 13, 2024

riking mentioned this pull request Mar 13, 2024

Result that uses niches resulting in a final size of 16 bytes emits poor LLVM IR due to ABI #97540

Closed

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 13, 2024

bors merged commit 3cbb932 into rust-lang:master Mar 13, 2024
12 checks passed

rustbot added this to the 1.78.0 milestone Mar 13, 2024

rustbot removed the perf-regression Performance regression. label Mar 13, 2024

erikdesjardins deleted the commonprim branch March 13, 2024 21:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr) #121668

Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr) #121668

erikdesjardins commented Feb 27, 2024 •

edited

Loading

erikdesjardins Feb 27, 2024

scottmcm Mar 5, 2024

scottmcm Mar 5, 2024

scottmcm Mar 5, 2024 •

edited

Loading

erikdesjardins Mar 5, 2024

erikdesjardins Feb 27, 2024

scottmcm Mar 5, 2024

Noratrieb commented Feb 27, 2024

This comment has been minimized.

bors commented Feb 27, 2024

This comment has been minimized.

bors commented Feb 27, 2024

This comment has been minimized.

rust-timer commented Feb 27, 2024

erikdesjardins commented Feb 27, 2024

Noratrieb commented Feb 28, 2024

erikdesjardins commented Feb 28, 2024

This comment has been minimized.

davidtwco left a comment

davidtwco commented Mar 4, 2024

nnethercote commented Mar 5, 2024

This comment has been minimized.

bors commented Mar 6, 2024

erikdesjardins commented Mar 7, 2024 •

edited

Loading

erikdesjardins commented Mar 12, 2024

scottmcm left a comment •

edited

Loading

erikdesjardins commented Mar 13, 2024

scottmcm commented Mar 13, 2024

bors commented Mar 13, 2024

bors commented Mar 13, 2024

bors commented Mar 13, 2024

rust-timer commented Mar 13, 2024

erikdesjardins commented Mar 13, 2024 •

edited

Loading

Represent Result<usize, Box<T>> as ScalarPair(i64, ptr) #121668

Represent Result<usize, Box<T>> as ScalarPair(i64, ptr) #121668

Conversation

erikdesjardins commented Feb 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

scottmcm Mar 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Noratrieb commented Feb 27, 2024

This comment has been minimized.

bors commented Feb 27, 2024

This comment has been minimized.

bors commented Feb 27, 2024

This comment has been minimized.

rust-timer commented Feb 27, 2024

Overall result: ❌ regressions - ACTION NEEDED

erikdesjardins commented Feb 27, 2024

Noratrieb commented Feb 28, 2024

erikdesjardins commented Feb 28, 2024

This comment has been minimized.

davidtwco left a comment

Choose a reason for hiding this comment

davidtwco commented Mar 4, 2024

nnethercote commented Mar 5, 2024

This comment has been minimized.

bors commented Mar 6, 2024

erikdesjardins commented Mar 7, 2024 • edited Loading

erikdesjardins commented Mar 12, 2024

scottmcm left a comment • edited Loading

Choose a reason for hiding this comment

erikdesjardins commented Mar 13, 2024

scottmcm commented Mar 13, 2024

bors commented Mar 13, 2024

bors commented Mar 13, 2024

bors commented Mar 13, 2024

rust-timer commented Mar 13, 2024

Overall result: ❌✅ regressions and improvements - no action needed

erikdesjardins commented Mar 13, 2024 • edited Loading

Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr) #121668

Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr) #121668

erikdesjardins commented Feb 27, 2024 •

edited

Loading

scottmcm Mar 5, 2024 •

edited

Loading

erikdesjardins commented Mar 7, 2024 •

edited

Loading

scottmcm left a comment •

edited

Loading

erikdesjardins commented Mar 13, 2024 •

edited

Loading