Replace the default branch with an unreachable branch If it is the last variant #120268

DianQK · 2024-01-23T12:37:49Z

Fixes #119520. Fixes #110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch from llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.
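One way to picture the "continuous range" limitation is with a toy model: if the only thing known about a value is a single contiguous interval, any gaps between the real values are lost. The sketch below is an illustration in plain Rust, not LLVM's `ConstantRange` API; the function names are hypothetical and wrap-around intervals are ignored.

```rust
/// Smallest contiguous interval `[lo, hi]` that covers every value in `values`.
/// Simplified model: wrap-around intervals are not considered.
pub fn covering_interval(values: &[u8]) -> Option<(u8, u8)> {
    let lo = *values.iter().min()?;
    let hi = *values.iter().max()?;
    Some((lo, hi))
}

/// Values that the covering interval admits but that are not in the original set.
/// An analysis that only sees the interval cannot rule these out.
pub fn spurious_values(values: &[u8]) -> Vec<u8> {
    match covering_interval(values) {
        Some((lo, hi)) => (lo..=hi).filter(|v| !values.contains(v)).collect(),
        None => Vec::new(),
    }
}
```

For the set `{0, 2, 5}` the interval is `[0, 5]`, so `1`, `3` and `4` still look possible to an interval-only analysis, whereas rustc knows the exact set of variants.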

Although we could make improvements on the LLVM side, I think it is more appropriate to handle this in rustc first. After all, rustc can easily know the possible values.

Note that we have found a slow-compilation problem in the presence of unreachable branches. See llvm/llvm-project#78578.
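As a rough source-level analogy of what the description proposes (the real change operates on MIR `switchInt` terminators, not on source code), the sketch below shows a match whose wildcard arm can only ever be the last variant, next to a version that lists every discriminant and marks the default as unreachable. The `Color` enum and both function names are hypothetical.

```rust
use std::hint::unreachable_unchecked;

#[derive(Clone, Copy)]
pub enum Color {
    Red,
    Green,
    Blue,
}

/// Before: the wildcard arm can only ever match `Blue`.
pub fn before(c: Color) -> &'static str {
    match c {
        Color::Red => "red",
        Color::Green => "green",
        _ => "blue",
    }
}

/// After, modelled on the integer discriminant: every variant has its own
/// target and the default branch is declared unreachable.
pub fn after(c: Color) -> &'static str {
    match c as u8 {
        0 => "red",
        1 => "green",
        2 => "blue",
        // SAFETY: `Color` has exactly three variants with discriminants 0, 1 and 2,
        // so no other value can reach this arm.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

Both functions return the same string for every variant; the second form simply hands the backend the fact that no fourth value exists.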

r? compiler

rustbot · 2024-01-23T12:38:00Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

oli-obk · 2024-01-23T16:21:17Z

@bors try @rust-timer queue

bors · 2024-01-23T16:22:28Z

⌛ Trying commit 5617d16 with merge f893d88...


bors · 2024-01-23T17:48:00Z

☀️ Try build successful - checks-actions
Build commit: f893d88 (f893d886617ac224771fc6bbfd026e43d860599d)

rust-timer · 2024-01-23T19:07:19Z

Finished benchmarking commit (f893d88): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.3%, -0.2%] | 12 |
| Improvements ✅ (secondary) | -0.3% | [-0.4%, -0.3%] | 15 |
| All ❌✅ (primary) | -0.2% | [-0.3%, -0.2%] | 12 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.2% | [2.2%, 2.2%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -5.2% | [-10.4%, -1.4%] | 5 |
| Improvements ✅ (secondary) | -4.0% | [-4.0%, -4.0%] | 1 |
| All ❌✅ (primary) | -4.0% | [-10.4%, 2.2%] | 6 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.4% | [-3.4%, -3.4%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.5%] | 13 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.4%, -0.1%] | 5 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.4%, 0.5%] | 18 |

Bootstrap: 661.891s -> 662.949s (0.16%)
Artifact size: 308.33 MiB -> 308.27 MiB (-0.02%)

compiler/rustc_mir_transform/src/uninhabited_enum_branching.rs

cjgillot · 2024-01-23T22:26:23Z

tests/mir-opt/uninhabited_enum_branching.rs

+        Test4::C => "C",
+        _ => "D",
+    };
+}


Does this still work if Test4 holds a generic type instead of an i32? Should it be made so?

If I understand correctly, this will be split into multiple `switchInt` terminators, so I would expect a generic type to give the same result. But the test case does not show that, so I must be missing something.

I added support for generic types.
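To make the generic case concrete, here is a minimal sketch of the kind of input the question is about. The `Payload<T>` enum and `name_of` function are hypothetical stand-ins, not the contents of the actual test file; the point is only that the wildcard arm still covers exactly one remaining variant regardless of `T`.

```rust
/// Hypothetical generic enum: the payload type is a parameter instead of `i32`.
pub enum Payload<T> {
    A(T),
    B(T),
    C,
    D,
}

/// The wildcard arm can only be reached by `Payload::D`, whatever `T` is.
pub fn name_of<T>(p: Payload<T>) -> &'static str {
    match p {
        Payload::A(_) => "A",
        Payload::B(_) => "B",
        Payload::C => "C",
        _ => "D",
    }
}
```

Calling `name_of(Payload::<String>::D)` and `name_of(Payload::A(1u8))` exercises the same match with two different instantiations.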

tests/mir-opt/uninhabited_enum_branching.rs

DianQK · 2024-01-24T12:58:25Z

Based on the discussion on Zulip, I have changed src/bootstrap/src/core/build_steps/test.rs.

oli-obk · 2024-01-24T13:49:38Z

@bors r+ rollup=never

bors · 2024-01-24T13:49:40Z

📌 Commit d02299c has been approved by oli-obk

It is now in the queue for this repository.

DianQK · 2024-01-25T00:18:01Z

Hmm, I just remembered a possible regression, but I don't think it should affect merging this PR, because we're on the road to a better outcome.

```rust
#![crate_type = "lib"]

pub enum Bar {
    Foo = 1,
    Bar = 2,
    Baz = 3
}

#[no_mangle]
pub fn lookup(v: Bar) -> i32 {
    match v {
        Bar::Foo => 8,
        Bar::Bar => 9,
        Bar::Baz => 3,
    }
}

#[no_mangle]
pub fn compare(v: Bar) -> i32 {
    match v {
        Bar::Foo => 8,
        Bar::Bar => 9,
        _ => 3,
    }
}

#[no_mangle]
pub fn lookup2(v: Bar) -> i32 {
    match v {
        Bar::Foo => 1,
        Bar::Bar => 2,
        Bar::Baz => 3,
    }
}

#[no_mangle]
pub fn compare2(v: Bar) -> i32 {
    match v {
        Bar::Foo => 1,
        Bar::Bar => 2,
        _ => 3,
    }
}
```

GodBolt: https://rust.godbolt.org/z/55dq6zbjd

The `lookup` function is compiled to a lookup table with a load instruction. I'm not sure whether that is better than the comparison-based code generated for `compare`. See SimplifyCFG.cpp#L6410.
In most scenarios, though, the result should improve because we provide new unreachable facts, as in `lookup2` and `compare2`.
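The trade-off can be written out by hand. The sketch below mirrors the two code shapes in plain Rust: a table indexed by the discriminant (one load) versus a chain of comparisons. It is an illustration of the idea only, not the code LLVM emits, and the function names are hypothetical.

```rust
#[derive(Clone, Copy)]
pub enum Bar {
    Foo = 1,
    Bar = 2,
    Baz = 3,
}

/// Results for discriminants 1, 2 and 3, in that order.
static TABLE: [i32; 3] = [8, 9, 3];

/// Table-based shape: subtract the smallest discriminant, then load.
pub fn lookup_via_table(v: Bar) -> i32 {
    TABLE[v as usize - 1]
}

/// Comparison-based shape: test two values and fall through to a default.
pub fn lookup_via_compare(v: Bar) -> i32 {
    let d = v as i32;
    if d == 1 {
        8
    } else if d == 2 {
        9
    } else {
        3
    }
}

/// In the `lookup2`/`compare2` case the result equals the discriminant, so
/// once all three values are known neither a table nor a branch is required.
pub fn lookup2_direct(v: Bar) -> i32 {
    v as i32
}
```

Which of the first two shapes is faster depends on the target and the surrounding code, which matches the uncertainty expressed above.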

bors · 2024-01-26T05:28:31Z

⌛ Testing commit d02299c with merge 4b854f3...


bors · 2024-01-26T05:49:11Z

💔 Test failed - checks-actions

oli-obk · 2024-03-08T06:22:28Z

@bors r+

bors · 2024-03-08T06:22:31Z

📌 Commit 2884230 has been approved by oli-obk

It is now in the queue for this repository.

RalfJung · 2024-03-08T07:13:34Z

tests/codegen/enum/uninhabited_enum_default_branch.rs

There is no uninhabited enum anywhere in this test... how does the test filename make sense?

This test case comes directly from the issue this PR fixes. It calls partial_cmp, so it is essentially the same as #119520 (comment). I think it makes sense to add the test code from the issue; maybe I should create two test cases.

But again there's no uninhabited enums anywhere that I can see, so (a) what does the test content have to do with the filename, and (b) what does it have to do with this PR?

Reading the code a bit more, I think the MIR pass (and associated test) are misnamed. This is no longer just about uninhabited variants, it is now also about exploiting that Discriminant will never return something that isn't a variant index. I am a bit surprised that this is done as a MIR transform rather than during MIR building but the MIR transform is correct according to our current understanding of MIR semantics. Just the name is misleading after this PR.

(a) I can change the file name to issue-119520.rs.
(b) See https://rust.godbolt.org/z/za5c5hzoY. When I reduced the issue's test case, I found that it uses Ordering after inlining, which is where the enum comes from. I also hope that this test case will keep the optimization from being lost to other changes in the future.

Yes. I'm considering updating the name. (It’s just that I didn’t think of a suitable name.)
Maybe I can change it to UnreachableEnumBranching.

> I am a bit surprised that this is done as a MIR transform rather than during MIR building but the MIR transform is correct according to our current understanding of MIR semantics. Just the name is misleading after this PR.

In my view it is better to have MIR building match the structure of the code itself where possible. (Though this may not matter either?)

> In my view it is better to have MIR building match the structure of the code itself where possible. (Though this may not matter either?)

Ah, I may have misunderstood where this optimization kicks in. I thought even this would just use fallback for the last variant:

`match c { Less => -5, Equal => 0, Greater => 42, }`

But already on stable that becomes `switchInt(move _2) -> [255: bb3, 0: bb4, 1: bb1, otherwise: bb2]`.
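The switch values `255`, `0` and `1` in that terminator line up with the discriminants of `std::cmp::Ordering`, which are `-1`, `0` and `1` stored in an `i8`. A small sketch (the helper name `switch_value` is hypothetical) shows the reinterpretation as an unsigned byte:

```rust
use std::cmp::Ordering;

/// Discriminant of an `Ordering` reinterpreted as an unsigned byte,
/// which is how the value appears in the `switchInt` line quoted above.
pub fn switch_value(o: Ordering) -> u8 {
    (o as i8) as u8
}
```

So `Less` (-1) shows up as 255, `Equal` as 0 and `Greater` as 1.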

> I can change the file name to issue-119520.rs.

Once we have a better name for the pass, it can use that name. (Though it would also be good to mention the issue either in the file name or file contents. It's always good to add more cross-references and those are otherwise much harder to reconstruct in the future.)

> Maybe I can change it to UnreachableEnumBranching.

I like it. :) The module-level comment in that file should then explain the two ways that "unreachable" is determined.

> Ah, I may have misunderstood where this optimization kicks in. I thought even this would just use fallback for the last variant:
>
> `match c { Less => -5, Equal => 0, Greater => 42, }`
>
> But already on stable that becomes `switchInt(move _2) -> [255: bb3, 0: bb4, 1: bb1, otherwise: bb2]`.

This is something UninhabitedEnumBranching has already done.

This PR transforms the following code

`match c { Less => -5, Equal => 0, _ => 42, }`

into

`match c { Less => -5, Equal => 0, Greater => 42, }`.
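Written as two complete functions, the wildcard form and the explicit form of that match can be checked against each other. This is only a source-level restatement with hypothetical function names; the pass itself rewrites the MIR switch rather than the source.

```rust
use std::cmp::Ordering::{self, Equal, Greater, Less};

/// Input shape: the last variant is covered by a wildcard arm.
pub fn with_wildcard(c: Ordering) -> i32 {
    match c {
        Less => -5,
        Equal => 0,
        _ => 42,
    }
}

/// Output shape: the last variant is named explicitly, leaving no fallback.
pub fn with_explicit_last(c: Ordering) -> i32 {
    match c {
        Less => -5,
        Equal => 0,
        Greater => 42,
    }
}
```

The two functions agree on all three `Ordering` values, which is what makes the rewrite behavior-preserving.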

RalfJung · 2024-03-08T07:16:17Z

compiler/rustc_mir_transform/src/uninhabited_enum_branching.rs

+                && allowed_variants.len() == 1
+                && check_successors(&body.basic_blocks, targets.otherwise());
+            let replace_otherwise_to_unreachable = otherwise_is_last_variant
+                || !otherwise_is_empty_unreachable && allowed_variants.is_empty();


This could use a few more comments explaining what happens here -- why all these checks are needed and why they are combined in exactly the way they are. Imagine someone reading this code in a year without knowing about this PR -- what would they have to know to make sense of all this? For instance, what is check_successors even checking?

Also, a || b && c could use parentheses, the precedence is currently unclear.

Hmm, testing has already started. Should I r-, or address this in a follow-up PR?

Subsequent PR is fine.
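As an aid to reading the excerpt under review, here is a simplified model of the decision with the parentheses spelled out. It is a sketch under assumptions, not the code in rustc: the excerpt starts mid-expression, so the first conjunct of `otherwise_is_last_variant` below is a guess, `successors_ok` merely stands in for whatever `check_successors` returns, and the set-based representation of variants is invented for illustration.

```rust
use std::collections::BTreeSet;

/// Simplified model of "should the `otherwise` target become unreachable?".
///
/// * `inhabited`: discriminants of all inhabited variants.
/// * `handled`: discriminants that already have an explicit switch target.
/// * `otherwise_is_empty_unreachable`: the default target is already unreachable.
/// * `successors_ok`: stand-in for the result of `check_successors` (semantics assumed).
pub fn should_replace_otherwise(
    inhabited: &BTreeSet<u128>,
    handled: &BTreeSet<u128>,
    otherwise_is_empty_unreachable: bool,
    successors_ok: bool,
) -> bool {
    // Variants that can still reach the default branch.
    let remaining = inhabited.difference(handled).count();

    // Exactly one variant is left for the default branch.
    // (The leading `!otherwise_is_empty_unreachable` is an assumption.)
    let otherwise_is_last_variant =
        !otherwise_is_empty_unreachable && remaining == 1 && successors_ok;

    // Explicit parentheses: `a || (b && c)`.
    otherwise_is_last_variant || (!otherwise_is_empty_unreachable && remaining == 0)
}
```

With three inhabited variants and two handled ones, the model answers `true` when `successors_ok` holds; with all three handled it answers `true` only if the default branch is not already unreachable.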

bors · 2024-03-08T07:18:21Z

⌛ Testing commit 2884230 with merge 14fbc3c...

bors · 2024-03-08T09:32:58Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 14fbc3c to master...

rust-timer · 2024-03-08T10:47:39Z

Finished benchmarking commit (14fbc3c): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.2%, 1.8%] | 4 |
| Regressions ❌ (secondary) | 0.2% | [0.2%, 0.3%] | 7 |
| Improvements ✅ (primary) | -0.8% | [-1.2%, -0.3%] | 5 |
| Improvements ✅ (secondary) | -0.9% | [-2.2%, -0.3%] | 3 |
| All ❌✅ (primary) | -0.1% | [-1.2%, 1.8%] | 9 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.9% | [0.2%, 8.8%] | 4 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -3.6% | [-6.3%, -2.2%] | 3 |
| Improvements ✅ (secondary) | -5.4% | [-5.4%, -5.4%] | 1 |
| All ❌✅ (primary) | 0.7% | [-6.3%, 8.8%] | 7 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.0% | [2.0%, 2.0%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.0% | [2.0%, 2.0%] | 1 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.2%] | 11 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 7 |
| Improvements ✅ (primary) | -0.2% | [-1.0%, -0.0%] | 35 |
| Improvements ✅ (secondary) | -0.1% | [-1.6%, -0.0%] | 15 |
| All ❌✅ (primary) | -0.1% | [-1.0%, 0.2%] | 46 |

Bootstrap: 646.506s -> 648.483s (0.31%)
Artifact size: 172.55 MiB -> 172.46 MiB (-0.05%)

Rename `UninhabitedEnumBranching` to `UnreachableEnumBranching` Per [rust-lang#120268](rust-lang#120268 (comment)), I rename `UninhabitedEnumBranching` to `UnreachableEnumBranching` . I solved some nits to add some comments. I adjusted the workaround restrictions. This should be useful for `a <= b` and `if let Some/Ok(v)`. For enum with few variants, `early-tailduplication` should not cause compile time overhead. r? RalfJung

pnkfelix · 2024-03-12T17:24:47Z

Visiting for weekly performance triage.

@DianQK the 1.8% regression to cargo opt-full is concerning to me. But from looking at the earlier rust-timer invocations, I saw it come up only once, not every time. So it's not clear to me how much it was anticipated in your developments here.

Do you have any idea where the cargo opt-full regression is arising? Is it somehow connected to llvm/llvm-project#78578 ?

(not marking as triaged, not yet.)


DianQK · 2024-03-13T14:19:57Z

I don't think this is a regression. The result of the previous try run, #120268 (comment), shows no cargo-related regressions. I also made another attempt by reverting this PR in #122414, and I can only see cranelift-codegen returning to its previous result.

Or maybe I'm missing something. Usually, I think adding a fact provides more opportunities for optimization, and more work is then needed to optimize. This may be a desired result.

DianQK · 2024-03-13T14:23:30Z

> Is it somehow connected to llvm/llvm-project#78578?

I don't think it's related. If it were, I would expect the output size to increase, because early-tailduplication duplicates instructions. But the actual result is -0.38%.

pnkfelix · 2024-03-14T14:56:27Z

We discussed the cargo opt-full result a little at T-compiler meeting today (zulip)

We concluded that, judging from the chart, the regression seems to have been subsequently resolved (potentially by PR #120985).

@rustbot label: +perf-regression-triaged


rustbot assigned davidtwco Jan 23, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 23, 2024

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 23, 2024

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 23, 2024

oli-obk reviewed Jan 23, 2024

cjgillot reviewed Jan 23, 2024

DianQK force-pushed the otherwise_is_last_variant_switchs branch from 5617d16 to 5a398d3 Compare January 24, 2024 00:30

rustbot added the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Jan 24, 2024

DianQK force-pushed the otherwise_is_last_variant_switchs branch from af2420c to d02299c Compare January 24, 2024 13:23

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 24, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 8, 2024

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 8, 2024

RalfJung reviewed Mar 8, 2024

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 8, 2024

bors merged commit 14fbc3c into rust-lang:master Mar 8, 2024
12 checks passed

rustbot added this to the 1.78.0 milestone Mar 8, 2024

DianQK deleted the otherwise_is_last_variant_switchs branch March 8, 2024 09:36

bors mentioned this pull request Mar 8, 2024

Transforms match into an assignment statement #120614

Merged

DianQK mentioned this pull request Mar 9, 2024

Rename UninhabitedEnumBranching to UnreachableEnumBranching #122225

Merged

DianQK mentioned this pull request Mar 12, 2024

[WIP] Re-enable the early otherwise branch optimization #121397

Closed

DianQK added a commit to DianQK/rust that referenced this pull request Mar 13, 2024

Revert "Auto merge of rust-lang#120268 - DianQK:otherwise_is_last_var…

272dd0b

…iant_switchs, r=oli-obk" This reverts commit 14fbc3c, reversing changes made to 9fb91aa.

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 13, 2024

Auto merge of rust-lang#122414 - DianQK:perf/test-120268, r=<try>

38b84c2

[perf test] rust-lang#120268 r? ghost

rustbot added the perf-regression-triaged The performance regression has been triaged. label Mar 14, 2024
