Missing MIR optimization: Replace matches with loads if possible #88793

pcwalton · 2021-09-09T20:27:14Z

Many times, people write matches on enums that are really loads from a table. It'd be nice if we could codegen them as such.

Here's an example, at https://godbolt.org/z/4P9v61an7:

pub enum Char {
    A,
    B,
    C,
    D,
    E,
    F,
    G,
    H,
    I,
    J,
    K,
    L,
}

pub fn to_str(val: Char) -> &'static str {
    match val {
        Char::A => "A",
        Char::B => "B",
        Char::C => "C",
        Char::D => "D",
        Char::E => "E",
        Char::F => "F",
        Char::G => "G",
        Char::H => "H",
        Char::I => "I",
        Char::J => "J",
        Char::K => "K",
        Char::L => "L",
    }
}

The codegen here has a lot to be desired:

example::to_str:
        lea     rax, [rip + .L__unnamed_1]
        movzx   ecx, dil
        lea     rdx, [rip + .LJTI0_0]
        movsxd  rcx, dword ptr [rdx + 4*rcx]
        add     rcx, rdx
        jmp     rcx
.LBB0_1:
        lea     rax, [rip + .L__unnamed_2]
        mov     edx, 1
        ret
.LBB0_2:
        lea     rax, [rip + .L__unnamed_3]
        mov     edx, 1
        ret
.LBB0_3:
        lea     rax, [rip + .L__unnamed_4]
        mov     edx, 1
        ret
.LBB0_4:
        lea     rax, [rip + .L__unnamed_5]
        mov     edx, 1
        ret
.LBB0_5:
        lea     rax, [rip + .L__unnamed_6]
        mov     edx, 1
        ret
.LBB0_6:
        lea     rax, [rip + .L__unnamed_7]
        mov     edx, 1
        ret
.LBB0_7:
        lea     rax, [rip + .L__unnamed_8]
        mov     edx, 1
        ret
.LBB0_8:
        lea     rax, [rip + .L__unnamed_9]
        mov     edx, 1
        ret
.LBB0_9:
        lea     rax, [rip + .L__unnamed_10]
        mov     edx, 1
        ret
.LBB0_10:
        lea     rax, [rip + .L__unnamed_11]
        mov     edx, 1
        ret
.LBB0_11:
        lea     rax, [rip + .L__unnamed_12]
.LBB0_12:
        mov     edx, 1
        ret
.LJTI0_0:
        .long   .LBB0_12-.LJTI0_0
        .long   .LBB0_1-.LJTI0_0
        .long   .LBB0_2-.LJTI0_0
        .long   .LBB0_3-.LJTI0_0
        .long   .LBB0_4-.LJTI0_0
        .long   .LBB0_5-.LJTI0_0
        .long   .LBB0_6-.LJTI0_0
        .long   .LBB0_7-.LJTI0_0
        .long   .LBB0_8-.LJTI0_0
        .long   .LBB0_9-.LJTI0_0
        .long   .LBB0_10-.LJTI0_0
        .long   .LBB0_11-.LJTI0_0

.L__unnamed_12:
        .byte   76

.L__unnamed_11:
        .byte   75

.L__unnamed_10:
        .byte   74

.L__unnamed_9:
        .byte   73

.L__unnamed_8:
        .byte   72

.L__unnamed_7:
        .byte   71

.L__unnamed_6:
        .byte   70

.L__unnamed_5:
        .byte   69

.L__unnamed_4:
        .byte   68

.L__unnamed_3:
        .byte   67

.L__unnamed_2:
        .byte   66

.L__unnamed_1:
        .byte   65

Deriving Debug can cause poor codegen too: https://godbolt.org/z/xnexGxo8e

There's some discussion on Twitter from LLVM folks that suggests this would be best as an MIR optzn: https://twitter.com/pcwalton/status/1436036809603960835

glandium · 2021-09-09T21:00:23Z

It's worth noting that clang is doing better at equivalent C++ code. https://godbolt.org/z/zqT1z36es

pcwalton · 2021-09-09T22:17:33Z

Wonder if this is a pass ordering issue then.

pcwalton · 2021-09-09T22:33:34Z

There's LLVM code that already tries to do this. I guess it's not running on Rust for some reason? https://llvm.org/doxygen/SimplifyCFG_8cpp_source.html#l05786

pcwalton · 2021-09-09T22:55:33Z

I think I have an LLVM fix. Will post a patch upstream.

…deriving Debug for unit-like enum variants. The intent here is to allow LLVM to remove the switch entirely in favor of an indexed load from a table of constant strings, which is likely what the programmer would write in C. Unfortunately, LLVM currently doesn't perform this optimization due to a bug, but there is [a patch](https://reviews.llvm.org/D109565) that fixes this issue. I've verified that, with that patch applied on top of this commit, Debug for unit-like tuple variants becomes a load, reducing the O(n) code bloat to O(1). Note that inlining `DebugTuple::finish()` wasn't enough to allow LLVM to optimize the code properly; I had to avoid the abstraction entirely. Not using the abstraction is likely better for compile time anyway. Part of rust-lang#88793.

…, r=oli-obk Introduce a fast path that avoids the `debug_tuple` abstraction when deriving Debug for unit-like enum variants. The intent here is to allow LLVM to remove the switch entirely in favor of an indexed load from a table of constant strings, which is likely what the programmer would write in C. Unfortunately, LLVM currently doesn't perform this optimization due to a bug, but there is [a patch](https://reviews.llvm.org/D109565) that fixes this issue. I've verified that, with that patch applied on top of this commit, Debug for unit-like tuple variants becomes a load, reducing the O(n) code bloat to O(1). Note that inlining `DebugTuple::finish()` wasn't enough to allow LLVM to optimize the code properly; I had to avoid the abstraction entirely. Not using the abstraction is likely better for compile time anyway. Part of rust-lang#88793. r? `@oli-obk`

mati865 · 2022-01-14T00:46:47Z

Should this issues be relabelled as A-LLVM?

pcwalton added A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Sep 9, 2021

pcwalton mentioned this issue Sep 10, 2021

Introduce a fast path that avoids the debug_tuple abstraction when deriving Debug for unit-like enum variants. #88832

Merged

saethlin mentioned this issue Jan 14, 2023

derive(Debug) on simple enums should emit a single call to Formatter::write_str #106875

Closed

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Missing MIR optimization: Replace matches with loads if possible #88793

Missing MIR optimization: Replace matches with loads if possible #88793

pcwalton commented Sep 9, 2021

glandium commented Sep 9, 2021 •

edited

Loading

pcwalton commented Sep 9, 2021

pcwalton commented Sep 9, 2021

pcwalton commented Sep 9, 2021

mati865 commented Jan 14, 2022

Missing MIR optimization: Replace matches with loads if possible #88793

Missing MIR optimization: Replace matches with loads if possible #88793

Comments

pcwalton commented Sep 9, 2021

glandium commented Sep 9, 2021 • edited Loading

pcwalton commented Sep 9, 2021

pcwalton commented Sep 9, 2021

pcwalton commented Sep 9, 2021

mati865 commented Jan 14, 2022

glandium commented Sep 9, 2021 •

edited

Loading