Performance regression with niche optimization #101872
Comments
@rustbot label +T-compiler +regression-untriaged
#102035 is a related example, where increased use of niche-filling causes some small but widespread instruction count regressions.
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-medium
A little more data: I did some local testing using the "Self profile" mode of the The results I get locally make a lot more sense to me. The linked data shows the
In addition to #102872, I have in mind two other optimizations for switching on a discriminant: First, rather than this llvmir-like pseudocode:
we could instead do this:
That is, replacing a conditional move with a jump. I am fairly certain this would be a win: it would allow us to skip the tag calculations whenever we have the untagged variant, it would let us remove the cmov, which has data dependencies on many of the other instructions, and the new jump is at least as predictable as the existing one. I don't think it's possible to make this happen just by modifying the current
The second optimization I have in mind would need to happen in LLVM.
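The pseudocode from the comment above did not survive in this copy of the thread. As a rough illustration of the idea only, and not the commenter's actual code, here is a Rust sketch contrasting a "compute everything, then select" decode with a "check for the untagged variant first, then compute" decode. The constants and function names are hypothetical, and the Rust source does not by itself determine whether the compiler emits a cmov or a jump; the sketch only shows that both orderings produce the same discriminant.

```rust
// Hypothetical layout: variant 0 is the untagged (data-carrying) variant,
// whose payload byte only uses values 0..=2. Variants 1 and 2 are stored
// in the niche values 3 and 4.
const NICHE_START: u8 = 3;
const NICHE_VARIANT_COUNT: u8 = 2;
const FIRST_NICHE_VARIANT: u8 = 1;
const UNTAGGED_VARIANT: u8 = 0;

/// Select style: the niche discriminant is always computed, and the
/// result is chosen afterwards (the shape that tends to become a cmov).
fn discr_select(tag: u8) -> u8 {
    let relative = tag.wrapping_sub(NICHE_START);
    let niche_discr = relative.wrapping_add(FIRST_NICHE_VARIANT);
    if relative < NICHE_VARIANT_COUNT {
        niche_discr
    } else {
        UNTAGGED_VARIANT
    }
}

/// Branch-first style: return early for the untagged variant and only
/// do the remaining arithmetic when the tag really is a niche value.
fn discr_branch(tag: u8) -> u8 {
    let relative = tag.wrapping_sub(NICHE_START);
    if relative >= NICHE_VARIANT_COUNT {
        return UNTAGGED_VARIANT;
    }
    // relative is 0 or 1 here, so this addition cannot overflow.
    relative + FIRST_NICHE_VARIANT
}

fn main() {
    // Both decoders agree on every possible tag byte.
    for tag in 0u8..=255 {
        assert_eq!(discr_select(tag), discr_branch(tag));
    }
    println!("select and branch decoders agree on all 256 tags");
}
```

Whether the branch-first shape is actually faster depends on how predictable the branch is for a given workload, which is what the comment above argues about.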
I filed this LLVM issue, which if addressed would lead to better code in niche match statements in many cases.
In some cases we can avoid arithmetic before checking whether a niche represents an untagged variant. This is relevant to rust-lang#101872
rustc_codegen_ssa: Better code generation for niche discriminants. In some cases we can avoid arithmetic before checking whether a niche is a tag. Also rename some identifiers around niches. This is relevant to rust-lang#101872
@mikebenfield did #102872 solve this issue? Checking progress to unblock #102035. Thanks!
@apiraino Honestly I don’t know. It definitely improved some things, but it’s not clear to me which benchmarks are particularly relevant, so I’m not sure whether we can say this issue is solved.
After PR #94075 for more niche optimizations was merged, there were some performance regressions.
Some performance regressions are possibly just due to the extra arithmetic and branch or cmov required when getting a discriminant out of a tag. But notably when compiling syn, the regression is largely due to extra time in LLVM_lto_optimize. It would be nice to understand this better, and ideally do something about it.
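To make "extra arithmetic and branch or cmov" concrete, here is a simplified Rust model of decoding a discriminant from a niche-encoded tag. It is a sketch under stated assumptions: the `NicheLayout` struct and its field names are hypothetical, not rustc's internal types, and the real codegen works on LLVM IR rather than Rust source.

```rust
/// Hypothetical, simplified description of a niche-encoded enum.
/// Field names are illustrative and not taken from rustc.
#[derive(Clone, Copy)]
struct NicheLayout {
    /// First tag value that encodes a niche variant.
    niche_start: u8,
    /// Variant index of the first niche-encoded variant.
    first_niche_variant: u8,
    /// Number of variants stored in niche values.
    niche_variant_count: u8,
    /// Variant index of the data-carrying (untagged) variant.
    untagged_variant: u8,
}

/// Decode the variant index from a raw tag byte.
fn decode_discriminant(tag: u8, layout: NicheLayout) -> u8 {
    // Arithmetic: shift the tag so that niche values start at 0.
    let relative = tag.wrapping_sub(layout.niche_start);
    // Comparison: is the tag inside the niche range?
    let is_niche = relative < layout.niche_variant_count;
    // More arithmetic: map back to a variant index.
    let niche_discr = layout.first_niche_variant.wrapping_add(relative);
    // Select (cmov) or branch between the two possible results.
    if is_niche { niche_discr } else { layout.untagged_variant }
}

fn main() {
    // Model: variant 0 carries a payload byte with valid values 0..=2;
    // variants 1 and 2 are stored in the niche values 3 and 4.
    let layout = NicheLayout {
        niche_start: 3,
        first_niche_variant: 1,
        niche_variant_count: 2,
        untagged_variant: 0,
    };
    for tag in 0u8..=4 {
        println!("tag {tag} -> variant {}", decode_discriminant(tag, layout));
    }
}
```

An enum with a dedicated tag field can read the discriminant directly, whereas the niche encoding pays for the subtraction, comparison, addition, and select on each decode.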