Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore ALWAYS_ABORT_FLAG in count_is_zero fast path #106197

Conversation

danielverkamp
Copy link

Based on the assumption that always_abort usage is uncommon, change the count_is_zero() fast path to only check if GLOBAL_PANIC_COUNT is not zero, ignoring ALWAYS_ABORT_FLAG. This improves code generation for the fast path, since it no longer has to mask off the flag first.

This has a small but potentially widespread impact on code quality and size, since panicking::count_is_zero() is inlined at each Mutex lock and unlock call site.


Small test case that shows the improvement:

pub fn test_func(mtx: &Mutex<u32>) {
    let mut guard = mtx.lock().unwrap();
    *guard += 0x42;
}

Code generation without this change (rustc 1.67.0-nightly (2585bce 2022-11-28) on x86_64 Linux (trimmed to just the fast path to highlight the changes):

    865c:       b9 01 00 00 00          mov    ecx,0x1
    8661:       31 c0                   xor    eax,eax
    8663:       f0 0f b1 0f             lock cmpxchg DWORD PTR [rdi],ecx
    8667:       75 4c                   jne    86b5 <rust_mutex::test_func+0x65>

/// lock panicking() check
    8669:       4c 8d 35 48 6a 04 00    lea    r14,[rip+0x46a48]        # 4f0b8 <std::panicking::panic_count::GLOBAL_PANIC_COUNT>
    8670:       49 8b 06                mov    rax,QWORD PTR [r14]
    8673:       48 c1 e0 01             shl    rax,0x1
    8677:       48 85 c0                test   rax,rax
    867a:       75 44                   jne    86c0 <rust_mutex::test_func+0x70>
\\\

    867c:       0f b6 43 04             movzx  eax,BYTE PTR [rbx+0x4]
    8680:       84 c0                   test   al,al
    8682:       0f 85 86 00 00 00       jne    870e <rust_mutex::test_func+0xbe>
    8688:       4c 8d 7b 04             lea    r15,[rbx+0x4]

    868c:       83 43 08 42             add    DWORD PTR [rbx+0x8],0x42

/// unlock panicking() check
    8690:       48 b8 ff ff ff ff ff    movabs rax,0x7fffffffffffffff
    8697:       ff ff 7f 
    869a:       49 8b 0e                mov    rcx,QWORD PTR [r14]
    869d:       48 85 c1                test   rcx,rax
    86a0:       75 4a                   jne    86ec <rust_mutex::test_func+0x9c>
\\\

The compiler backend is clever and optimizes the & !ALWAYS_ABORT_FLAG check into a shl ..., 1 to shift away the top bit in the lock fast path, but for some reason doesn't apply the same optimization in the unlock path (it emits a movabs instruction with a large 0x7fffffffffffffff immediate to mask off the flag bit instead). In either case, omitting the flag check avoids an instruction or two in the critical path.

With this change applied:

    877c:       b9 01 00 00 00          mov    ecx,0x1
    8781:       31 c0                   xor    eax,eax
    8783:       f0 0f b1 0f             lock cmpxchg DWORD PTR [rdi],ecx
    8787:       75 3a                   jne    87c3 <rust_mutex::test_func+0x53>

/// lock panicking() check
    8789:       4c 8d 35 a8 c8 04 00    lea    r14,[rip+0x4c8a8]        # 55038 <std::panicking::panic_count::GLOBAL_PANIC_COUNT>
    8790:       49 8b 06                mov    rax,QWORD PTR [r14]
    8793:       48 85 c0                test   rax,rax
    8796:       75 43                   jne    87db <rust_mutex::test_func+0x6b>
\\\

    8798:       0f b6 43 04             movzx  eax,BYTE PTR [rbx+0x4]
    879c:       84 c0                   test   al,al
    879e:       75 7f                   jne    881f <rust_mutex::test_func+0xaf>
    87a0:       4c 8d 7b 04             lea    r15,[rbx+0x4]

    87a4:       83 43 08 42             add    DWORD PTR [rbx+0x8],0x42

/// unlock panicking() check
    87a8:       49 8b 06                mov    rax,QWORD PTR [r14]
    87ab:       48 85 c0                test   rax,rax
    87ae:       75 4d                   jne    87fd <rust_mutex::test_func+0x8d>
\\\

With this patch applied, the inlined panicking() function's fast path is now a load + test reg, reg + conditional jump in both the lock and unlock path.


My assumption is that use of panic::always_abort() (#84438) is uncommon and performance is not as important in cases where it has been set; if that's not accurate, maybe this change is not a good idea, since it pessimizes the always_abort case (making it always read the thread-local panic count instead).

Based on the assumption that always_abort usage is uncommon, change the
count_is_zero() fast path to only check if GLOBAL_PANIC_COUNT is not
zero, ignoring ALWAYS_ABORT_FLAG. This improves code generation for the
fast path, since it no longer has to mask off the flag first.

This has a small but potentially widespread impact on code quality and
size, since panicking::count_is_zero() is inlined at each Mutex lock and
unlock call site.
@rustbot
Copy link
Collaborator

rustbot commented Dec 27, 2022

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @m-ou-se (or someone else) soon.

Please see the contribution instructions for more information.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 27, 2022
@rustbot
Copy link
Collaborator

rustbot commented Dec 27, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@saethlin
Copy link
Member

cc @thomcc because I know we were talking about panic/aborting code paths earlier

Copy link
Member

@m-ou-se m-ou-se left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about the always abort flag a bit earlier in the file says:

Accessing LOCAL_PANIC_COUNT in a child created by libc::fork would lead to a memory allocation. Only GLOBAL_PANIC_COUNT can be accessed in this situation.

This PR would cause LOCAL_PANIC_COUNT to be accessed even when the always abort flag has been set.

@m-ou-se m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 28, 2022
@danielverkamp
Copy link
Author

Ah, that makes sense, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants