Add more optimizations for (https://github.com/dotnet/runtime/issues/61412) #74806

Merged (7 commits) on Oct 11, 2022

En3Tho
Contributor

@En3Tho En3Tho commented Aug 30, 2022

Closes #61412
Enhances #73120 by transforming (X & 1) == 0 into ((NOT X) & 1), in addition to transforming (X & 1) != 0 into (X & 1).

The == 1 and != 1 cases are supported too, since #73120 transforms them into comparisons against 0.

Please correct me as I'm a newbie.
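
The rewrites described above can be checked with a small standalone sketch. This is not JIT code: the function names (`lowBit`, `invertedLowBit`, `rewritesAgree`, `checkSamples`) are made up for illustration, and the sketch only shows that each comparison form yields the same 0/1 value as its comparison-free replacement.

```cpp
#include <cassert>
#include <cstdint>

// Original forms, written as plain comparisons.
bool equal0(int32_t x)    { return (x & 1) == 0; }
bool notEqual0(int32_t x) { return (x & 1) != 0; }
bool equal1(int32_t x)    { return (x & 1) == 1; }
bool notEqual1(int32_t x) { return (x & 1) != 1; }

// Rewritten forms (hypothetical names): only bit operations remain.
int32_t lowBit(int32_t x)         { return x & 1; }    // replaces != 0 and == 1
int32_t invertedLowBit(int32_t x) { return (~x) & 1; } // replaces == 0 and != 1

// Returns true when every original form agrees with its rewritten form for x.
bool rewritesAgree(int32_t x)
{
    return static_cast<int32_t>(equal0(x))    == invertedLowBit(x)
        && static_cast<int32_t>(notEqual1(x)) == invertedLowBit(x)
        && static_cast<int32_t>(notEqual0(x)) == lowBit(x)
        && static_cast<int32_t>(equal1(x))    == lowBit(x);
}

// Spot-checks a few boundary values; call this from a main() of your own.
void checkSamples()
{
    const int32_t samples[] = {0, 1, -1, 2, INT32_MAX, INT32_MIN};
    for (int32_t x : samples)
    {
        assert(rewritesAgree(x));
    }
}
```

The == 1 and != 1 forms map onto the same two shapes because (x & 1) can only be 0 or 1, so comparing against 1 is the negation of comparing against 0.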

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 30, 2022
@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Aug 30, 2022
@ghost

ghost commented Aug 30, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: En3Tho
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Sep 1, 2022
@En3Tho
Contributor Author

En3Tho commented Sep 14, 2022

using System.Runtime.CompilerServices;

public class Issue61412
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool Equal0(int x) => (x & 1) == 0;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool Equal1(int x) => (x & 1) == 1;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool NotEqual1(int x) => (x & 1) != 1;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool NotEqual0(int x) => (x & 1) != 0;
}
; Assembly listing for method Issue61412:Equal0(int):bool
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  3,  3   )     int  ->  rcx         single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M14579_IG01:              ;; offset=0000H
						;; size=0 bbWeight=1    PerfScore 0.00
G_M14579_IG02:              ;; offset=0000H
       8BC1                 mov      eax, ecx
       F7D0                 not      eax
       83E001               and      eax, 1
						;; size=7 bbWeight=1    PerfScore 0.75
G_M14579_IG03:              ;; offset=0007H
       C3                   ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 8, prolog size 0, PerfScore 2.55, instruction count 4, allocated bytes for code 8 (MethodHash=70dec70c) for method Issue61412:Equal0(int):bool
; ============================================================

; Assembly listing for method Issue61412:Equal1(int):bool
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  3,  3   )     int  ->  rcx         single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M54258_IG01:              ;; offset=0000H
						;; size=0 bbWeight=1    PerfScore 0.00
G_M54258_IG02:              ;; offset=0000H
       8BC1                 mov      eax, ecx
       83E001               and      eax, 1
						;; size=5 bbWeight=1    PerfScore 0.50
G_M54258_IG03:              ;; offset=0005H
       C3                   ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 6, prolog size 0, PerfScore 2.10, instruction count 3, allocated bytes for code 6 (MethodHash=b52d2c0d) for method Issue61412:Equal1(int):bool
; ============================================================

; Assembly listing for method Issue61412:NotEqual1(int):bool
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  3,  3   )     int  ->  rcx         single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M3143_IG01:              ;; offset=0000H
						;; size=0 bbWeight=1    PerfScore 0.00
G_M3143_IG02:              ;; offset=0000H
       8BC1                 mov      eax, ecx
       F7D0                 not      eax
       83E001               and      eax, 1
						;; size=7 bbWeight=1    PerfScore 0.75
G_M3143_IG03:              ;; offset=0007H
       C3                   ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 8, prolog size 0, PerfScore 2.55, instruction count 4, allocated bytes for code 8 (MethodHash=b798f3b8) for method Issue61412:NotEqual1(int):bool
; ============================================================

; Assembly listing for method Issue61412:NotEqual0(int):bool
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  3,  3   )     int  ->  rcx         single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M35142_IG01:              ;; offset=0000H
						;; size=0 bbWeight=1    PerfScore 0.00
G_M35142_IG02:              ;; offset=0000H
       8BC1                 mov      eax, ecx
       83E001               and      eax, 1
						;; size=5 bbWeight=1    PerfScore 0.50
G_M35142_IG03:              ;; offset=0005H
       C3                   ret      
						;; size=1 bbWeight=1    PerfScore 1.00

; Total bytes of code 6, prolog size 0, PerfScore 2.10, instruction count 3, allocated bytes for code 6 (MethodHash=94c276b9) for method Issue61412:NotEqual0(int):bool
; ============================================================
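
As a reading aid for the listings above: the totals are consistent with adding up the per-block PerfScore values and then adding 0.1 per byte of code. That relationship is inferred from the numbers in these four listings only, not taken from the JIT sources, and the helper names below are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>

// Hypothetical model: sum of per-block scores plus 0.1 per byte of code.
// Inferred from the listings above; not the JIT's actual implementation.
double modelTotalPerfScore(std::initializer_list<double> blockScores, int codeBytes)
{
    double total = 0.0;
    for (double score : blockScores)
    {
        total += score;
    }
    return total + 0.1 * codeBytes;
}

// Floating-point comparison with a small tolerance.
bool approxEqual(double a, double b)
{
    return std::fabs(a - b) < 1e-9;
}

// Reproduces the two distinct totals that appear in the listings.
void checkListingTotals()
{
    // Equal0 / NotEqual1: mov + not + and (7 bytes), then ret (1 byte).
    assert(approxEqual(modelTotalPerfScore({0.00, 0.75, 1.00}, 8), 2.55));
    // Equal1 / NotEqual0: mov + and (5 bytes), then ret (1 byte).
    assert(approxEqual(modelTotalPerfScore({0.00, 0.50, 1.00}, 6), 2.10));
}
```

Under this model, the extra `not` instruction accounts for the whole difference between the two shapes: 2 more bytes and 0.25 more block score.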

@En3Tho
Contributor Author

En3Tho commented Sep 23, 2022

Can someone review this please?

Member

@EgorBo EgorBo left a comment

LGTM. The optimization seems quite conservative about the surrounding context, but it was like that before your changes.

@AndyAyersMS
Member

Any idea why we don't see arm64 diffs? Is this already handled there?

@En3Tho
Contributor Author

En3Tho commented Sep 23, 2022

Thanks!

I guess it's because of this check? I can't say for sure.

GenTree* Lowering::OptimizeConstCompare(GenTree* cmp)
{
    assert(cmp->gtGetOp2()->IsIntegralConst());

#if defined(TARGET_XARCH) || defined(TARGET_ARM64)
    GenTree*       op1      = cmp->gtGetOp1();
    GenTreeIntCon* op2      = cmp->gtGetOp2()->AsIntCon();
    ssize_t        op2Value = op2->IconValue();

#ifdef TARGET_ARM64 // <---
    // Do not optimise further if op1 has a contained chain.
    if (op1->OperIs(GT_AND) &&
        (op1->gtGetOp1()->isContainedAndNotIntOrIImmed() || op1->gtGetOp2()->isContainedAndNotIntOrIImmed()))
    {
        return cmp;
    }
#endif
///...
}
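
To make the effect of that guard easier to follow, here is a toy model of the early return. It assumes nothing about the real JIT types: `ToyNode`, `Oper`, and `guardSkipsOptimization` are invented for this sketch, and the `containedNonImmediate` flag merely stands in for what `isContainedAndNotIntOrIImmed()` reports.

```cpp
#include <cassert>

// Toy stand-ins for JIT concepts; these are NOT the real GenTree types.
enum class Oper { And, Other };

struct ToyNode
{
    Oper           oper                  = Oper::Other;
    bool           containedNonImmediate = false; // models isContainedAndNotIntOrIImmed()
    const ToyNode* op1                   = nullptr;
    const ToyNode* op2                   = nullptr;
};

// Returns true when the guard would return early and leave the compare unchanged.
// cmpOp1 plays the role of the compare's first operand.
bool guardSkipsOptimization(const ToyNode& cmpOp1, bool targetIsArm64)
{
    // The guard is compiled in only for ARM64 and looks only at AND operands.
    if (!targetIsArm64 || cmpOp1.oper != Oper::And)
    {
        return false;
    }
    const bool left  = cmpOp1.op1 != nullptr && cmpOp1.op1->containedNonImmediate;
    const bool right = cmpOp1.op2 != nullptr && cmpOp1.op2->containedNonImmediate;
    return left || right;
}

// Exercises the three interesting cases; call this from a main() of your own.
void checkGuardModel()
{
    ToyNode plain;
    ToyNode contained;
    contained.containedNonImmediate = true;

    ToyNode andNode;
    andNode.oper = Oper::And;
    andNode.op1  = &plain;
    andNode.op2  = &contained;

    assert(guardSkipsOptimization(andNode, true));   // ARM64, contained operand: bail out
    assert(!guardSkipsOptimization(andNode, false)); // other targets: guard not present

    andNode.op2 = &plain;
    assert(!guardSkipsOptimization(andNode, true));  // ARM64, nothing contained: continue
}
```

In this model the guard fires only on ARM64 and only when the compare's first operand is an AND with a contained, non-immediate operand, which is the condition the snippet above tests before any further rewriting.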

@EgorBo
Member

EgorBo commented Sep 23, 2022

@En3Tho oh, interesting. If you want, you can remove that ifdef so we can see SPMI diffs on CI as part of this PR.

@En3Tho
Contributor Author

En3Tho commented Sep 23, 2022

@EgorBo Sure. Let's see what will break :D

@En3Tho
Contributor Author

En3Tho commented Sep 24, 2022

One of the failures is #76041. I'm not sure what those "Push work item to Helix" failures mean. Is that purely a CI problem? Also, should SPMI for arm be triggered manually? Am I just missing the arm results, or are there none?

UPD: arm has regressed, so I'm restoring that check.

@EgorBo EgorBo merged commit 3e9df90 into dotnet:main Oct 11, 2022
@EgorBo
Member

EgorBo commented Oct 11, 2022

@En3Tho thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Nov 10, 2022
Successfully merging this pull request may close these issues.

JIT: Optimize "X & 1 == 0" to "X & 1"
4 participants