Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT could further optimize "== 0" checks in Boolean logic #13573

Closed
GrabYourPitchforks opened this issue Oct 11, 2019 · 11 comments · Fixed by #49548
Closed

JIT could further optimize "== 0" checks in Boolean logic #13573

GrabYourPitchforks opened this issue Oct 11, 2019 · 11 comments · Fixed by #49548
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Milestone

Comments

@GrabYourPitchforks
Copy link
Member

There are some well-known bit manipulation patterns that the JIT doesn't currently optimize. For instance:

public static bool AreZero(int x, int y) {
    return x == 0 && y == 0;
}

Produces this codegen:

AreZero(Int32, Int32)
    L0000: test ecx, ecx
    L0002: jnz L000d
    L0004: test edx, edx
    L0006: setz al
    L0009: movzx eax, al
    L000c: ret
    L000d: xor eax, eax
    L000f: ret

But it could produce this more optimal codegen, folding the two comparisons together into a single or and zero check:

AreZero(Int32, Int32)
    L0000: or edx, ecx
    L0002: setz al
    L0005: movzx eax, al
    L0008: ret

category:cq
theme:basic-cq
skill-level:beginner
cost:small

@EgorBo
Copy link
Member

EgorBo commented Oct 11, 2019

Interesting, 3 (and more) arguments in AreZero actually use this optimization:

private static bool AreZero(int x, int y, int z)
{
    return x == 0 && y == 0 && z == 0;
}

Codegen:

; Method ConsoleApp2.Program:AreZero(int,int,int):bool
G_M22205_IG01:
G_M22205_IG02:
       or       edx, ecx
       jne      SHORT G_M22205_IG05
G_M22205_IG03:		;; bbWeight=0.50
       test     r8d, r8d
       sete     al
       movzx    rax, al
G_M22205_IG04:		;; bbWeight=0.50
       ret      
G_M22205_IG05:		;; bbWeight=0.50
       xor      eax, eax
G_M22205_IG06:		;; bbWeight=0.50
       ret      
; Total bytes of code: 17

(I suspect it happens in Compiler::optOptimizeBools)

@EgorBo
Copy link
Member

EgorBo commented Oct 11, 2019

Hack:

private static bool AreZero(int x, int y)
{
    return x == 0 && y == 0 && BitConverter.IsLittleEndian;
}

where BitConverter.IsLittleEndian is a runtime true constant so Roslyn won't remove it.

Codegen:

; Method ConsoleApp2.Program:AreZero(int,int):bool
G_M22205_IG01:
G_M22205_IG02:
       or       edx, ecx
       jne      SHORT G_M22205_IG05
G_M22205_IG03:		;; bbWeight=0.50
       mov      eax, 1
G_M22205_IG04:		;; bbWeight=0.50
       ret      
G_M22205_IG05:		;; bbWeight=0.50
       xor      eax, eax
G_M22205_IG06:		;; bbWeight=0.50
       ret      
; Total bytes of code: 13

😄 sharplab.io

@GrabYourPitchforks
Copy link
Member Author

Similarly:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static bool IsEitherNegative(int a, int b)
{
    return a < 0 || b < 0;
}

Current codegen:

    L0000: test ecx, ecx
    L0002: jl L000d
    L0004: test edx, edx
    L0006: setl al
    L0009: movzx eax, al
    L000c: ret
    L000d: mov eax, 0x1
    L0012: ret

Possible optimal codegen below, using the fact that the or instruction sets SF:

or ecx, edx
sets al   ; or setl, if you prefer, since "or" is known to clear OF
movzx eax, al
ret

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@JulieLeeMSFT JulieLeeMSFT self-assigned this Jan 8, 2021
@JulieLeeMSFT JulieLeeMSFT modified the milestones: Future, 6.0.0 Jan 12, 2021
@JulieLeeMSFT JulieLeeMSFT removed the JitUntriaged CLR JIT issues needing additional triage label Jan 12, 2021
@JulieLeeMSFT
Copy link
Member

JulieLeeMSFT commented Jan 13, 2021

Reproduced

  • AreZero(int x, int y)
  • AreZero(int x, int y, int z), and
  • AreZero(int x, int y) with the 3rd operand, BitConverter.IsLittleEndian.

will check where the optimization happens.

@JulieLeeMSFT
Copy link
Member

JulieLeeMSFT commented Jan 26, 2021

  • Compiler::optOptimizeBools optimizes to OR only if both the 1st basic block (BB) and the 2nd BB are conditional jump (BBJ_COND).
  • For the two argument case, the 2nb BB is RETURN statement with comparison, thus it is not optimized.
STMT00002 (IL 0x003...  ???)
               [000009] ------------              *  RETURN    int   
               [000008] ------------              \--*  EQ        int   
               [000006] ------------                 +--*  LCL_VAR   int    V01 arg1         
               [000007] ------------                 \--*  CNS_INT   int    0

Plan:

  • Modify the method to add pattern matches for different cases.

@JulieLeeMSFT
Copy link
Member

JulieLeeMSFT commented Feb 13, 2021

I was able to modify the code to generate the optimized code for the two argument case.

G_M56891_IG01:
G_M56891_IG02:
       or       edx, ecx
       sete     al
       movzx    rax, al
G_M56891_IG03:
       ret

; Total bytes of code 9, prolog size 0, PerfScore 3.40, instruction count 4, allocated bytes for code 9 (MethodHash=d88b21c4) for method Test:AreZero(int,int):bool

will run diffs and tests.

@JulieLeeMSFT
Copy link
Member

jit-diff shows 1440 methods improvement.

Top file improvements (bytes):
     -360652 : System.Net.Http.dasm (-52.74% of base)
         -18 : Microsoft.CSharp.dasm (-0.00% of base)
         -18 : System.ComponentModel.TypeConverter.dasm (-0.01% of base)
         -16 : System.Private.Xml.Linq.dasm (-0.01% of base)
         -15 : Microsoft.Extensions.Primitives.dasm (-0.06% of base)

5 total files with Code Size differences (5 improved, 0 regressed), 266 unchanged.

Top method improvements (bytes):
      -17401 (-100.00% of base) : System.Net.Http.dasm - QPackStaticTable:.cctor() (1 base, 0 diff methods)
      -16753 (-100.00% of base) : System.Net.Http.dasm - KnownHeaders:.cctor() (1 base, 0 diff methods)
       -6654 (-100.00% of base) : System.Net.Http.dasm - <GetHttp2ConnectionAsync>d__68:MoveNext():this (1 base, 0 diff methods)
       -5796 (-100.00% of base) : System.Net.Http.dasm - <SendAsync>d__27:MoveNext():this (1 base, 0 diff methods)
       -5211 (-100.00% of base) : System.Net.Http.dasm - H2StaticTable:.cctor() (1 base, 0 diff methods)
       -4775 (-100.00% of base) : System.Net.Http.dasm - <ProcessServerStreamAsync>d__49:MoveNext():this (1 base, 0 diff methods)
       -4427 (-100.00% of base) : System.Net.Http.dasm - <ProcessServerControlStreamAsync>d__50:MoveNext():this (1 base, 0 diff methods)
       -4013 (-100.00% of base) : System.Net.Http.dasm - <ProcessIncomingFramesAsync>d__49:MoveNext():this (1 base, 0 diff methods)
       -3989 (-100.00% of base) : System.Net.Http.dasm - <SendWithRetryAsync>d__72:MoveNext():this (1 base, 0 diff methods)
       -3585 (-100.00% of base) : System.Net.Http.dasm - <ConnectAsync>d__82:MoveNext():this (1 base, 0 diff methods)
       -3584 (-100.00% of base) : System.Net.Http.dasm - <SendHeadersAsync>d__83:MoveNext():this (1 base, 0 diff methods)
       -3128 (-100.00% of base) : System.Net.Http.dasm - <SendRequestBodyAsync>d__37:MoveNext():this (1 base, 0 diff methods)
       -3118 (-100.00% of base) : System.Net.Http.dasm - <SendAsync>d__4:MoveNext():this (1 base, 0 diff methods)
       -3037 (-100.00% of base) : System.Net.Http.dasm - <SendAsync>d__37:MoveNext():this (1 base, 0 diff methods)
       -2893 (-100.00% of base) : System.Net.Http.dasm - GenericHeaderParser:.cctor() (1 base, 0 diff methods)
       -2809 (-100.00% of base) : System.Net.Http.dasm - <SendContentAsync>d__29:MoveNext():this (1 base, 0 diff methods)
       -2685 (-100.00% of base) : System.Net.Http.dasm - <SendAsyncCore>d__5:MoveNext():this (1 base, 0 diff methods)
       -2618 (-100.00% of base) : System.Net.Http.dasm - <ConnectToTcpHostAsync>d__83:MoveNext():this (1 base, 0 diff methods)
       -2605 (-100.00% of base) : System.Net.Http.dasm - <ReadFrameEnvelopeAsync>d__42:MoveNext():this (1 base, 0 diff methods)
       -2588 (-100.00% of base) : System.Net.Http.dasm - <GetHttp3ConnectionAsync>d__71:MoveNext():this (1 base, 0 diff methods)

Top method improvements (percentages):
         -32 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:Dispose(bool):this (1 base, 0 diff methods)
          -3 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:get_CanRead():bool:this (1 base, 0 diff methods)
         -12 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:get_CanWrite():bool:this (1 base, 0 diff methods)
         -40 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:Read(Span`1):int:this (1 base, 0 diff methods)
         -40 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:ReadAsync(Memory`1,CancellationToken):ValueTask`1:this (1 base, 0 diff methods)
        -239 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:WriteAsync(ReadOnlyMemory`1,CancellationToken):ValueTask:this (1 base, 0 diff methods)
         -81 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:FlushAsync(CancellationToken):Task:this (1 base, 0 diff methods)
         -11 (-100.00% of base) : System.Net.Http.dasm - Http2WriteStream:.ctor(Http2Stream):this (1 base, 0 diff methods)
       -3128 (-100.00% of base) : System.Net.Http.dasm - <SendRequestBodyAsync>d__37:MoveNext():this (1 base, 0 diff methods)
         -27 (-100.00% of base) : System.Net.Http.dasm - <SendRequestBodyAsync>d__37:SetStateMachine(IAsyncStateMachine):this (1 base, 0 diff methods)
         -71 (-100.00% of base) : System.Net.Http.dasm - <>c:<WaitFor100ContinueAsync>b__38_0(Object):this (1 base, 0 diff methods)
        -171 (-100.00% of base) : System.Net.Http.dasm - <>c:<WaitFor100ContinueAsync>b__38_1(Object):this (1 base, 0 diff methods)
         -61 (-100.00% of base) : System.Net.Http.dasm - <>c:<RegisterRequestBodyCancellation>b__74_0(Object):this (1 base, 0 diff methods)
        -325 (-100.00% of base) : System.Net.Http.dasm - <>c:<WaitForDataAsync>b__79_0(Object,CancellationToken):this (1 base, 0 diff methods)
       -1979 (-100.00% of base) : System.Net.Http.dasm - <WaitFor100ContinueAsync>d__38:MoveNext():this (1 base, 0 diff methods)
         -27 (-100.00% of base) : System.Net.Http.dasm - <WaitFor100ContinueAsync>d__38:SetStateMachine(IAsyncStateMachine):this (1 base, 0 diff methods)
       -1112 (-100.00% of base) : System.Net.Http.dasm - <ReadResponseHeadersAsync>d__64:MoveNext():this (1 base, 0 diff methods)
         -27 (-100.00% of base) : System.Net.Http.dasm - <ReadResponseHeadersAsync>d__64:SetStateMachine(IAsyncStateMachine):this (1 base, 0 diff methods)
       -1069 (-100.00% of base) : System.Net.Http.dasm - <ReadDataAsync>d__68:MoveNext():this (1 base, 0 diff methods)
         -27 (-100.00% of base) : System.Net.Http.dasm - <ReadDataAsync>d__68:SetStateMachine(IAsyncStateMachine):this (1 base, 0 diff methods)

1440 total methods with Code Size differences (1440 improved, 0 regressed), 258579 unchanged.

@AndyAyersMS
Copy link
Member

Unfortunately I think this may be telling you something has gone wrong -- for a change like this we would expect baseline and new jit to process the exact same number of methods.

Output like this means the new jit did not jit (or failed while trying to jit) some methods.

      -17401 (-100.00% of base) : System.Net.Http.dasm - QPackStaticTable:.cctor() (1 base, 0 diff methods)

Was there a message printed before he output you show above about "reconciling"?

@JulieLeeMSFT
Copy link
Member

@AndyAyersMS you are right. There were 2 base and 3 diff error messages before this, and I was looking into it. Will ping you.

@JulieLeeMSFT
Copy link
Member

jit-diff error message

Beginning PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies
- Finished 201/271 Base 0/271 Diff [200.0 sec]Error running C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe on C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Private.CoreLib.dll
1 errors compiling set.
Dasm command "jit-dasm-pmi --platform C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root --corerun C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe --jit C:\runtime2\artifacts\bin\coreclr\Windows.x64.Checked\clrjit.dll --nocopy --output c:\diffs\dasmset_13\base C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Private.CoreLib.dll" returned with 1 failures
\ Finished 270/271 Base 0/271 Diff [281.7 sec]Error running C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe on C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\mscorlib.dll
1 errors compiling set.
Dasm command "jit-dasm-pmi --platform C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root --corerun C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe --jit C:\runtime2\artifacts\bin\coreclr\Windows.x64.Checked\clrjit.dll --nocopy --output c:\diffs\dasmset_13\base C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\mscorlib.dll" returned with 1 failures
\ Finished 271/271 Base 2/271 Diff [295.3 sec]Error running C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe on C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Net.Http.dll
1 errors compiling set.
Dasm command "jit-dasm-pmi --platform C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root --corerun C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe --jit C:\runtime\artifacts\bin\coreclr\Windows.x64.Checked\clrjit.dll --nocopy --output c:\diffs\dasmset_13\diff C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Net.Http.dll" returned with 1 failures
\ Finished 271/271 Base 195/271 Diff [479.7 sec]Error running C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe on C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Private.CoreLib.dll
1 errors compiling set.
Dasm command "jit-dasm-pmi --platform C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root --corerun C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe --jit C:\runtime\artifacts\bin\coreclr\Windows.x64.Checked\clrjit.dll --nocopy --output c:\diffs\dasmset_13\diff C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Private.CoreLib.dll" returned with 1 failures
/ Finished 271/271 Base 270/271 Diff [562.0 sec]Error running C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe on C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\mscorlib.dll
1 errors compiling set.
Dasm command "jit-dasm-pmi --platform C:\runtime\artifacts\te| Finished 271/271 Base 270/271 Diff [562.1 sec]sts\coreclr\Windows.x64.Release\Tests\Core_Root --corerun C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe --jit C:\runtime\artifacts\bin\coreclr\Windows.x64.Checked\clrjit.dll --nocopy --output c:\diffs\dasmset_13\diff C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\mscorlib.dll" returned with 1 failures
\ Finished 271/271 Base 271/271 Diff [562.1 sec]
Dasm commands returned 2 base failures, 3 diff failures.
Completed PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies in 566.93s

This is the error message when running one of them, but I cannot relate it with my code change.

c:\JitBench\jitutils>C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\corerun.exe C:\runtime\artifacts\tests\coreclr\Windows.x64.Release\Tests\Core_Root\System.Private.CoreLib.dll
Unhandled exception. System.MissingMethodException: Entry point not found in assembly 'System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e'.

@JulieLeeMSFT
Copy link
Member

Reproduced the issue. Debugging.

In Compiler::optIsBoolComp the tree gtType was incorrectly set to TYP_VOID.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 12, 2021
JulieLeeMSFT added a commit that referenced this issue Jul 14, 2021
* Equals to 0 optimization in Boolean logic

* Limit bool optimization to Integral return type only

* Use the updated flowList:setEdgeWeights method with the 3rd parameter

* Skip bool optimization for cases that require NOT transformation

* Skip bool optimization when the third block GT_RETURN is not CNT_INT int

* format patch

* Added more bool optimization cases

* format patch

* Refactored setting fold type and comparison type to fix jitstress error

* format patch

* Refactored common codes for conditional block and return block boolean optimizations

* format patch

* Unit test changed to remove EH handling and add return value checks

* Unit test: add back test cases for ANDing and NE cases

* Made OptBoolsDsc struct to pass it off to the helper methods.

* format patch

* Changed to substructure OptTestInfo within OptBoolsDisc

* Cleaned up tree variables in OptBoolsDsc struct

* Moved some methods for Boolean Optimization to OptBoolsDsc struct

* Moved all private methods for Boolean Optimization to OptBoolsDsc struct

* Boolean Optimization: Handled code review feedback

* Optimize bools: hoisted jump destination check to optOptimizeBools() and added test cases

* format patch

* Moved initialization to OptBoolsDsc constructor

* format patch

Co-authored-by: Julie Lee <jeonlee@microsoft.com>
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 14, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants