[JIT] Optimization for left-shift operator #52297

DarkBullNull · 2021-05-05T13:18:15Z

Oh, time to learn how to use git :) Sorry for my "crooked PRs".
The operations "x << 2" and "x << 3" can be optimized: each becomes a single fast instruction.
============(x << 2)============
Before:

"MOV EAX, EDX"     // 89 D0
"SHL EAX, 2"       // C1 E0 02
"RET"              // C3

After:

"LEA EAX, [EDX*4]" // 8D 04 95 00 00 00 00
"RET"              // C3

============(x << 3)============
Before:

"MOV EAX, EDX"     // 89 D0
"SHL EAX, 3"       // C1 E0 03
"RET"              // C3

After:

"LEA EAX, [EDX*8]" // 8D 04 D5 00 00 00 00
"RET"              // C3

This should be faster than a normal left shift, because a single LEA replaces the MOV + SHL pair.

SPMI shows no improvements, only regressions. That is because "LEA REG, [REG*4]" takes up two bytes more than the MOV + SHL pair it replaces (7 bytes versus 5), but I expect the execution speed to compensate for this.
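As a sanity check on the equivalence this rewrite relies on, here is a minimal C# sketch. The class and method names are hypothetical and not part of the PR. It models the value a scaled-index LEA writes to a 32-bit destination (the index times the scale, truncated to 32 bits) and compares it with the shift.

```csharp
using System;

// Hypothetical helper, not part of the PR: models the value that
// "LEA EAX, [EDX*scale]" writes to a 32-bit destination register.
public static class LeaShiftSketch
{
    public static int ScaledIndex(int index, int scale)
    {
        // The x86 SIB byte can only encode these four scale factors.
        if (scale != 1 && scale != 2 && scale != 4 && scale != 8)
            throw new ArgumentOutOfRangeException(nameof(scale), "scale must be 1, 2, 4 or 8");

        // unchecked: keep only the low 32 bits, as the hardware does.
        return unchecked(index * scale);
    }

    // True when "x << shift" and the scaled-index form produce the same bits.
    public static bool Agree(int x, int shift)
    {
        return (x << shift) == ScaledIndex(x, 1 << shift);
    }
}
```

For example, `LeaShiftSketch.Agree(-1, 3)` and `LeaShiftSketch.Agree(int.MinValue, 2)` both hold, so the two forms agree on the result value even when bits are shifted out. The sketch says nothing about flags: SHL updates them and LEA does not.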

If DISPLAY_SIZES is defined as 1 in jit.h, the build gives an error.

"dataSize" is renamed to "eDataSize" because the name clashed between "compiler.h" and "codegenarm64.cpp".

AndyAyersMS · 2021-06-07T19:52:00Z

@DarkBullNull can you recheck diffs now that #53053 is done?

DarkBullNull · 2021-06-28T18:21:42Z

Sorry for my absence. I have a term paper due on the 30th, so I can't respond properly until then.

DarkBullNull · 2021-07-01T21:22:43Z

C#:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test1(int x)
{
	return ((x << 2) == 0) ? false : true;
}

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test2(int x, int y)
{
	return ((y << 2) == x) ? false : true;
}

Before all the changes:

Test1:
Asm: (14 bytes)
c1e202           shl     edx,2
7406             je      00007fff1363f63d
b801000000       mov     eax,1
c3               ret
33c0             xor     eax,eax
c3               ret

Test2:
Asm: (18 bytes)
41c1e002         shl     r8d,2
443bc2           cmp     r8d,edx
7406             je      00007fff13642f4f
b801000000       mov     eax,1
c3               ret
33c0             xor     eax,eax
c3               ret

After all the changes:

Test1:
Asm: (20 bytes)
8d149500000000   lea     edx,[rdx*4]
85d2             test    edx,edx
7406             je      00007ffef6648d01
b801000000       mov     eax,1
c3               ret
33c0             xor     eax,eax
c3               ret

Test2:
Asm: (22 bytes)
468d048500000000 lea     r8d,[r8*4]
443bc2           cmp     r8d,edx
7406             je      00007ffef664c163
b801000000       mov     eax,1
c3               ret
33c0             xor     eax,eax
c3               ret

jit-diffs result: https://gist.github.com/DarkBullNull/011acc7a68d8e055d0e24ff973c438fa
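The byte counts in the listings can be rechecked from the hex encodings. This is a small sketch with hypothetical names; it only assumes two hex digits per byte, as in the dumps above.

```csharp
using System;
using System.Linq;

// Hypothetical helper for re-adding the code sizes shown in the listings.
public static class CodeSizeSketch
{
    // Each argument is the hex encoding of one instruction, e.g. "c1e202".
    public static int TotalBytes(params string[] encodings)
    {
        foreach (string e in encodings)
        {
            if (e.Length % 2 != 0)
                throw new ArgumentException("hex encoding must have an even number of digits: " + e);
        }

        // Two hex digits encode one byte.
        return encodings.Sum(e => e.Length / 2);
    }
}
```

For Test1, `CodeSizeSketch.TotalBytes("c1e202", "7406", "b801000000", "c3", "33c0", "c3")` gives 14 for the shl version, and `CodeSizeSketch.TotalBytes("8d149500000000", "85d2", "7406", "b801000000", "c3", "33c0", "c3")` gives 20 for the lea version. The 6-byte growth is the 4 extra bytes of lea plus the 2-byte test.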

kunalspathak · 2021-07-01T21:28:34Z

Did you rebase your changes? With #53214, I am seeing the code below for Test1(). It doesn't have a test instruction after the shl.

G_M37926_IG01:

G_M37926_IG02:
       shl      edx, 2
       je       SHORT G_M37926_IG05

G_M37926_IG03:
       mov      eax, 1

G_M37926_IG04:
       ret

G_M37926_IG05:
       xor      eax, eax

G_M37926_IG06:
       ret

DarkBullNull · 2021-07-01T21:30:23Z

@kunalspathak Yes, sorry, I fixed it.

kunalspathak · 2021-07-01T21:34:42Z

Thanks! So I see that the code size regressed more (as expected) compared to before #53214. Could you also post perfscore diffs for various collections that you posted initially? This will give an idea if the code size regression is worth considering.

JulieLeeMSFT · 2021-08-31T13:19:06Z

> Thanks! So I see that the code size regressed more (as expected) compared to before #53214. Could you also post perfscore diffs for various collections that you posted initially? This will give an idea if the code size regression is worth considering.

Ping @DarkBullNull

kunalspathak · 2021-10-04T16:14:02Z

Ping again @DarkBullNull

kunalspathak · 2021-11-01T14:02:11Z

@DarkBullNull - let us know if you still want to pursue this?

DarkBullNull · 2021-11-02T11:05:04Z

With the optimization:
PerfScore is 1.75 for:

bool Test1(int x)
{
     return ((x << 2) == 0) ? false : true;
}

PerfScore is 1.75 for:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test2(int x, int y)
{
    return ((y << 2) == x) ? false : true;
}

Without the optimization:
PerfScore is 1.50 for:

bool Test1(int x)
{
     return ((x << 2) == 0) ? false : true;
}

PerfScore is 1.75 for:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test2(int x, int y)
{
    return ((y << 2) == x) ? false : true;
}

kunalspathak · 2021-11-02T16:58:07Z

Thanks @DarkBullNull ...I will take a look

kunalspathak · 2021-11-10T18:22:41Z

Do you mind rebasing your changes on latest main? I will then double check the diffs.

kunalspathak · 2021-11-16T06:55:15Z

/azp run runtime-coreclr superpmi-asmdiffs

azure-pipelines · 2021-11-16T06:55:34Z

Azure Pipelines successfully started running 1 pipeline(s).

kunalspathak · 2021-11-16T18:53:18Z

I inspected the asmdiffs coming out of this PR and, as you pointed out, the primary reason for the code size diff is that lea is a longer instruction. But there are also many places where we now emit a longer jump because of that.

https://www.diffchecker.com/Yqe33Cac

As such, I am not sure the savings we get from converting shl -> lea justify the longer-jump penalties. Do you have any measurements that show this optimization still has a benefit?
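To illustrate the longer-jump effect: an x86 conditional jump has a 2-byte short form with an 8-bit displacement and a 6-byte near form with a 32-bit displacement, so growing the code between a branch and its target can push the branch out of short range. Here is a minimal sketch with hypothetical names; the distances in the usage note are illustrative and not taken from the diffs.

```csharp
using System;

// Hypothetical helper, not JIT code: picks the encoding size of a
// conditional jump (Jcc) from its displacement.
public static class JumpSizeSketch
{
    // displacement is measured from the end of the jump instruction to the target.
    public static int JccSize(int displacement)
    {
        bool fitsInRel8 = displacement >= sbyte.MinValue && displacement <= sbyte.MaxValue;

        // Short form: 7x rel8 (2 bytes). Near form: 0F 8x rel32 (6 bytes).
        return fitsInRel8 ? 2 : 6;
    }
}
```

With a target 125 bytes ahead, `JumpSizeSketch.JccSize(125)` is 2. If one 3-byte shl in between becomes a 7-byte lea, the distance grows to 129 and `JumpSizeSketch.JccSize(129)` is 6, so the method pays 4 bytes for the lea and another 4 for the jump.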

kunalspathak · 2021-12-13T15:34:21Z

Ping. Should we pursue this PR or should I go ahead and close it?

DarkBullNull added 15 commits April 24, 2021 22:38

Fix bool-check call and add nullptr

2c7d76e

Add lvar dataSize

47517a7

If def DISPLAY_SIZES = 1 in jit.h, then gives and error

Remove type def

7b2a14a

If def DISPLAY_SIZES = 1, gives an error

Move def dataSize

0ee8295

Update codegen.h

49fd837

Update compiler.h

3a542a4

Rename "dataSize" to "eDataSize" (EmitDataSize)

0d38ce3

because of intersection of names "dataSize" between "compiler.h" and in "codegenarm64.cpp"

Grammar fix compiler.h

dd96487

Grammar fix

52c08c6

Add files via upload

2da9735

Update codegenxarch.cpp

eb88fee

Update morph.cpp

54cf11d

Update codegenxarch.cpp

8fd2375

Update morph.cpp

3ef0e24

Update codegenxarch.cpp

512daa7

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 5, 2021

DarkBullNull added 14 commits May 5, 2021 16:20

Merge branch 'main' into runtime6.3_test

e20628e

Update codegenarm64.cpp

e973f62

Update morph.cpp

e32ab33

Update morph.cpp

ce9db99

Update compiler.h

44330b1

Update gentree.cpp

400a360

Update compiler.cpp

9589f2c

Update codegencommon.cpp

f782919

Update codegencommon.cpp

744ed90

Update codegenxarch.cpp

4415518

Update compiler.cpp

6f5bd46

Update compiler.cpp

a529be4

Update compiler.h

5247e6f

Fix comments

acb2d4c

kunalspathak mentioned this pull request May 20, 2021

xarch: Use ZF and CF flags whenever possible to eliminate test instruction #53053

Closed

JulieLeeMSFT assigned DarkBullNull Jun 28, 2021

JulieLeeMSFT added the needs author feedback label Jun 28, 2021

terrajobst added the community-contribution Indicates that the PR has been added by a community member label Jul 19, 2021

JulieLeeMSFT added needs author feedback and removed needs author feedback labels Sep 27, 2021

JulieLeeMSFT assigned kunalspathak Sep 27, 2021

eiriktsarpalis added needs more info and removed needs author feedback labels Oct 5, 2021

Merge branch 'dotnet:main' into runtime6.3_test

08f4152

kunalspathak closed this Dec 15, 2021

ghost locked as resolved and limited conversation to collaborators Jan 14, 2022

eiriktsarpalis added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs more info labels Jan 19, 2022
