Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JIT] Optimization for left-shift operator #52297

Closed
wants to merge 32 commits into from

Conversation

DarkBullNull
Copy link
Contributor

@DarkBullNull DarkBullNull commented May 5, 2021

Oh, time to learn how to use git) Sorry me for my "Crooked PRs"
The operation "x << 2" and "x << 3" can be optimized. Just 1 quick instruction.
============(x << 2)============
Before:

"MOV EAX, EDX"     // 89 D0
"SHL EAX, 2"       // C1 E0 02
"RET"              // C3

After:

"LEA EAX, [EDX*4]" // 8D 04 95 00 00 00 00
"RET"              // C3

============(x << 3)============
Before:

"MOV EAX, EDX"     // 89 D0
"SHL EAX, 3"       // C1 E0 03
"RET"              // C3

After:

"LEA EAX, [EDX*8]" // 8D 04 D5 00 00 00 00
"RET"              // C3

This will be faster than a normal left shift.

SPMI shows that there are no improvements, there are regressions. But actually, this is because "LEA REG, [REG*4]" takes up two bytes more than the shift, but this is compensated by the execution speed.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 5, 2021
@AndyAyersMS
Copy link
Member

@DarkBullNull can you recheck diffs now that #53053 is done?

@DarkBullNull
Copy link
Contributor Author

Sorry for my absence. I have a term paper on the 30th, so I'm not answering yet.

@DarkBullNull
Copy link
Contributor Author

DarkBullNull commented Jul 1, 2021

C#:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test1(int x)
{
	return ((x << 2) == 0) ? false : true;
}

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test2(int x, int y)
{
	return ((y << 2) == x) ? false : true;
}

Before all the changes:

Test1:
Asm: (14 bytes)
c1e202           shl     edx,2
7406             je      00007fff1363f63d
b801000000       mov     eax,1
c3               ret
33c0             xor     eax,eax
c3               ret

Test2:
Asm: (18 bytes)
41c1e002         shl     r8d,2
443bc2           cmp     r8d,edx
7406             je      00007fff13642f4f
b801000000       mov     eax,1
c3               ret
33c0             xor     eax,eax
c3               ret

After all the changes:

Test1:
Asm: (20 bytes)
8d149500000000   lea     edx,[rdx*4]
85d2             test    edx,edx
7406             je      00007ffef6648d01
b801000000       mov     eax,1
c3		 ret
33c0	         xor     eax,eax
c3		 ret

Test2:
Asm: (23 bytes)
468d048500000000 lea     r8d,[r8*4]
443bc2	         cmp     r8d,edx
7406	         je      00007ffef664c163
b801000000       mov     eax,1
c3		 ret
33c0	         xor     eax,eax
c3		 ret

jit-diffs result: https://gist.github.com/DarkBullNull/011acc7a68d8e055d0e24ff973c438fa

@kunalspathak
Copy link
Member

Did you rebase your changes? With #53214, I am seeing below code for Test1(). It doesn't have test instruction after shl.

G_M37926_IG01:

G_M37926_IG02:
       shl      edx, 2
       je       SHORT G_M37926_IG05

G_M37926_IG03:
       mov      eax, 1

G_M37926_IG04:
       ret

G_M37926_IG05:
       xor      eax, eax

G_M37926_IG06:
       ret

@DarkBullNull
Copy link
Contributor Author

DarkBullNull commented Jul 1, 2021

@kunalspathak Yes, sorry, i fixed it

@kunalspathak
Copy link
Member

Thanks! So I see that the code size regressed more (as expected) compared to before #53214. Could you also post perfscore diffs for various collections that you posted initially? This will give an idea if the code size regression is worth considering.

@terrajobst terrajobst added the community-contribution Indicates that the PR has been added by a community member label Jul 19, 2021
@JulieLeeMSFT
Copy link
Member

Thanks! So I see that the code size regressed more (as expected) compared to before #53214. Could you also post perfscore diffs for various collections that you posted initially? This will give an idea if the code size regression is worth considering.

Ping @DarkBullNull

@kunalspathak
Copy link
Member

Ping again @DarkBullNull

@kunalspathak
Copy link
Member

@DarkBullNull - let us know if you still want to pursue this?

@DarkBullNull
Copy link
Contributor Author

With optimiz:
PerfScore is 1.75 for:

bool Test1(int x)
{
     return ((x << 2) == 0) ? false : true;
}

PerfScore is 1.75 for:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test2(int x, int y)
{
    return ((y << 2) == x) ? false : true;
}

Without optimiz:
PerfScore is 1.50 for:

bool Test1(int x)
{
     return ((x << 2) == 0) ? false : true;
}

PerfScore is 1.75 for:

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
bool Test2(int x, int y)
{
    return ((y << 2) == x) ? false : true;
}

@kunalspathak
Copy link
Member

Thanks @DarkBullNull ...I will take a look

@kunalspathak
Copy link
Member

Do you mind rebasing your changes on latest main? I will then double check the diffs.

@kunalspathak
Copy link
Member

/azp run runtime-coreclr superpmi-asmdiffs

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@kunalspathak
Copy link
Member

I inspected the asmdiffs coming out of this PR and as you pointed, the primary reason of the code size diff is coming from lea being longer instruction, but there are many places where now we are emitting longer jump because of that.

https://www.diffchecker.com/Yqe33Cac

As such, I am not sure the savings we get from converting shl -> lea justify the longer jumps penalties. Do you have any measurements that prove that this optimization still shows any benefits?

@kunalspathak
Copy link
Member

Ping. Should we pursue this PR or should I go ahead and close it?

@ghost ghost locked as resolved and limited conversation to collaborators Jan 14, 2022
@eiriktsarpalis eiriktsarpalis added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs more info labels Jan 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member needs-author-action An issue or pull request that requires more info or actions from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants