-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manually optimize a rem 64 instruction to avoid regression on Mono #96203
Merged
kotlarmilos
merged 1 commit into
dotnet:main
from
andrewjsaid:frozen-collections-length-filter-perf-rem64
Dec 20, 2023
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like something that should be fixed in mono itself, if it makes such an impactful difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't appear to be the primary cause of regressions in c28bec4. We will investigate it further to detect the root cause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the validation in #96203 (comment) incorrect? Can we revert this then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my limited knowledge and research I couldn't find the optimization of
% 64
->& 0x3F
in mono's code so this optimization might still be valid.Just looking at the regressions and I want to point out that we see a regression in
System.Collections.Perf_SubstringFrozenDictionary
on mono dotnet/perf-autofiling-issues#26221.This is strange as the original commit c28bec4 is designed to not affect
TryGetValue
on substring strategy subtypes ofOrdinalStringFrozenDictionary
. It works that way because each concrete implementation should be getting it's own codegen and in turn be optimized toif(true)
as this existing comment sums upBased on that I'd say regressions on
SubstringFrozenDictionary
tests point towards the method call toCheckLengthFilter
is not being inlined or possibly we are even doing virtual method dispatch.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if it is, the changed code is harder to understand / maintain than the original IMO, and the pattern of mod'ing an array/span length is super common; this is just one occurrence of that. If it's impactful here, it'd be impactful in many more places, and I'd prefer we not one-off it. It also sounds like the measurements that suggested this was valuable in this case was flawed, and so we don't actually know in this particular case whether it made a meaningful difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also worth noting that
x % 64
becomingx & 0x3F
is only a valid optimization forpositive x
, while it returns a different result fornegative x
, so it's not always a universal option to replace either.This optimization currently lights up in RyuJIT by virtue of
key
being astring
and the runtime having implicit knowledge thatstring.Length
(as well asarray.Length
andspan.Length
) are never negative.Such optimizations generally involve checking that for
x % y
,x is positive
andy is a power of two
. It's generally easiest to just check the codegen, but this is really one of those fundamental optimizations around division/remainder that a compiler should recognize as Stephen indicated.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to submit a revert PR now and for it to be approved/merged whenever but I might be unavailable for a few weeks at some point soon so I'd rather submit now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation was incorrect, sorry for confusion.
Could the regressions be related to inlining, especially since
CheckLengthQuick
was introduced? Additionally, there has been a change from usingif (Equals(item, _items[index]))
toif (hashCode == _hashTable.HashCodes[index])
. Is it possible thatEquals
has been optimized?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost certainly since in CoreCLR, for the benchmark
SubstringFrozenDictionary
,if(CheckLengthQuick(key))
is first inlined toif(true)
and then just optimized away resulting in no change at all. Whereas there is a regression in mono meaning that at the very least one of these optimizations didn't work.The inlining is possible because concrete sealed implementations of FrozenDictionary all have this line of code
which allows the JIT to codegen for each concrete implementation. When doing so, it is able to inline the implementation's
Equals
,GetHashCode
as it is generating the code for the specific implementation not the base class. in CoreCLR the same is happening forCheckLengthQuick
, however as opposed to the existing methods,CheckLengthQuick
is virtual and not overridden - could either of these be a reason that mono isn't inlining it like it presumably inlinesEquals
andGetHashCode
?@kotlarmilos I believe that's just the diff viewer. The change is really just adding the
if (CheckLengthQuick(key))
and some indenting.