Manually optimize a rem 64 instruction to avoid regression on Mono #96203

Merged
@@ -166,7 +166,7 @@ private static FrozenDictionary<TKey, TValue> CreateFromDictionary<TKey, TValue>
{
if (key.Length < minLength) minLength = key.Length;
if (key.Length > maxLength) maxLength = key.Length;
- lengthFilter |= (1UL << (key.Length % 64));
Member

Seems like something that should be fixed in mono itself, if it makes such an impactful difference?

Member

This doesn't appear to be the primary cause of regressions in c28bec4. We will investigate it further to detect the root cause.

Member

Was the validation in #96203 (comment) incorrect? Can we revert this then?

Contributor Author

From my limited knowledge and research I couldn't find the optimization of % 64 -> & 0x3F in mono's code so this optimization might still be valid.

Looking at the regressions, I want to point out that we see a regression in System.Collections.Perf_SubstringFrozenDictionary on mono: dotnet/perf-autofiling-issues#26221.

This is strange, as the original commit c28bec4 is designed not to affect TryGetValue on the substring-strategy subtypes of OrdinalStringFrozenDictionary. It works that way because each concrete implementation should get its own codegen, in which the check is optimized to if(true), as this existing comment sums up:
[image: screenshot of the existing code comment]

Based on that, I'd say the regressions in the SubstringFrozenDictionary tests suggest that the call to CheckLengthQuick is not being inlined, or possibly that we are even doing virtual method dispatch.

Member

> so this optimization might still be valid.

Even if it is, the changed code is harder to understand and maintain than the original IMO, and the pattern of mod'ing an array/span length is super common; this is just one occurrence of it. If it's impactful here, it'd be impactful in many more places, and I'd prefer we not one-off it. It also sounds like the measurements that suggested this was valuable in this case were flawed, so we don't actually know whether it made a meaningful difference here.

Member

It's also worth noting that rewriting x % 64 as x & 0x3F is only valid for non-negative x. For negative x the two can return different results, so the replacement isn't universally applicable.

This optimization currently lights up in RyuJIT because key is a string and the runtime has implicit knowledge that string.Length (as well as array.Length and span.Length) is never negative.

> From my limited knowledge and research I couldn't find the optimization of % 64 -> & 0x3F in mono's code so this optimization might still be valid.

Such optimizations generally involve checking that, for x % y, x is non-negative and y is a power of two. It's generally easiest to just check the codegen, but this really is one of those fundamental division/remainder optimizations that a compiler should recognize, as Stephen indicated.
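
To make the sign caveat concrete, here is a minimal sketch comparing the two expressions on a few inputs. The class and method names (RemainderVersusMask, Rem64, Mask64) are made up for illustration and do not appear in the PR.

```csharp
using System;

public static class RemainderVersusMask
{
    // In C#, the remainder takes the sign of the dividend.
    public static int Rem64(int x) => x % 64;

    // Masking keeps the low six bits, so the result is always in [0, 63].
    public static int Mask64(int x) => x & 0x3F;

    public static void Main()
    {
        // Non-negative inputs agree; most negative inputs do not.
        foreach (int x in new[] { 0, 5, 64, 70, -1, -70 })
        {
            Console.WriteLine($"{x}: x % 64 = {Rem64(x)}, x & 0x3F = {Mask64(x)}");
        }
        // For x = -1 this prints a remainder of -1 and a mask of 63.
    }
}
```

A negative shift index would be wrong for the length filter, which is why the rewrite is only safe when the compiler (or the author) knows the operand is a length.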

Contributor Author

Happy to submit a revert PR now, to be approved and merged whenever. I might be unavailable for a few weeks at some point soon, so I'd rather submit it now.

Member

> Was the validation in #96203 (comment) incorrect? Can we revert this then?

The validation was incorrect, sorry for the confusion.

Could the regressions be related to inlining, especially since CheckLengthQuick was introduced? Additionally, there has been a change from using if (Equals(item, _items[index])) to if (hashCode == _hashTable.HashCodes[index]). Is it possible that Equals has been optimized?

Contributor Author

> Could the regressions be related to inlining, especially since CheckLengthQuick was introduced?

Almost certainly. In CoreCLR, for the SubstringFrozenDictionary benchmark, if(CheckLengthQuick(key)) is first inlined to if(true) and then optimized away, resulting in no change at all. There is a regression in mono, which means that at least one of these optimizations didn't happen there.

The inlining is possible because the concrete sealed implementations of FrozenDictionary all have this line of code:

private protected override ref readonly TValue GetValueRefOrNullRefCore(string key) => ref base.GetValueRefOrNullRefCore(key);

which allows the JIT to generate code for each concrete implementation. When doing so, it is able to inline the implementation's Equals and GetHashCode, because it is generating code for the specific implementation rather than for the base class. In CoreCLR the same happens for CheckLengthQuick. However, unlike the existing methods, CheckLengthQuick is virtual and not overridden. Could either of these be a reason that mono isn't inlining it the way it presumably inlines Equals and GetHashCode?
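
The pattern described above can be sketched in a reduced form. Everything here (LookupBase, OrdinalLookup, KeysEqual, ContainsCore) is a hypothetical stand-in, not the real FrozenDictionary code, and whether a particular JIT actually devirtualizes and inlines these calls is an assumption that has to be checked in the generated code.

```csharp
using System;

// Hypothetical, simplified stand-in for the shared base class.
public abstract class LookupBase
{
    private readonly string[] _keys;

    protected LookupBase(string[] keys) => _keys = keys;

    // Virtual hook with an "always pass" default; strategies without a
    // length filter simply do not override it.
    protected virtual bool CheckLengthQuick(string key) => true;

    protected abstract bool KeysEqual(string x, string y);

    // Shared lookup logic that calls the virtual hooks.
    protected virtual bool ContainsCore(string key)
    {
        if (CheckLengthQuick(key))
        {
            foreach (string candidate in _keys)
            {
                if (KeysEqual(candidate, key)) return true;
            }
        }
        return false;
    }

    public bool Contains(string key) => ContainsCore(key);
}

// Sealed concrete type: its exact type is known whenever this code runs.
public sealed class OrdinalLookup : LookupBase
{
    public OrdinalLookup(string[] keys) : base(keys) { }

    protected override bool KeysEqual(string x, string y) => string.Equals(x, y);

    // Forwarding override, mirroring the one-liner quoted above. The intent
    // is to give the JIT a per-type copy of the base logic to specialize.
    protected override bool ContainsCore(string key) => base.ContainsCore(key);
}

public static class Program
{
    public static void Main()
    {
        var lookup = new OrdinalLookup(new[] { "alpha", "beta" });
        Console.WriteLine(lookup.Contains("beta"));  // True
        Console.WriteLine(lookup.Contains("gamma")); // False
    }
}
```

In this sketch OrdinalLookup leaves CheckLengthQuick at its default, which corresponds to the "virtual and not overridden" case raised in the question.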

> Additionally, there has been a change from using if (Equals(item, _items[index])) to if (hashCode == _hashTable.HashCodes[index]). Is it possible that Equals has been optimized?

@kotlarmilos I believe that's just the diff viewer. The change is really just adding the if (CheckLengthQuick(key)) and some indenting.

+ lengthFilter |= (1UL << (key.Length & 0x3F));
}
Debug.Assert(minLength >= 0 && maxLength >= minLength);

@@ -114,7 +114,7 @@ private static FrozenSet<T> CreateFromSet<T>(HashSet<T> source)
{
if (s.Length < minLength) minLength = s.Length;
if (s.Length > maxLength) maxLength = s.Length;
- lengthFilter |= (1UL << (s.Length % 64));
+ lengthFilter |= (1UL << (s.Length & 0x3F));
}
Debug.Assert(minLength >= 0 && maxLength >= minLength);

@@ -28,6 +28,6 @@ internal OrdinalStringFrozenDictionary_Full(

private protected override bool Equals(string? x, string? y) => string.Equals(x, y);
private protected override int GetHashCode(string s) => Hashing.GetHashCodeOrdinal(s.AsSpan());
- private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length % 64))) > 0;
+ private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length & 0x3F))) > 0;
}
}
@@ -28,6 +28,6 @@ internal OrdinalStringFrozenDictionary_FullCaseInsensitive(

private protected override bool Equals(string? x, string? y) => StringComparer.OrdinalIgnoreCase.Equals(x, y);
private protected override int GetHashCode(string s) => Hashing.GetHashCodeOrdinalIgnoreCase(s.AsSpan());
- private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length % 64))) > 0;
+ private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length & 0x3F))) > 0;
}
}
@@ -28,6 +28,6 @@ internal OrdinalStringFrozenDictionary_FullCaseInsensitiveAscii(

private protected override bool Equals(string? x, string? y) => StringComparer.OrdinalIgnoreCase.Equals(x, y);
private protected override int GetHashCode(string s) => Hashing.GetHashCodeOrdinalIgnoreCaseAscii(s.AsSpan());
- private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length % 64))) > 0;
+ private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length & 0x3F))) > 0;
}
}
@@ -27,6 +27,6 @@ internal OrdinalStringFrozenSet_Full(

private protected override bool Equals(string? x, string? y) => string.Equals(x, y);
private protected override int GetHashCode(string s) => Hashing.GetHashCodeOrdinal(s.AsSpan());
- private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length % 64))) > 0;
+ private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length & 0x3F))) > 0;
}
}
@@ -27,6 +27,6 @@ internal OrdinalStringFrozenSet_FullCaseInsensitive(

private protected override bool Equals(string? x, string? y) => StringComparer.OrdinalIgnoreCase.Equals(x, y);
private protected override int GetHashCode(string s) => Hashing.GetHashCodeOrdinalIgnoreCase(s.AsSpan());
- private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length % 64))) > 0;
+ private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length & 0x3F))) > 0;
}
}
@@ -27,6 +27,6 @@ internal OrdinalStringFrozenSet_FullCaseInsensitiveAscii(

private protected override bool Equals(string? x, string? y) => StringComparer.OrdinalIgnoreCase.Equals(x, y);
private protected override int GetHashCode(string s) => Hashing.GetHashCodeOrdinalIgnoreCaseAscii(s.AsSpan());
- private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length % 64))) > 0;
+ private protected override bool CheckLengthQuick(string key) => (_lengthFilter & (1UL << (key.Length & 0x3F))) > 0;
}
}
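
For readers unfamiliar with the length filter these hunks touch, the following is a minimal standalone sketch of the idea: one bit per key length (modulo 64) is set at construction time, and a lookup can bail out early when the bit for the probe's length is clear. The class LengthFilterSketch and the method Build are invented for illustration; only the bit expressions are taken from the diff.

```csharp
using System;
using System.Collections.Generic;

public static class LengthFilterSketch
{
    // Sets one bit per key length; lengths that differ by a multiple of 64 share a bit.
    public static ulong Build(IEnumerable<string> keys)
    {
        ulong lengthFilter = 0;
        foreach (string key in keys)
        {
            lengthFilter |= 1UL << (key.Length % 64);
        }
        return lengthFilter;
    }

    // False means no stored key has this length, so the full lookup can be skipped.
    // True only means "maybe": the hash lookup still has to run.
    public static bool CheckLengthQuick(ulong lengthFilter, string key) =>
        (lengthFilter & (1UL << (key.Length % 64))) != 0;

    public static void Main()
    {
        ulong filter = Build(new[] { "a", "abc" });
        Console.WriteLine(CheckLengthQuick(filter, "z"));                 // True: a key of length 1 exists
        Console.WriteLine(CheckLengthQuick(filter, "xy"));                // False: no key of length 2
        Console.WriteLine(CheckLengthQuick(filter, new string('q', 65))); // True: 65 % 64 == 1 aliases length 1
    }
}
```

The sketch keeps the % 64 form; because string lengths are never negative, & 0x3F produces the same bit index.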