-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Fixed the Equals bug of floating point number and user-defined struct. #13164
Conversation
@tannergooding @tarekgh @jkotas If the direction is correct, I may add a Line 2093 in 44f5706
|
@dotnet-bot test Ubuntu x64 corefx_baseline |
Would be nice to wait for @jkotas to review this change. |
src/vm/methodtablebuilder.cpp
Outdated
@@ -1809,7 +1811,7 @@ MethodTableBuilder::BuildMethodTableThrowing( | |||
|
|||
if (IsValueClass()) | |||
{ | |||
if (bmtFP->NumInstanceFieldBytes != totalDeclaredFieldSize || HasOverLayedField()) | |||
if (bmtFP->NumInstanceFieldBytes != totalDeclaredFieldSize || HasOverLayedField() || hasFloatingPointNumberField) | |||
GetHalfBakedClass()->SetIsNotTightlyPacked(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This IsNotTighlyPacked
bit is used to disable JIT "struct promotion" optimizations. I do not think we want to be disabling the "struct promotion" optimization when the struct has float field. You need to find a new bit for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same fundamental bug exists for all fields with a custom Equals operator that do not contain GC reference. float/double are just one instance of this problem. If we are doing something about this, it maybe nice to fix it for all cases, not just float/double. |
@jkotas I may miss some background knowledge about this bug. Could you please give me an example to demonstrate the bug in non-double/float fields? |
Actual output: true (NeverEquals.Equals is not called)
|
More realistic example. Actual output: false (NeverEquals.Equals is not called)
|
@jkotas Thanks for the example! Another question is seems like we've exhausted the bits for VMFLAG_*: Line 2103 in 44f5706
I mean we've already reached 0x80000000 , I cannot find a larger value that has only 1 bit set... Should I create a new enum to store my new flag bit? |
Find an existing unused bit, e.g. |
To be clear, I need two bits. One is for floating/double +0.0, -0.0, NaN scenario, another one is for user-defined struct with overrided
Line 1414 in 44f5706
coreclr/src/vm/runtimehandles.cpp Line 1044 in c61525b
However, Line 562 in 9b7e736
I think I should find another available bit. |
This calls |
All this is left over code from full .NET Framework partial trust that can be deleted. (It would be nice to do this cleanup in separate PR.) |
@jkotas Thanks. I'll finish this PR first and then clean up these left over code in another PR :) . For now, as I need to use these two bits, I may change something related to make it compile-able. |
@jkotas to fix the overrided Equals issue, I briefly did:
To achieve step 2, I have to leverage I tried to use UPDATE: |
src/vm/methodtablebuilder.cpp
Outdated
// mazong1123: a valid overrided method. | ||
if (!*hasOverridedEqualsMethod) | ||
{ | ||
*hasOverridedEqualsMethod = strcmp((*it)->GetMethodSignature().GetName(), "Equals") == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can have false positives.
src/vm/methodtablebuilder.cpp
Outdated
|
||
// mazong1123: a valid overrided method. | ||
if (!*hasOverridedEqualsMethod) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing method names and signatures is not free. Adding this kind of logic to type loader will make every type pay for it, but Equals will only ever be called on a few types. It would be better to do this lazily, only once somebody calls Equals on the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, try to find an overrided Equals
when somebody calls Equals in a ValueType will need to DFS or BFS all the available fields of the types in worse case. If the hierarchy is deep it's very inefficient IMO. For instance:
struct MyStruct2
{
public NeverEquals _x;
public NeverEquals _y;
}
struct MyStruct3
{
public MyStruct2 _x;
public MyStruct2 _y;
}
struct NeverEquals
{
public double _a;
public override bool Equals(object obj)
{
return false;
}
public override int GetHashCode()
{
return 0;
}
}
class Program
{
static void Main(string[] args)
{
MyStruct3 i = new MyStruct3();
MyStruct3 j = new MyStruct3();
Console.WriteLine(i.Equals(j));
}
}
We need to drill down to NeverEquals
to find an overrided Equals
. That makes the "fast equal" no more fast...
Although we'll cache the result, that still break the original purpose of "fast equal" because it may be slower than the naive comparison at the first time.
I think we may need some trade-off. For example, something like "if you override any method in a ValueType, CanCompareBits
will return false". Because we can easily record whether the user override any method in the hierarchy, we can simply check it in CanCompareBits
.
Similarly, we can easily record whether the user used any floating point number field in the hierarchy, but it's hard to check if any floating point number field stores 0.0
or NaN
at runtime (Need a DFS or BFS).
@jkotas please let me know if I missed something. Thanks.
Right, checking for the Equals overrides is not cheap and that's why the result should be cached. The good place to cache the result of this check would be
This is close to impossible to check, not just hard. |
Yet another way to fix this issue would be to auto-generate the IL code for the Equals method. It would have good steady-state performance for all cases. |
@jkotas I've roughly implemented the algorithm to determine whether a value type has overridden Equals in its type hierachy. I'm going to optimize the performance but since I'm not very familiar about the type system codebase, it would be great if you can give me a hint:
|
That's fine. I do not think that the efficiency of ApproxFieldDescIterator is a problem - you can profile it to tell for sure. A similar algorithm is used in number of other places. Also, value types are expected to be relatively simple and small. Everything will be slow for big complex structs.
|
@jkotas Thanks. I'll optimize the code and profile the performance tommorrow. |
@jkotas I just realized I think I should fix it in a separate PR. What's your opinion? |
You may want to do the GetHashCode fix first, so that |
(Having both fixes in one PR would be fine with me as well.) |
class Cls
{
public byte _a;
public override bool Equals(object obj)
{
return false;
}
public override int GetHashCode()
{
return 0;
}
}
struct MyStruct6
{
public Cls _x;
public Cls _y;
}
static void Main(string[] args)
{
MyStruct6 i = new MyStruct6();
MyStruct6 j = new MyStruct6();
Console.WriteLine(i.Equals(j));
}
_x and _y would be null so i should equal to j. |
Agree. Thanks! |
@mazong1123 Could you please squash the commits and write a description that matches the final change? |
@jktoas sure. Will do it later today. |
@jkotas I squashed most of the commits. However, some commits came before I merge commits from master branch to resolve a conflict. If I squash those commits, the history of the merged commits would be lost. So I kept them. Please let me know if I need to squash those commits as well. |
I do not think that the history for this merge is valuable. It can be all squashed into one commit. |
Oops.. I think I need to redo the squash as the merged commits has been lost. |
@jkotas I cherry-picked some commits so I'd like to run my local tests this morning (I don't have a dev environment now). Please don't merge it before I finish the tests in case of any regression bug. |
@dotnet-bot test Ubuntu x64 corefx_baseline |
@jkotas the code passed my local tests so the logic should be fine. However, the CI (Windows_NT x86 Checked Build and Test (Jit - CoreFx)) started to complain today. The code change would not cause this failure 2 days ago. Do we have any update on CI server?
|
There are two problems here:
|
@jkotas I'll start a PR to test if master branch code has the same JIT asserting issue. If not, it would be introduced by this code change. |
@jkotas Now I'm pretty much sure the JIT assertion failure is not caused by my changes here. See the same failure here: https://ci.dot.net/job/dotnet_coreclr/job/master/job/jitstress/job/x86_checked_windows_nt_corefx_baseline_prtest/49/ I've updated the test PR to involve most recently changes. If it still fails, I will fire an issue. But for this PR I think it should be ready to merge. |
@dotnet-bot test OSX10.12 x64 Checked Build and Test |
@jkotas I've filed an issue to track the JIT assertion failure: https://github.com/dotnet/coreclr/issues/13486 |
@dotnet-bot test Windows_NT x86 corefx_baseline |
BOOL ret = FALSE; | ||
|
||
HELPER_METHOD_FRAME_BEGIN_RET_ATTRIB_1(Frame::FRAME_ATTR_EXACT_DEPTH|Frame::FRAME_ATTR_CAPTURE_DEPTH_2, objRef); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: extra line
src/vm/class.h
Outdated
@@ -1398,29 +1398,6 @@ class EEClass // DO NOT CREATE A NEW EEClass USING NEW! | |||
} | |||
|
|||
public: | |||
|
|||
inline void SetDoesNotHaveSuppressUnmanagedCodeAccessAttr() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These diffs are not part of your change - they are in master already. Could you please rebase them away?
src/vm/comutilnative.cpp
Outdated
{ | ||
// Check current field type. | ||
MethodTable* fieldMethodTable = pField->GetApproxFieldTypeHandleThrowing().GetMethodTable(); | ||
if(!CanCompareBitsOrUseFastGetHashCode(fieldMethodTable)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: space after it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find extra space after this line or at tail. Did you mean I need add space after this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to say add space between if
and (
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK got it. I'll update it.
I saw the CI failure has been fixed https://github.com/dotnet/coreclr/issues/13486#issuecomment-324081359 @jkotas Can this PR be merged? |
I have just look at this. Yes, it looks good to me. I have just commented on a few last nits. |
Other than `ContainsPointers` and `IsNotTightlyPacked`, added two new conditions for checking whether a valuetype can go to fast path or not. Following are the details of these 2 conditions: - Check whether a valuetype contains a Single/Double field. If it does, we cannot go to fast path. Floating point numbers have special `Equals` and `GetHashCode` implementation. We cannot simply compare floating point numbers via bits (e.g. +0.0 should equal to -0.0, but their underlying bits are different). - Check whether an user-defined valuetype overrides `Equals` or `GetHashCode`. If it does, we cannot go to fast path. Because `Equals` or `GetHashCode` result may be different to bit comparison. To find Single/Double fields and overridden `Equals`/`GetHashCode`, we start a DFS to go through the whole hierachy of the source valuetype, and cache the result to `MethodTable`. Fix #11452
@jkotas I've updated accordingly. Please take a look. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
This change fixed |
This PR disables the fast equal check and fast hash code helper for floating point numbers so that 0.0 and -0.0 can be considered as the same. Same for all NaNs.
This definately can be optimized. For example, we should only opt out 0 and NaN for double/single and let other values use fast equal check and fast hash code helper.
UPDATE
The final implementation is as following:
Other than
ContainsPointers
andIsNotTightlyPacked
, added two new conditions for checking whether a valuetype can go to fast path or not. Following are the details of these 2 conditions:Check whether a valuetype contains a Single/Double field. If it does, we cannot go to fast path. Floating point numbers have special
Equals
andGetHashCode
implementation. We cannot simply compare floating point numbers via bits (e.g. +0.0 should equal to -0.0, but their underlying bits are different).Check whether an user-defined valuetype overrides
Equals
orGetHashCode
. If it does, we cannot go to fast path. BecauseEquals
orGetHashCode
result may be different to bit comparison.To find Single/Double fields and overridden
Equals
/GetHashCode
, we start a DFS to go through the whole hierachy of the source valuetype, and cache the result toMethodTable
.Fix #11452