-
Notifications
You must be signed in to change notification settings - Fork 509
Remove reflection in ValueType.Equals/GetHashCode #5436
Remove reflection in ValueType.Equals/GetHashCode #5436
Conversation
@mikedn this is where dotnet/coreclr#16527 would help a bit |
Do you have any performance measurements for this? The reflection version on ProjectN has prompted some customer complaints when they hit it. |
|
||
// Compare the memory | ||
int valueTypeSize = (int)this.EETypePtr.ValueTypeSize; | ||
for (int i = 0; i < valueTypeSize; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really the efficient way to do this? I thought a lot of work went into making Span comparison fast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not available it CoreLib yet. It will be after dotnet/coreclr#16521.
int fieldOffset = __GetFieldHelper(i, out EETypePtr fieldType); | ||
|
||
// Fetch the value of the field on both types | ||
object thisField = RuntimeImports.RhBoxAny(ref Unsafe.Add(ref thisRawData, fieldOffset), fieldType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is working on a boxed copy correct? I'd think you could have a self-modifying equals/gethashcode method (yes, I'm entirely clear on how insane that sounds)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not quite right, but it matched what CoreCLR and the existing ProjectN reflection-based implementation does for Equals. It is broken in so many different ways...
|
||
// TODO: what we're shooting for is overlapping fields | ||
// or gaps between fields | ||
if (type.IsExplicitLayout || type.GetClassLayout().Size != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens for padding from alignment? For example:
struct MyStruct
{
byte b; // byte 0
// 3-7 bytes of padding
IntPtr ptr; // byte 4 or 8
}
We wouldn't want to do byte comparisons with padding since it's not guaranteed to be set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the TODO.
@@ -26,6 +31,7 @@ public override String ToString() | |||
return this.GetType().ToString(); | |||
} | |||
|
|||
#if PROJECTN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should/could ProjectN adopt this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would just have to update the IL2IL transform that does this. The long term plan is to get rid of transforms, so I would kind of prefer this was another of those things that will be naturally picked up when we do the next step of Project X (get rid of STS type system and host C2 from the CoreRT compiler driver directly).
} | ||
else if (fieldType.IsPrimitive) | ||
{ | ||
hashCode = FastGetValueTypeHashCodeHelper(fieldType, ref fieldData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you've correctly ported this from CoreCLR, but it appears to be yet another bug (not every primitive has a hashcode equal to its value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I've confirmed that structs smaller than 4 bytes all have the same hash code on CoreCLR because of this and the fact that FastGetValueTypeHashCodeHelper . 😦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
Debug.Assert(!fieldType.IsPointer); | ||
|
||
if (fieldType.CorElementType == RuntimeImports.RhCorElementType.ELEMENT_TYPE_R4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this looks like a good idea, CoreCLR just pushes this down the XOR path I complained about below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was supposed to be fixed by dotnet/coreclr#13164. Opened: https://github.com/dotnet/coreclr/issues/16545
// of __GetFieldHelper, decodes the unboxing stub pointed to by the slot to the real target | ||
// (we already have that part), and calls the entrypoint that expects a byref `this`, and use the | ||
// data to decide between calling fast or regular hashcode helper. | ||
object fieldValue = RuntimeImports.RhBox(fieldType, ref fieldData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same boxing issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current CoreCLR implementation won't actually box and call GetHashCode for valuetype fields. It will always take xor paths for them (ignoring any GetHashCode overrides that the valuetypes may have).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will always take xor paths for them
It won't take the XOR path if the nested struct contains GC pointers or floating-point fields. Without building the extra feature the comment is talking about, we can't avoid the box. I'll do some factoring to avoid calling the actual GetHashCode
and simulate what CLR would do.
} | ||
else | ||
{ | ||
object fieldValue = Unsafe.Read<object>(ref fieldData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the read guaranteed to be GC safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an atomic read from a GC tracked interior pointer, so yes.
Intuitively, it looked faster than the CLR, but it does sound like a good idea to collect some numbers for it: Test programstatic void Main(string[] args)
{
o1.Equals(o2);
o1.GetHashCode();
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
o1.Equals(o2);
}
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0; i < 1000000; i++)
{
o1.GetHashCode();
}
Console.WriteLine(sw.ElapsedMilliseconds);
} First definition of
|
Run 1 | Run 2 | Run 3 | |
---|---|---|---|
CLR | 384ms/49ms | 388ms/49ms | 395ms/50ms |
CoreRT new | 149ms/28ms | 149ms/30ms | 146ms/28ms |
CoreRT old | 400ms/960ms | 399ms/961ms | 395ms/952ms |
Second definition of Foo
static object o1 = new Foo();
static object o2 = new Foo { Z = 123 };
struct Foo
{
public int X;
public int Y;
public int Z;
}
Run 1 | Run 2 | Run 3 | |
---|---|---|---|
CLR | 23ms/31ms | 23ms/32ms | 23ms/31ms |
CoreRT new | 18ms/15ms | 18ms/15ms | 18ms/15ms |
CoreRT old | 468ms/948ms | 527ms/961ms | 492ms/965ms |
Looking at the reflection-based ("CoreRT old") Lines 155 to 166 in 91fde46
@morganbr Where is the lexicographical sort coming from? Testing this on the desktop CLR, it seems to be taking the first field in metadata order, not lexicographical order. I would still not want to run that code because it will still be slow, but maybe until Project N is able to adopt the new scheme, we should look at this part? |
The implementation relies on a new virtual method on `System.ValueType` that provides information about fields on a type. An override of this method is injected by the compiler when needed. This implementation is different from what Project N does (where we inject actual overrides of both methods). The CoreRT implementation is a bit more space-saving because there is only one method that does fewer things. This ends up being a 0.9% regression in size of a hello world app, but we do get some correctness with it and a potential to get the reflection stack completely out of the base hello world image (that one will be a huge win). We can get some of the regression back by: * Getting RyuJIT to generate more efficient code for the offset calculation (dotnet/coreclr#16527) * Making fewer things reflectable. We're currently generating the data for e.g. OSVERSIONINFO because it's used as a parameter in a method that is reflectable, and therefore the parameter gets boxed, and therefore it needs this data. It also drags in a fixed buffer internal type because it's a valuetype field on OSVERSIONINFO. This alone costs us 100+ bytes.
Alternate implementation to #5226 that has a lot less complexity, but it's a bit more costly from size on disk perspective.
The implementation relies on a new virtual method on
System.ValueType
that provides information about fields on a type. An override of this method is injected by the compiler when needed.This implementation is different from what Project N does (where we inject actual overrides of both methods). The CoreRT implementation is a bit more space-saving because there is only one method that does fewer things.
This ends up being a 0.9% regression in size of a hello world app, but we do get some correctness with it and a potential to get the reflection stack completely out of the base hello world image (that one will be a huge win). We can get some of the regression back by: