-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Implement struct marshalling via IL Stubs instead of via FieldMarshalers #26340
Implement struct marshalling via IL Stubs instead of via FieldMarshalers #26340
Conversation
a811d96
to
f3ac22e
Compare
What is the perf test actually doing?
Why is that an acceptable regression? |
The perf test is calling a PInvoke passing the struct by value. I'll post the benchmark on Monday. Re the allocation: I'm planning on trying to figure out how to cache the allocated object so we only allocate at most once per stub instead of once per stub call. Additionally, the allocation only happens when marshalling an array of non-blittable structures, which in that case already has allocations, so I felt it was still worth putting this up as a draft to get some full CI runs on it before figuring out how to remove the single allocation. |
Ok. Thanks. |
Here's the benchmark (using the structs mentioned in the original post. The native library is the native component of the StructMarshaling/PInvoke test. Benchmarknamespace InteropBenchmarks
{
[MemoryDiagnoser]
public class StructMarshalling
{
private float f;
private S8 s8;
private InnerArraySequential ias;
private FixedBufferClassificationTest fb;
[GlobalSetup]
public void Setup()
{
f = MarshalStructAsParam.ProductHFA(default);
s8 = Helper.NewS8("hello", true, 10, 128, 128, 32);
ias = Helper.NewInnerArraySequential(1, 1.0F, "some string");
fb = Helper.NewFixedBuffer(42.0f);
}
[Benchmark]
public bool S8ByValue()
{
return MarshalStructAsParam.MarshalStructAsParam_AsSeqByVal11(s8);
}
[Benchmark]
public bool InnerArraySequentialByValue()
{
return MarshalStructAsParam.MarshalStructAsParam_AsSeqByVal2(ias);
}
[Benchmark]
public bool FixedBufferClassificationTestByValue()
{
return MarshalStructAsParam.MarshalStructAsParam_AsSeqByValFixedBufferClassificationTest(fb, fb.f.F);
}
}
}
class MarshalStructAsParam
{
[DllImport(nameof(MarshalStructAsParam))]
public static extern float ProductHFA(HFA hfa);
[DllImport(nameof(MarshalStructAsParam))]
public static extern bool MarshalStructAsParam_AsSeqByVal11(S8 str);
[DllImport(nameof(MarshalStructAsParam))]
public static extern bool MarshalStructAsParam_AsSeqByVal2(InnerArraySequential str1);
[DllImport(nameof(MarshalStructAsParam))]
public static extern bool MarshalStructAsParam_AsSeqByValFixedBufferClassificationTest(FixedBufferClassificationTest str, float f);
} |
/azp run coreclr-ci |
Pull request contains merge conflicts. |
I've updated the perf benchmarks with the current result. I've removed the allocation from each iteration by caching it in the |
Perf numbers updated for commit 034db68 where I've changed the |
@AaronRobinsonMSFT can you make a review pass when you have a chance? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am half way through at dllimport.h
.
int numChars = strManaged.Length; | ||
if (numChars >= length) | ||
{ | ||
numChars = length - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a doc reference for the logic where we apply a guaranteed null terminator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe we have a doc reference for the auto-null-terminator logic.
@jkoritzinsky Can you squash these commits and rebase? Some of the tooling I have only supports 64 commits for review :-/ |
src/vm/ilstubresolver.cpp
Outdated
NewArrayHolder<CompileTimeState> pNewCompileTimeState = NULL; | ||
if (!UseLoaderHeap()) | ||
{ | ||
NewArrayHolder<BYTE> pNewILCodeBuffer = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you pull these command variables outside the branch. I don't see why they need to be different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason the branching here is weird is because we want to be able to use the holders for cleanup in the exceptional/failure case. However, when we don't need to use the loader heap, we allocate via "new" and delete in the ClearCompileTimeState
method. If we use the loader heap, we want to allocate from the loader heap and use the appropriate holder and only delete when the loader is unloaded (not in ClearCompileTimeState
). If we don't keep the data after ClearCompileTimeState
, the JIT will assert when trying to JIT a P/Invoke or struct marshal stub that calls this struct marshal stub.
It's a really awkward case where RAII doesn't quite do what we want unless we create a "multi-holder" type from which we could pick which holder from a set we want to use.
Remove dead code for calculating managed field size. Add field IL marshal infra up to MarshalInfo::GenerateFieldIL. Add preliminary changes for MarshalInfo ctor to have the logic handle field marshalling logic. Still need to handle WinRT struct field logic correctly. First shot at handling fields of WinRT structs. Cleanup Clean up entrypoints Fix cleanup marshal emit. Disable specific paths on struct marshal stubs. Implement emitting full struct marshalling stub. Add StructMarshalInteropStub method name. Add NDirect::CreateMarshalILStub Get byvalue StructMarshalling/PInvoke tests passing excluding missing ILMarshalers (ByValArray and ByValTStr). Correctly classify struct marshal stubs as struct marshal stubs instead of PInvoke stubs. Implement UnmanagedType.ByValArray IL marshaler. Implement ILMarshaler equivalent for ansi-char fixed arrays. Fix parameter mismatch in Native->CLR direction for struct marshalling. Implement ByValTStr marshalling. Support unaligned fields in IL stubs. Load CleanupWorkList from param list if in a struct marshalling stub Implement SafeHandle and CriticalHandle field marshalling in IL struct stubs Fix handle field marshalers. Add error reporting in struct field IL marshalers consistent with old FieldMarshaler error reporting. Convert Array-of-nonblittable-struct marshalling to use IL stubs. Convert LayoutClass marshalling to use IL stubs. Fix marshalling of LayoutClass fields in structs. Add non-blittable fixed buffer assert in the struct IL stub path. Implement Marshal APIs via the IL stubs. Fix default char marshaler selection. Move hidden-length-array marshalling over to struct marshalling stubs. Convert struct marshal IL stub users to use helper that will automatically cleanup on failure to marshal. Match MarshalInfo::MarshalInfo behavior to ParseNativeType for fields. Remove old FieldMarshaler-based marshalling. Fix signed/unsigned mismatch. Fix IsFieldScenario on non-COMINTEROP plaforms Fix off-Windows build. Handle automatic partial cleanup of struct marshaling entirely within the struct stub. Remove now-unused ValueClassMarshaler. Move DateMarshaler to managed since it doesn't need to be an FCall. Error out on recursive struct definitions in the IL stub generation as we did in the field marshalling. Remove FieldMarshalers and replace with a significantly simpler structure (NativeFieldDescriptor) that stores only the needed info for determining validity in WinRT, blittability, and calling convention classification. This will save 4/8 bytes for every field in a non-auto-layout structure or class loaded into the runtime. Add explicit test for recursive native layout. Allow marshalling as UnmanagedType.Error on all platforms (on non-Windows the behavior matches UnmanagedType.I4). Collapse common primitive marshalling cases together. Disable WinRT parameter/retval-only marshalers in field scenarios. Revert "Collapse common primitive marshalling cases together." This reverts commit e73b78a. Fix error marshalling off Windows for uint. Disable LPStruct marshalling in structs. Disable copy-constructor marshaler in the field scenario. Match error messages between MarshalInfo::MarshalInfo and ParseNativeType in the field scenario. Refactor managed-sequential check out of ParseNativeType. Remove invalid MARSHAL_TYPE_GENERIC_U8 references. Add override specifier. Change ParseNativeType to use MarshalInfo::MarshalInfo to calculate field marshalling info instead of maintaining two native field classification functions. Clean up native field categories. Remove nsenums.h since it is now unused. Move CheckIfDisqualifiedFromManagedSequential to class.cpp. Encapsulate stub flags for struct stubs. Read the BestFitAttribute once for the type instead of per field. Fix perf regression in by-val arrays of non-blittable structures by caching the MethodDesc* to the struct stub in the "managed marshaler" structure. Now we have a perf improvement! Fix memory leak in sig creation. Keep compile-time information for struct stubs around as long as the owning loader allocator (instead of leaking). Allocate the signature on the same heap as the IL stubs so it shares the same lifetime. Fix build with fragile NGen support enabled so as to not break partners. Add missing native field descriptors. Fix clang build. Only assert if we're emitting IL (so we don't assert during type-load). Determine desciptor for pointer-sized fields based on target not host pointer size. Don't emit IL stubs that call struct stubs into R2R images since there's not a good way for us to emit the token. Fix tracing test failures. Force field marshaling to not stackalloc. Cache Sytem.RuntimeMethodInfoStub instances created in the VM in the MethodDesc's owning LoaderAllocator. Struct marshal stubs don't have an MethodDesc context arg. Copy FieldDesc on NFD assignment. Fix initialization of stubMethodInfoCache lock owner. Fix alignment calculation of decimal fields and decimal array fields in NFDs. Fix Crst leveling. Enable handling decimal-as-currency fields accurately off-Windows. Fix deadlock where two threads are both trying to generate an IL stub for the same P/Invoke and one of the parameters needs a struct stub generated. Fix incorrect check for if we need a struct marshal stub in some of the variant/array cases. We never need to promote a field value to 8 bytes. Fix issue with recursive pointer fields. Shortcut blittable types in stubhelpers. Use LDFTN + PREPARE_NONVIRUTAL_CALLSITE_USING_CODE instead of LDTOKEN + GetInternalToken. Revert "Fix Crst leveling." This reverts commit 1d8e56e. Revert "Fix initialization of stubMethodInfoCache lock owner." This reverts commit a095390. Revert "Cache Sytem.RuntimeMethodInfoStub instances created in the VM in the MethodDesc's owning LoaderAllocator." This reverts commit 7266538. Fix case where struct marshal stub is unused in native-only mashalling paths. PR Feedback. Clean up terenary statement in dispatchinfo.cpp Cleanup ILStubResolver::AllocGeneratedIL a little bit.
d493541
to
59bb717
Compare
I've rebased this on master. There were a lot of conflicts from a few months ago that I had to re-resolve as part of that, so a mistake may have snuck in that I didn't catch locally. |
@@ -1301,7 +1304,7 @@ void OleVariant::MarshalBoolVariantOleToCom(VARIANT *pOleVariant, | |||
#endif // FEATURE_COMINTEROP | |||
|
|||
void OleVariant::MarshalBoolArrayOleToCom(void *oleArray, BASEARRAYREF *pComArray, | |||
MethodTable *pInterfaceMT) | |||
MethodTable *pInterfaceMT, PCODE pManagedMarshalerCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pManagedMarshalerCode [](start = 75, length = 21)
It seems very few of these functions are using this parameter. What is the reason for them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions are passed up to the calling code through the Marshaler
struct in olevariant.h
. The design there is to have each field be a function pointer that points to each needed marshalling method. As a result, they all need to have the same signature, even if one of the parameters (such as this one) is unused. It's the same reason that every marshaler in olevariant.cpp has a MethodTable*
parameter when only the interface marshaler uses it.
We can't resolve the struct marshalling stub from the MethodTable*
in the marshaler since the lookup is slow enough to slow down non-blittable structure array marshalling by at least 600%.
Please create a new issue to reduce this argument list. It is close to impossible to understand at instantiation time what all the bools and constants mean. We have Refers to: src/vm/mlinfo.h:456 in 59bb717. [](commit_id = 59bb717, deletion_comment = False) |
Any chance you could keep the function but have it always return FALSE when Refers to: src/vm/mlinfo.h:675 in 59bb717. [](commit_id = 59bb717, deletion_comment = False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am generally okay with this, but do have some concerns I would like to talk about off-line before committing.
…lling, and VT_RECORD is not used in hidden-length array marshalling.
I've tried to do this and it only really allows us to remove the FEATURE_COMINTEROP defines around the error cases in four places. The majority of usages have to stay covered since we don't define the WinRT marshalers when FEATURE_COMINTEROP is turned off. Since most usages still have to be covered with ifdefs, I think we might as well leave it as is. |
Submitted the issue. Link: https://github.com/dotnet/coreclr/issues/27399 |
@jkotas do you have any feedback on this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I want to get this in as early as possible for consumption by as many stakeholders as we can. This can have great benefit to us and teams like WinForms, but is also a concern due to backwards compat. I see the care taken in merging the two methods, so thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not reviewed this in detail, but it looks good to me. Anything that removes 3000 lines without breaking any tests must be good :-)
Linux_musl x64 failure is #26057 |
Currently, our system for marshalling fields of structures between managed and native code is completely separate from our system for marshalling parameters or return values, even though most of the code in the two systems can be shared. This PR unifies the two systems by removing the field marshalers in favor of using IL stubs and a new
NativeFieldDescriptor
concept which is 9 bytes smaller then the oldFieldMarshaler
s.Perf numbers:
CoreCLR.dll's size is reduced by ~30kB on Windows x64.
I wrote some microbenchmarks to benchmark marshalling with various types of structs:
Struct Definitions
The following are the perf numbers I got on Windows x64: (CoreRun Master is a local Release build of CoreCLR on commit 402af7b.)
The allocation inInnerArraySequentialByValue
is aSystem.RuntimeMethodInfoStub
allocated by theJIT_GetRuntimeMethodStub
helper for implementing theldtoken
instruction that loads the token for the nestedInnerSequential
struct IL stub onto the stack in theInnerArraySequential
struct IL stub.In addition to normal CoreCLR testing, I've also run the WinForms integration test suite with this local build of CoreCLR to validate that it doesn't break upstream. I've also run the struct marshalling tests with GCStress modes 3 and C.