-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial spec for the meaning and relationships between data contracts #99936
Changes from all commits
f47c1a9
43e8b20
5930477
d28135f
3ffef6c
3b6b83c
42dedca
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
# Contract GCHandle | ||
|
||
This contract allows decoding and reading of GCHandles. This will also include handle enumeration in the future | ||
|
||
## Data structures defined by contract | ||
``` csharp | ||
struct DacGCHandle | ||
{ | ||
DacGCHandle(TargetPointer value) { Value = value; } | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Its hard to tell what level of indirection we are working with. When I think of GCHandle in the runtime I imagine:
But here I think you are defining Value as the Object**? Two things that might help:
I think we'd end up with something like:
Reading further below I think there is also a discrepancy whether types in this section were intended to represent runtime types or logical abstractions. If these are intended to be abstractions then I'd suggest either define the APIs as member functions of this type or use |
||
TargetPointer Value; | ||
} | ||
``` | ||
|
||
## Apis of contract | ||
``` csharp | ||
TargetPointer GetObject(DacGCHandle gcHandle); | ||
``` | ||
|
||
## Version 1 | ||
|
||
``` csharp | ||
TargetPointer GetObject(DacGCHandle gcHandle) | ||
{ | ||
if (gcHandle.Value == TargetPointer.Null) | ||
return TargetPointer.Null; | ||
return Target.ReadTargetPointer(gcHandle.Value); | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
# Contract SList | ||
|
||
This contract allows reading and iterating over an SList data structure. | ||
|
||
## Data structures defined by contract | ||
``` csharp | ||
class SListReader | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The header here claimed we were defining data structures, but this SListReader is type with no member fields so it doesn't feel like it belongs here. It appears to be part of the implementation that describes API behavior. Upon reading further below I noticed we are probably using the same terminology for different things. I'm not tied to my terminology, but in total I think we have at least three things that ideally will be clearly distinguished:
|
||
{ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think SList is unnecessary ceremony as a contract. Can we just have a next pointer in the Thread and a global that points to the first thread? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I disagree. The idea of having a concept of how a linked list works that is external to the definition of any particular type works with it is quite useful. I don't expect that consumers of the cDAC will want to use the Although, looking a bit more closely at this particular use case, we might actually want to consider a doubly linked list instead of a singly linked list. The current model has a problem that cleaning up N threads has a O(N^2) cost instead of an O(N) cost since this is singly linked. Finally, for V1 I'm describing the current the CoreCLR implementation of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My experience has been that any abstractions for linked list that are more than a trivial helper function or a macro are useless. I agree with you that an abstraction like this would be useful for hashtables. |
||
public abstract TargetPointer GetHead(TargetPointer slistPointer); | ||
public abstract TargetPointer GetNext(TargetPointer entryInSList); | ||
public IEnumerator<TargetPointer> EnumerateList(TargetPointer slistPointer) | ||
{ | ||
TargetPointer current = GetHead(slistPointer); | ||
|
||
while (current != TargetPointer.Null) | ||
{ | ||
yield return current; | ||
current = GetNext(current); | ||
} | ||
} | ||
public IEnumerator<TargetPointer> EnumerateListFromEntry(TargetPointer entryInSList) | ||
{ | ||
TargetPointer current = entryInSList; | ||
|
||
while (current != TargetPointer.Null) | ||
{ | ||
yield return current; | ||
current = GetNext(current); | ||
} | ||
} | ||
} | ||
``` | ||
|
||
## Apis of contract | ||
``` csharp | ||
SListReader GetReader(string typeOfDataStructure); | ||
``` | ||
|
||
## Version 1 | ||
|
||
``` csharp | ||
private class SListReaderV1 : SListReader | ||
{ | ||
uint _offsetToSLinkField; | ||
Target Target; | ||
|
||
SListReaderV1(Target target, string typeToEnumerate) | ||
{ | ||
Target = target; | ||
_offsetToSLinkField = Target.Contracts.GetFieldLayout(typeToEnumerate, "m_Link").Offset; | ||
} | ||
public override TargetPointer GetHead(TargetPointer slistPointer) | ||
{ | ||
TargetPointer headPointer = new SListBase(Target, slistPointer).m_pHead; | ||
TargetPointer slinkInHeadObject = new SLink(Target, headPointer).m_pNext; | ||
if (slinkInHeadObject == TargetPointer.Null) | ||
return TargetPointer.Null; | ||
return slinkInHeadObject - _offsetToSLinkField; | ||
} | ||
|
||
public override TargetPointer GetNext(TargetPointer entryInSList) | ||
{ | ||
if (entryInSList == TargetPointer.Null) | ||
throw new ArgumentException(); | ||
|
||
TargetPointer slinkPointer = entryInSList + _offsetToSLinkField; | ||
TargetPointer slinkInObject = new SLink(Target, slinkPointer).m_pNext; | ||
if (slinkInObject == TargetPointer.Null) | ||
return TargetPointer.Null; | ||
return slinkInHeadObject - _offsetToSLinkField; | ||
} | ||
} | ||
|
||
SListReader GetReader(string typeOfDataStructure) | ||
{ | ||
return new SListReaderV1(typeOfDataStructure); | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,195 @@ | ||
# Contract Thread | ||
|
||
This contract is for reading and iterating the threads of the process. | ||
|
||
## Data structures defined by contract | ||
``` csharp | ||
record struct DacThreadStoreData ( | ||
int ThreadCount, | ||
TargetPointer FirstThread, | ||
TargetPointer FinalizerThread, | ||
TargetPointer GcThread); | ||
|
||
record struct DacThreadStoreCounts ( | ||
int UnstartedThreadCount, | ||
int BackgroundThreadCount, | ||
int PendingThreadCount, | ||
int DeadThreadCount); | ||
Comment on lines
+7
to
+17
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should these rather be together in the "Apis of contract" section? These data structures are not data structures found in the target process. They are here just to support the "Apis of contract". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd like to do that as a follow on PR focused on that one question. My general thought, is sure, we can do that. I had envisioned the "types" section as always being contract api types, and not having anything to do with runtime types at all, but I think its an interesting question as to whether or not we should have a types section that refers to the types the contract might use. That's different, and implies that the types section should be specified per version. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, added to #100162 |
||
|
||
enum ThreadState | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For flags like this I think we should make a decision about the likelihood we'd ever want these values to change. If we anticipate they may change then we probably want to define their values as part of encoded data in process memory. If we'd be willing to take a breaking change to diagnostic tools in order to bump the contract version and modify their values than documenting them as constants is fine. I have no objection to these particular flags being constants, I just wanted to raise the broader issue because I suspect we won't want all enumerations to be documentation constants. For example I believe the NativeAOT MethodTable flags have been shifting in value over the past few years. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @noahfalk, handling of enums and varying values between implementations is a problem. At the moment, since this enum is defined in the API portion of the contract, it does not represent the enum values as are present in the runtime at all. Instead it represents the enum values that the contract exposes. I really should have named this Also, as @jkotas noted earlier, we probably shouldn't be exposing this complete set of enum values from the contracts. Instead we should have a restricted set that we believe is likely to be kept for the longer term exposed from the contracts, and prevent the contract from allowing visibility into other values held in the corresponding field in the runtime data structure. We have the luxury of the contract api not being a public, stable api, so if we guess wrong on the set of exposed flags, we can adjust that, but it will still be somewhat of a pain to do that. |
||
{ | ||
TS_Unknown = 0x00000000, // threads are initialized this way | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should review these flags. Number of them do not make sense to include as a contract. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good idea. |
||
|
||
TS_AbortRequested = 0x00000001, // Abort the thread | ||
|
||
TS_GCSuspendPending = 0x00000002, // ThreadSuspend::SuspendRuntime watches this thread to leave coop mode. | ||
TS_GCSuspendRedirected = 0x00000004, // ThreadSuspend::SuspendRuntime has redirected the thread to suspention routine. | ||
TS_GCSuspendFlags = TS_GCSuspendPending | TS_GCSuspendRedirected, // used to track suspension progress. Only SuspendRuntime writes/resets these. | ||
|
||
TS_DebugSuspendPending = 0x00000008, // Is the debugger suspending threads? | ||
TS_GCOnTransitions = 0x00000010, // Force a GC on stub transitions (GCStress only) | ||
|
||
TS_LegalToJoin = 0x00000020, // Is it now legal to attempt a Join() | ||
|
||
TS_ExecutingOnAltStack = 0x00000040, // Runtime is executing on an alternate stack located anywhere in the memory | ||
|
||
TS_Hijacked = 0x00000080, // Return address has been hijacked | ||
|
||
// unused = 0x00000100, | ||
TS_Background = 0x00000200, // Thread is a background thread | ||
TS_Unstarted = 0x00000400, // Thread has never been started | ||
TS_Dead = 0x00000800, // Thread is dead | ||
|
||
TS_WeOwn = 0x00001000, // Exposed object initiated this thread | ||
TS_CoInitialized = 0x00002000, // CoInitialize has been called for this thread | ||
|
||
TS_InSTA = 0x00004000, // Thread hosts an STA | ||
TS_InMTA = 0x00008000, // Thread is part of the MTA | ||
|
||
// Some bits that only have meaning for reporting the state to clients. | ||
TS_ReportDead = 0x00010000, // in WaitForOtherThreads() | ||
TS_FullyInitialized = 0x00020000, // Thread is fully initialized and we are ready to broadcast its existence to external clients | ||
|
||
TS_TaskReset = 0x00040000, // The task is reset | ||
|
||
TS_SyncSuspended = 0x00080000, // Suspended via WaitSuspendEvent | ||
TS_DebugWillSync = 0x00100000, // Debugger will wait for this thread to sync | ||
|
||
TS_StackCrawlNeeded = 0x00200000, // A stackcrawl is needed on this thread, such as for thread abort | ||
// See comment for s_pWaitForStackCrawlEvent for reason. | ||
|
||
// unused = 0x00400000, | ||
|
||
// unused = 0x00800000, | ||
TS_TPWorkerThread = 0x01000000, // is this a threadpool worker thread? | ||
|
||
TS_Interruptible = 0x02000000, // sitting in a Sleep(), Wait(), Join() | ||
TS_Interrupted = 0x04000000, // was awakened by an interrupt APC. !!! This can be moved to TSNC | ||
|
||
TS_CompletionPortThread = 0x08000000, // Completion port thread | ||
|
||
TS_AbortInitiated = 0x10000000, // set when abort is begun | ||
|
||
TS_Finalized = 0x20000000, // The associated managed Thread object has been finalized. | ||
// We can clean up the unmanaged part now. | ||
|
||
TS_FailStarted = 0x40000000, // The thread fails during startup. | ||
TS_Detached = 0x80000000, // Thread was detached by DllMain | ||
} | ||
|
||
record struct DacThreadData ( | ||
uint ThreadId; | ||
TargetNUint OsThreadId; | ||
ThreadState State; | ||
bool PreemptiveGCDisabled | ||
TargetPointer AllocContextPointer; | ||
TargetPointer AllocContextLimit; | ||
TargetPointer Frame; | ||
TargetPointer FirstNestedException; | ||
TargetPointer TEB; | ||
DacGCHandle LastThrownObjectHandle; | ||
TargetPointer NextThread; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this have a GCHandle for the managed thread object? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, probably. I modeled this on the GetThreadData SOS api which does not expose that, but it makes sense to provide along with the rest of these things. |
||
); | ||
``` | ||
|
||
## Apis of contract | ||
``` csharp | ||
DacThreadStoreData GetThreadStoreData(); | ||
DacThreadStoreCounts GetThreadCounts(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It feels a little odd to me that we'd be doing this grouping of ThreadStoreData and ThreadStoreCounts separately. I'd suggest we constrain the abstractions in our logical layer to either be 1:1 with some runtime type, or abstractions that are described in the ECMA CLI doc. In this particular case I think we should define a ThreadStore entity and put a bunch of properties on it for whatever values we believe should be exposed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I would agree with that if the runtime types had a reasonably clean design. The unmanaged Thread in CoreCLR is opposite of that. For historic reasons, Thread contained every thread-local variable ever used by the unmanaged runtime so it has been a dumping ground for a lot of unrelated stuff. Jeremy chipped away on it a little bit in #99989. I think we need to some plan to clean this up so that we do not end up faulting a ton of cruft into the contracts. One possible way to do that is to make some of the contracts identical between CoreCLR, NativeAOT and possibly Mono and refactor the code accordingly. Contracts that are identical between CoreCLR, NativeAOT and possibly Mono have a high chance of being properly factored. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
A general issue here is describing what we want with what we have. This model creates a fair bit of friction with by codifying current asperations instead of reality and permitted evolution with where we want to go. Either we should focus on the asperation and move in that direction or accept the current reality and model it as such while letting the versioning scheme grow as we improve. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the data contract project is going to fail if we accept the current reality and try to create contracts from it. We do not need to cleanup every little detail, but I think fair bit of cleanup and refactoring is required for the data contract project to succeed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I disagree with that assessment, but I accept I am missing some nuance of this approach that forces a cleaner implementation.
I see. It sounds like there is a prep work then with this effort. Has that specifically been identified and accounted for in the work for the cDAC? As in do we have a "simplication" work item for specific areas of the runtime we expect to focus on for a .NET 9 deliverable? I think it is fine if it is implied in the work item, but it should be an explicit expectation somewhere. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I wasn't claiming that we are stuck with the shape of the runtime as it exists today. If we want to change the runtime during .NET 9, then ship a contract that is 1:1 with that new runtime shape that exists when we ship .NET 9 that is fine by me. |
||
DacThreadData GetThreadData(TargetPointer threadPointer); | ||
TargetPointer GetNestedExceptionInfo(TargetPointer nestedExceptionPointer, out TargetPointer nextNestedException); | ||
TargetPointer GetManagedThreadObject(TargetPointer threadPointer); | ||
``` | ||
|
||
## Version 1 | ||
|
||
|
||
|
||
``` csharp | ||
SListReader ThreadListReader = Contracts.SList.GetReader("Thread"); | ||
|
||
DacThreadStoreData GetThreadStoreData() | ||
{ | ||
TargetPointer threadStore = Target.ReadGlobalTargetPointer("s_pThreadStore"); | ||
var runtimeThreadStore = new ThreadStore(Target, threadStore); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think ThreadStore and Thread in this code represent the physical model structures although they were never defined. I expect we will need to give an explicit definition of these types and their field lists somewhere in the documentation. At minimum the documentation needs to include the default field offsets for every field, otherwise the reader will not know the correct offset whenever an explicit offset wasn't encoded. |
||
|
||
TargetPointer firstThread = ThreadListReader.GetHead(runtimeThreadStore.SList.Pointer); | ||
|
||
return new DacThreadStoreData( | ||
ThreadCount : runtimeThreadStore.m_ThreadCount, | ||
FirstThread: firstThread, | ||
FinalizerThread: Target.ReadGlobalTargetPointer("g_pFinalizerThread"), | ||
GcThread: Target.ReadGlobalTargetPointer("g_pSuspensionThread")); | ||
} | ||
|
||
DacThreadStoreCounts GetThreadCounts() | ||
{ | ||
TargetPointer threadStore = Target.ReadGlobalTargetPointer("s_pThreadStore"); | ||
var runtimeThreadStore = new ThreadStore(Target, threadStore); | ||
|
||
return new DacThreadStoreCounts( | ||
ThreadCount : runtimeThreadStore.m_ThreadCount, | ||
UnstartedThreadCount : runtimeThreadStore.m_UnstartedThreadCount, | ||
BackgroundThreadCount : runtimeThreadStore.m_BackgroundThreadCount, | ||
PendingThreadCount : runtimeThreadStore.m_PendingThreadCount, | ||
DeadThreadCount: runtimeThreadStore.m_DeadThreadCount, | ||
Comment on lines
+132
to
+135
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
These are BCL concepts that we have not moved to C# yet. I am not sure whether it makes sense to have them in the low-level thread contract. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll split those out into a separate api then. It so happens that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, these are not currently only BCL concepts, as the iterators we have over threads depend on them. I'm not 100% certain they can just be moved to managed without consequence. It would be nice though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, I have struggled with random pieces of information returned by DAC SOS APIs many times when doing significant changes in the runtime (and often ended up breaking them in the process without anybody caring). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I usually think of our diagnostics in two categories:
I think we have a lot of lattitude to do takebacks on the latter but need to be much more cautious on the former. |
||
} | ||
|
||
DacThreadData GetThreadData(TargetPointer threadPointer) | ||
{ | ||
var runtimeThread = new Thread(Target, threadPointer); | ||
|
||
TargetPointer firstNestedException = TargetPointer.Null; | ||
if (Target.ReadGlobalInt32("FEATURE_EH_FUNCLETS")) | ||
{ | ||
if (runtimeThread.m_ExceptionState.m_pCurrentTracker != TargetPointer.Null) | ||
{ | ||
firstNestedException = new ExceptionTrackerBase(Target, runtimeThread.m_ExceptionState.m_pCurrentTracker).m_pPrevNestedInfo; | ||
} | ||
} | ||
else | ||
{ | ||
firstNestedException = runtimeThread.m_ExceptionState.m_currentExInfo.m_pPrevNestedInfo; | ||
} | ||
|
||
return new DacThread( | ||
ThreadId : runtimeThread.m_ThreadId, | ||
OsThreadId : (OsThreadId)runtimeThread.m_OSThreadId, | ||
State : (ThreadState)runtimeThread.m_State, | ||
PreemptiveGCDisabled : thread.m_fPreemptiveGCDisabled != 0, | ||
AllocContextPointer : thread.m_alloc_context.alloc_ptr, | ||
AllocContextLimit : thread.m_alloc_context.alloc_limit, | ||
Frame : thread.m_pFrame, | ||
TEB : thread.Has_m_pTEB ? thread.m_pTEB : TargetPointer.Null, | ||
LastThreadObjectHandle : new DacGCHandle(thread.m_LastThrownObjectHandle), | ||
FirstNestedException : firstNestedException, | ||
NextThread : ThreadListReader.GetHead.GetNext(threadPointer) | ||
); | ||
} | ||
|
||
TargetPointer GetNestedExceptionInfo(TargetPointer nestedExceptionPointer, out TargetPointer nextNestedException) | ||
{ | ||
if (nestedExceptionPointer == TargetPointer.Null) | ||
{ | ||
throw new InvalidArgumentException(); | ||
} | ||
if (Target.ReadGlobalInt32("FEATURE_EH_FUNCLETS")) | ||
{ | ||
var exData = new ExceptionTrackerBase(Target, nestedExceptionPointer); | ||
nextNestedException = exData.m_pPrevNestedInfo; | ||
return Contracts.GCHandle.GetObject(exData.m_hThrowable); | ||
} | ||
else | ||
{ | ||
var exData = new ExInfo(Target, nestedExceptionPointer); | ||
nextNestedException = exData.m_pPrevNestedInfo; | ||
return Contracts.GCHandle.GetObject(exData.m_hThrowable); | ||
} | ||
} | ||
|
||
TargetPointer GetManagedThreadObject(TargetPointer threadPointer) | ||
{ | ||
var runtimeThread = new Thread(Target, threadPointer); | ||
return Contracts.GCHandle.GetObject(new DacGCHandle(runtimeThread.m_ExposedObject)); | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think that GCHandle enumeration is going to be separate from GCHandle dereferencing.
GCHandle is very simple and it is needed for minidump analysis.
GCHandle enumeration is much more involved and it is not needed for minidump analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. I can easily see stopping at supporting only individual
GCHandle
here, and defining the enumerators separately. We may want to provide the ability to determine what kind of GCHandle
, and possibly the various other characteristics of a non-boringGCHandle
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The split that I'm guessing would work well is:
Fwiw I don't recall that ICorDebug or SOS has any exposure of GCHandle information that isn't combined with enumerating them first.
If you do want to split GCHandle additional state from GCHandle enumeration I'd be fine with it, but at that level of granularity I'm guessing we'd looking at 30-50 contracts rather than 5-20.