This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Jitted Code Pitching Feature implemented #10496

Merged
merged 1 commit into from
Jul 20, 2017

Conversation

@sergign60 sergign60 commented Mar 27, 2017

This pull request proposes a way to resolve issue "Jitted Code Dropping Support" #6757.
Its distinctive features and algorithm are:

1. All of its code is under #if defined(FEATURE_JIT_PITCHING) and does not interfere with other code.
2. The feature is active only if the options INTERNAL_JitPitchEnabled != 0 and INTERNAL_JitPitchMemThreshold > 0.
3. Jitted code can be pitched only for methods that are not Dynamic, FCall or Virtual.
4. If the size of the generated native code exceeds the value of INTERNAL_JitPitchMethodSizeThreshold, the code is placed in a special code heap list. Each heap block in this list stores the code for only one method and is large enough to hold that method's code, aligned to 4K. Pointers to such methods are stored in the "CalledMethods" hash map.
5. If the entry point of a method is backpatched, the method is removed from the "CalledMethods" hash map and stored in the "NotForPitchingMethods" hash map.
6. When the total size of the generated native code exceeds the value of the INTERNAL_JitPitchMemThreshold option, execution of the program is stopped, the stack frames of all threads are inspected, and pointers to the methods being executed are stored in the "ExecutedMethods" hash map.
7. The code for all methods in "CalledMethods" that are not in "ExecutedMethods" is pitched. (All heap blocks for these methods are reset to their initial state and can be reused for newly compiled methods; the code pointers of the non-executed methods are set to NULL.)
8. Once the code for a given method has been pitched, the method is stored in the "NotForPitchingMethods" hash map. If the method is then compiled a second time, it is treated as repeatedly called, so pitching it again is not worthwhile, and the newly compiled code is stored in the regular heap.
9. CoreCLR with this feature is built with the option:

./build.sh cmakeargs -DFEATURE_JIT_PITCHING=true
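
The bookkeeping in steps 4 to 8 can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the CoreCLR implementation: `Method`, `PitchingModel` and all member names are hypothetical, standard containers stand in for the runtime's hash maps, and the set of executing methods is passed in rather than collected from real thread stacks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a runtime method descriptor.
struct Method {
    std::string name;
    std::size_t codeSize = 0;  // size of the jitted code in bytes
    bool hasCode = false;      // false models a NULL code pointer
};

// Hypothetical model of the candidate / not-for-pitching bookkeeping.
class PitchingModel {
public:
    PitchingModel(std::size_t memThreshold, std::size_t methodSizeThreshold)
        : memThreshold_(memThreshold), methodSizeThreshold_(methodSizeThreshold) {}

    // Steps 4 and 8: record freshly jitted code. A method becomes a pitching
    // candidate only if it is large enough and was never excluded before.
    void onJitted(Method& m, std::size_t size) {
        assert(size > 0);
        m.codeSize = size;
        m.hasCode = true;
        totalCode_ += size;
        if (size > methodSizeThreshold_ && notForPitching_.count(&m) == 0)
            candidates_.insert(&m);
    }

    // Step 5: a backpatched method is no longer a candidate.
    void onBackpatched(Method& m) {
        candidates_.erase(&m);
        notForPitching_.insert(&m);
    }

    // Steps 6 and 7: once total code size exceeds the threshold, pitch every
    // candidate that is not in `executing` (the model's "ExecutedMethods").
    // Returns the number of methods pitched.
    std::size_t maybePitch(const std::unordered_set<Method*>& executing) {
        if (totalCode_ <= memThreshold_) return 0;
        std::size_t pitched = 0;
        for (auto it = candidates_.begin(); it != candidates_.end();) {
            Method* m = *it;
            if (executing.count(m) != 0) { ++it; continue; }
            totalCode_ -= m->codeSize;
            m->codeSize = 0;
            m->hasCode = false;         // code pointer reset to NULL
            notForPitching_.insert(m);  // step 8: never pitch it again
            it = candidates_.erase(it);
            ++pitched;
        }
        return pitched;
    }

    std::size_t totalCode() const { return totalCode_; }
    bool isCandidate(Method& m) const { return candidates_.count(&m) != 0; }

private:
    std::size_t memThreshold_;
    std::size_t methodSizeThreshold_;
    std::size_t totalCode_ = 0;
    std::unordered_set<Method*> candidates_;      // "CalledMethods"
    std::unordered_set<Method*> notForPitching_;  // "NotForPitchingMethods"
};
```

In this model a method that is re-jitted after being pitched keeps its code for good, which mirrors the reasoning in step 8 that a method compiled twice is probably hot.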

@sergign60
Author

#6769

@sergign60
Author

Testing of Checked and Debug builds on the CoreCLR test suite with the environment variables

COMPlus_JitDropMemThreshold=1
COMPlus_JitDropMethodSizeThreshold=1
COMPlus_JitDropEnabled=1

does not show any regression

@sergign60 sergign60 force-pushed the jitdrop branch 8 times, most recently from bcf3769 to b4d51a5 Compare March 27, 2017 15:00
@sergign60
Author

@dotnet-bot test Windows_NT x86 Checked Build and Test

@janvorli
Member

CC: @noahfalk

@noahfalk
Member

I haven't had a chance to go through this in depth, I'll try to make time to give it more scrutiny soon (tomorrow?). For now a few thoughts on various areas:

  1. Correctness - I haven't scrutinized carefully, but I suspect there are still race conditions which would cause this code to crash, even if the CoreCLR/CoreFx unit tests aren't stressing the runtime hard enough to hit them. I don't know what error-rate you'd consider acceptable for your usage, but if we wanted to get rigorous someone will need to do a full escape analysis on that code pointer to determine everywhere it might still be stored at the time it is being considered for deletion. For example consider any of the call paths which invoke GetMethodEntryPoint(), GetNativeCode(), GetSingleCallableAddrOfCode(), etc. The code pointer could be sitting in a register or stack location corresponding to that C++ code that will eventually be invoked when the caller returns. The JIT may have no explicit invariant that prevents it from caching the call target read via an indirection cell across a GC safe-point, in which case the code pointer could be stored in jitted code registers too.

  2. Compat with diagnostic tools - I think I already commented on the earlier iteration of this proposal that debuggers, profilers, and ETW all appear likely to fail when the code is deleted. I don't consider it a bar you have to meet to check in, or even to use the feature productively in some scenario of your own, but it would definitely block us from enabling it by default.

  3. Coordination with other jit updating technologies - I've got tiered compilation in progress over in Tiered Compilation step 1 #10478. Initially there are some small areas of overlap that might be good to converge on, such as hooking into MethodDesc::IsPointingToStableNativeCode() and MethodDesc::IsNativeCodeStableAfterInit(). This should let you leverage other work that attempts to handle non-stable code pointers correctly and uniformly. However your requirements to be able to delete code are likely stricter than my requirements for tiered compilation. In your case it is an AV if you call the old code pointer whereas my PR only provides for eventual convergence on the most recent code pointer. As the tiered compilation project moves forward I anticipate there will be more opportunities for useful sharing.

@sergign60
Author

sergign60 commented Mar 28, 2017

@noahfalk
Many thanks for your comments. Right now I'm busy with another project, so I will only be able to look through them this weekend. For now I can only say that I fully agree with point 2.

@noahfalk
Member

It's cool stuff you are working on here @sergign60. I've been taking a closer look through this and had some follow-up comments. IMO the minimum bar to check this in should be:

  1. Current runtime behavior doesn't regress (using the off-by-default ifdef mostly solves this)
  2. It doesn't introduce unnecessary complexity/confusion for other developers likely to work in or around this code in the future.
    I do not think that correctness, completeness, or performance of the implementation needs to be a blocking issue as long as we've gotten it suitably decoupled, though I'm happy to continue pointing out potential issues if you find it useful.

The main thing preventing this from being ready for check-in is a number of places where it adds some confusion/complexity - some are easy to remedy, but others might take a little more legwork:

  1. Throughout our code base the term we usually use for deleting code is 'code pitching', including various places in public documented APIs. It would be good to use matching terminology in this feature.
  2. How about renaming CalledMethods to PitchingCandidateMethods? I think that better reflects the intended usage of that set, no?
  3. We already have a CodeHeap implementation that supports releasing/reallocating methods - HostCodeHeap in dynamicmethod.cpp. Rather than add a new 3rd kind of heap allocation it would be good to understand what it would take to make the existing m_isDynamicDomain in CodeHeapRequestInfo suitable for your needs. This could eliminate a lot of #ifdefs that are otherwise scattered throughout the allocator and code manager.
  4. A lot of the logic in prestub.cpp should probably be moved into a new dedicated file/type, perhaps CodePitchingManager.cpp? This manager and the various sets being tracked should exist per-AppDomain, as AppDomain is the natural container for code in general. I haven't thought far enough ahead to figure out if this is likely to be a long-term stable location for this functionality, but it seems reasonably compartmentalized for now.
  5. Can we avoid snaking an extra argument through MethodDesc::DoPrestub and MethodDesc::MakeJitWorker? It appears it roughly corresponds to the define FEATURE_JIT_DROPPING being true, so I am hoping extra runtime arguments aren't necessary.
  6. In method.cpp we should try to avoid adding yet more complexity to TryGetMultiCallableAddrOfCode. In my PR I'm about to change IsPointingToNativeCode() -> IsPointingToStableNativeCode() and it would be a natural fit to mark pitchable methods as always returning FALSE from that method. In that second case it appears this PR created a condition in which the method claims to have a stable entry point but it doesn't actually have a stable entry point. Rather than redefining the meaning of HasStableEntryPoint() it's probably better to modify the logic elsewhere so that we actually give those methods stable entry points (with a fixup precode) or we avoid ever marking the entry point as stable.
  7. It would be good to add some comments describing the current state of the feature to the code, so that other devs encountering it are able to form the right expectations about it.

@sergign60
Author

@noahfalk thanks for your comments. I'll look at them more closely this coming weekend.

@sergign60 sergign60 changed the title Jitted Code Dropping Feature implemented Jitted Code Ptiching Feature implemented Apr 8, 2017
@sergign60 sergign60 changed the title Jitted Code Ptiching Feature implemented Jitted Code Pitching Feature implemented Apr 8, 2017
@sergign60
Author

@noahfalk thanks again for your comments. As you can see, I've done the renaming.
Regarding 3, 5 and 6: sure, I'll think through these options.

@sergign60 sergign60 force-pushed the jitdrop branch 2 times, most recently from f518059 to f2677ff Compare April 10, 2017 14:02
@sergign60
Author

@noahfalk
About 5: as I see it, another way is to use CORJIT_FLAGS (+CORJIT_FLAG_JIT_PITCHING=41) and set it in PreStubWorker. But this way requires an additional argument too. What do you think?

@sergign60 sergign60 force-pushed the jitdrop branch 4 times, most recently from f9a9dc4 to 88dd36d Compare April 14, 2017 09:17
@sergign60 sergign60 force-pushed the jitdrop branch 5 times, most recently from 1238b23 to d656778 Compare June 25, 2017 21:27
@sergign60
Author

sergign60 commented Jun 26, 2017

@noahfalk @janvorli
I've removed some special cases from the avoided methods. I've also fixed the statistics calculation; the previous data were wrong :( Now most of the tests report a percentage above 50%.

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : ./coreclr/bin/Product/Linux.x64.Debug
# Tests Discovered : 7104
# Passed           : 6394
# Failed           : 0
# Skipped          : 710
=======================

@noahfalk
Member

I'm happy 👍

@janvorli - I know you are pretty busy at the moment. Could you let us know if you intended to review this, if you'd like me to find someone else in your stead, or you want to go ahead and merge this without further review? Thanks!

@janvorli
Member

@noahfalk I'd like to find some time and review it too. I hope I'll be able to do that tomorrow morning.

RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchEnabled, W("JitPitchEnabled"), (DWORD)0, "Set it to 1 to enable Jit Pitching")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchMemThreshold, W("JitPitchMemThreshold"), (DWORD)0, "Pitching jits when code heap usage is larger than this (in bytes)")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchMethodSizeThreshold, W("JitPitchMethodSizeThreshold"), (DWORD)0, "Pitching jit for methods whose native code size larger than this (in bytes)")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchMaxLevel, W("JitPitchMaxLevel"), (DWORD)0, "Pitching jits for all methods as it possible")
Member

I don't understand the description.

RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchMethodSizeThreshold, W("JitPitchMethodSizeThreshold"), (DWORD)0, "Pitching jit for methods whose native code size larger than this (in bytes)")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchMaxLevel, W("JitPitchMaxLevel"), (DWORD)0, "Pitching jits for all methods as it possible")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchTimeInterval, W("JitPitchTimeInterval"), (DWORD)0, "Time interval between jit pitchings in ms")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitPitchPrintStat, W("JitPitchPrintStat"), (DWORD)0, "Print statistics about Jit Pitching")
Member

A nit - can you please unify the casing of the "pitching jit" in the descriptions? At some places it is "Jit Pitching", at others "jit pitching".

(CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitPitchMemThreshold) == 0))
return FALSE;

if (this == NULL)
Member

I would prefer not to introduce more checks of this against NULL, which go against the C++ standard. We have the warning for such comparisons disabled because it would be difficult to modify historical code in some places to avoid them, but it would be better not to add more of them.
The C++ standard says that calling a method with a null this pointer causes undefined behavior.
Could you please add NULL checks at the call sites where such checks are needed instead?
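
A minimal sketch of the suggested shape, with the null check moved to the caller. `MethodDescModel` and both free functions are hypothetical stand-ins for illustration, not the CoreCLR `MethodDesc` API.

```cpp
#include <cassert>

// Hypothetical stand-in for MethodDesc; not the CoreCLR type.
struct MethodDescModel {
    bool isDynamic = false;
    bool isFCall = false;

    // No `this == NULL` test in here: calling a member function through a
    // null pointer is already undefined behavior, so a compiler is allowed
    // to assume `this` is non-null and drop such a check.
    bool IsPitchable() const { return !isDynamic && !isFCall; }
};

// A call site that may legitimately see a null pointer performs the check.
bool ShouldConsiderForPitching(const MethodDescModel* pMD) {
    return pMD != nullptr && pMD->IsPitchable();
}

// A call site that knows the pointer is valid can state it as a precondition.
bool IsPitchableChecked(const MethodDescModel* pMD) {
    assert(pMD != nullptr);
    return pMD->IsPitchable();
}
```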

MethodDesc* pMD = pCf->GetFunction();

// Filter out methods we don't care about
if (pMD == nullptr || !pMD->IsPitchable())
Member

A nit - your change uses nullptr and NULL kind of randomly. Could you please pick one of those and use it in your new code for the sake of consistency?

if ((CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitPitchEnabled) != 0) &&
(CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitPitchMemThreshold) != 0) &&
(CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitPitchTimeInterval) == 0 ||
((::GetTickCount() - s_JitPitchLastTick) > CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitPitchTimeInterval))))
Member

If the system has been running for more than 49.7 days, the GetTickCount result wraps around, and so this interval measurement will stop working.
Using GetTickCount64 would probably be the easiest fix.
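
The difference between a 32-bit and a 64-bit millisecond counter can be sketched portably. The example below uses `std::chrono::steady_clock` instead of the Win32 calls, and the function names are hypothetical. It shows one failure case for the truncated counter: when more than 2^32 ms pass between two checks, the 32-bit difference aliases to a small value.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Milliseconds from a monotonic clock as a 64-bit value (a portable
// stand-in for a 64-bit tick count).
std::uint64_t NowMs() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count());
}

// 64-bit comparison: true once more than intervalMs has passed since lastMs.
bool IntervalElapsed64(std::uint64_t nowMs, std::uint64_t lastMs,
                       std::uint64_t intervalMs) {
    return (nowMs - lastMs) > intervalMs;
}

// Same test on counters truncated to 32 bits, which models a tick count
// that wraps after 2^32 ms (about 49.7 days).
bool IntervalElapsed32(std::uint64_t nowMs, std::uint64_t lastMs,
                       std::uint32_t intervalMs) {
    const std::uint32_t now32 = static_cast<std::uint32_t>(nowMs);
    const std::uint32_t last32 = static_cast<std::uint32_t>(lastMs);
    return static_cast<std::uint32_t>(now32 - last32) > intervalMs;
}
```

With `lastMs = 1000` and `nowMs = 1000 + 2^32 + 5`, the 64-bit version sees a very long gap, while the 32-bit version sees a difference of only 5 ms.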

pMD->PitchNativeCode();
}
}
for (PtrHashMap::PtrIterator i = s_pExecutedMethods->begin(); !i.end();)
Member

A nit - can you please reuse the i from the above? It is confusing in the debugger when there are two variables of the same name.

if (s_pPitchingCandidateMethods == NULL)
{
SimpleWriteLockHolder swlh(s_pPitchingCandidateMethodsLock);
if (s_pPitchingCandidateMethods == NULL)
Member

The lazy initialization of all the maps and locks makes the code harder to read, obscuring the real functionality. It doesn't seem to save much. It seems it would be better to just allocate all of them in a single initialization function where you would know that the pitching is active. Did you have any particular reason for using the lazy allocations?
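
One possible shape for such a single initialization function, as a rough sketch only. `PitchingState`, `InitPitching` and `AddCandidate` are hypothetical names, standard containers and `std::mutex` stand in for the runtime's maps and locks, and `InitPitching` is assumed to run once during startup before any other thread touches the state.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_set>

// All pitching bookkeeping lives in one object, so it is allocated in one place.
struct PitchingState {
    std::mutex lock;
    std::unordered_set<const void*> candidates;
    std::unordered_set<const void*> executed;
    std::unordered_set<const void*> notForPitching;
};

static std::unique_ptr<PitchingState> g_pitchingState;

// Allocates everything at once, and only when pitching is actually enabled.
void InitPitching(bool enabled, unsigned memThreshold) {
    if (enabled && memThreshold != 0)
        g_pitchingState = std::make_unique<PitchingState>();
}

// Users only test whether the feature is active; no per-map lazy creation.
// Returns true if the method was newly added as a candidate.
bool AddCandidate(const void* method) {
    if (!g_pitchingState) return false;
    std::lock_guard<std::mutex> hold(g_pitchingState->lock);
    return g_pitchingState->candidates.insert(method).second;
}
```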

MethodDesc *pMD = (MethodDesc *) i.GetValue();
UPTR key = (UPTR)GetFullHash(pMD);
++i;
s_pExecutedMethods->DeleteValue(key, (LPVOID)pMD);
Member

Is it really ok to delete values from the hash map during the iteration?
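
For comparison, this is how deletion during iteration is usually written for `std::unordered_map`, where `erase` invalidates only the iterator to the erased element and returns the next valid one. Whether CoreCLR's `PtrHashMap` gives the same guarantee is exactly the open question here, so the snippet is an analogy, not a statement about `PtrHashMap`. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Removes every entry whose value is below minValue and returns how many
// entries were removed.
std::size_t EraseBelow(std::unordered_map<int, int>& map, int minValue) {
    std::size_t removed = 0;
    for (auto it = map.begin(); it != map.end();) {
        if (it->second < minValue) {
            it = map.erase(it);  // erase returns the iterator after the removed entry
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```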

{
if (pMD->IsPitchable())
{
if (sizeOfCode > 0 && CLRConfig::GetConfigValue(CLRConfig::INTERNAL_JitPitchMethodSizeThreshold) < sizeOfCode)
Member

The first part of the condition is not necessary, the second would never pass for sizeOfCode being zero.
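
A quick check of that equivalence, assuming both the threshold and `sizeOfCode` are unsigned values (the threshold is a DWORD config value; the type of `sizeOfCode` is an assumption here). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Condition as written in the diff.
bool ExceedsOriginal(std::uint32_t threshold, std::uint32_t sizeOfCode) {
    return sizeOfCode > 0 && threshold < sizeOfCode;
}

// Simplified condition: no unsigned threshold is less than 0, so
// sizeOfCode == 0 can never satisfy it.
bool ExceedsSimplified(std::uint32_t threshold, std::uint32_t sizeOfCode) {
    return threshold < sizeOfCode;
}
```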

@sergign60
Author

sergign60 commented Jun 28, 2017

@janvorli
CC: @noahfalk
Thanks a lot for your comments. They are very useful. I think I've addressed all of them.

{
SimpleWriteLockHolder swlh(s_pPitchingCandidateMethodsLock);
s_pPitchingCandidateMethods->DeleteValue(key, (LPVOID)this);
s_pPitchingCandidateMethods->Compact();
Member

@sergign60 One last detail I wonder about - do we need to run Compact after each DeleteValue? Does the s_pExecutedMethods->Clear() call happen so rarely that we care about the hash map not being compact?

Author

@janvorli I think it's not needed. I'll remove it.

Author

@janvorli I need some time for testing

@sergign60 sergign60 force-pushed the jitdrop branch 2 times, most recently from e2a8c87 to 5dfaf38 Compare June 30, 2017 20:34
@sergign60
Author

sergign60 commented Jun 30, 2017

@noahfalk @janvorli
The latest CoreCLR test suite result:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : ./coreclr/bin/Product/Linux.x64.Debug
# Tests Discovered : 7107
# Passed           : 6397
# Failed           : 0
# Skipped          : 710
=======================
39 minutes and 2 seconds taken to run CoreCLR tests.
$ env | grep COMPl
COMPlus_JitPitchMemThreshold=1
COMPlus_JitPitchMethodSizeThreshold=1
COMPlus_JitPitchEnabled=1
COMPlus_JitPitchPrintStat=1

@sergign60 sergign60 force-pushed the jitdrop branch 3 times, most recently from 0310171 to 1ee7be6 Compare July 4, 2017 07:58
@noahfalk
Member

@sergign60 - sorry I had forgotten about this for a while, are you ready to have this merged now? I think earlier there were some CI issues but it appears you got those addressed and both JanV and I are happy with it.

I have another PR that is going to have merge conflicts so I'd like to get yours in ASAP and then I can merge on top of it. Otherwise if yours doesn't go in now you will need to do some merge work after my PR commits.

@sergign60
Author

@noahfalk yes, I'm ready to merge it

@noahfalk noahfalk merged commit f21db50 into dotnet:master Jul 20, 2017
@sergign60
Author

@noahfalk many thanks for your help!

@noahfalk
Member

@sergign60 - very welcome and thanks for all your work : )
