-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve call counting mechanism #1457
Changes from all commits
f36f0dd
3ca0185
39a025f
62f1d21
ef02925
348772d
76a0b20
9fddf98
2ba422e
6aa3c70
c32a562
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -330,11 +330,7 @@ to update the active child at either of those levels (ReJIT uses SetActiveILCode | |
In order to do step 3 the `CodeVersionManager` relies on one of three different mechanisms, a `FixupPrecode`, a `JumpStamp`, or backpatching entry point slots. In [method.hpp](https://github.com/dotnet/coreclr/blob/master/src/vm/method.hpp) these mechanisms are described in the `MethodDesc::IsVersionableWith*()` functions, and all methods have been classified to use at most one of the techniques, based on the `MethodDesc::IsVersionableWith*()` functions. | ||
|
||
### Thread-safety ### | ||
CodeVersionManager is designed for use in a free-threaded environment, in many cases by requiring the caller to acquire a lock before calling. This lock can be acquired by constructing an instance of the | ||
|
||
``` | ||
CodeVersionManager::TableLockHolder(CodeVersionManager*) | ||
``` | ||
CodeVersionManager is designed for use in a free-threaded environment, in many cases by requiring the caller to acquire a lock before calling. This lock can be acquired by constructing an instance of `CodeVersionManager::LockHolder`. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ideally the commit description wouldn't need to be so large and all the text that describes how the feature works is added to appropriate places in the docs, either for code versioning, for tiered compilation, or a brand new doc specific to call counting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll fold some summary-level comments from the description into the code |
||
in some scope for the CodeVersionManager being operated on. CodeVersionManagers from different domains should not have their locks taken by the same thread with one exception, it is OK to take the shared domain manager lock and one AppDomain manager lock in that order. The lock is required to change the shape of the tree or traverse it but not to read/write configuration properties from each node. A few special cases: | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments did a great job sketching out the design you ended up with, but I think the rationale as to why you arrived at this design as opposed to something different could be equally illuminating. Typically before making a change of this scale there would be some discussion about options or a write up in a doc so I'm not sure if I just missed that? At this point I am not expecting that we'd make large deviations from the approach in this PR unless we found a serious issue given how much effort I assume you have invested in it and the perf gains. However it would still be useful to understand among the other design options what has already been eliminated via thought experiment or performance testing and what is still interesting to experiment with in the future.
Some design alternatives that come to mind:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing. There hasn't been a doc, most of the discussions happened in-person.
The most expensive parts of reaching the call count threshold are:
#1 and #2 are now done in the background, otherwise they were showing up in the spike when methods reach the call count threshold. Doing or avoiding #2 is a tradeoff between some background work and some foreground work, I wasn't particularly trying to change the way it works currently and favored to avoid the extra foreground work.
Will talk about this separately
Separate stubs allow counting only tiered methods and at specific times. Changing an existing stub like a precode would increase the size unnecessarily.
At the moment it doesn't make much difference. If there were to be a tier 0.5 in the future then it would probably want to be counted separately from tier 0, and due to the unlocked counting it's not easy to deterministically reset the remaining call count. The stub could be per-MethodDesc instead but that would entail making it larger and slower such that it's patchable.
I did fine-grained synchronously first and after seeing the large spikes, changed it to coarse-grained asynchronous. It's not very fine-grained though, methods typically reach the call count threshold in bursts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to add, I'm not familiar with Andy's JIT prototypes for OSR, but it may make sense to count calls in the jitted code for that. There is direct compensation in that jitting methods containing loops at tier 0 would decrease startup time, and since it's a loop it likely will be called at least a few times anyway and may be more worth optimizing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kouvel! My goal was to capture your existing thinking on the topic while it is fresh and that has now been done : )