Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search native code in all R2R ni.dll in version bubble #57277

Closed
wants to merge 15 commits into from

Conversation

y-yamshchikov
Copy link
Contributor

This PR fixes part of #44948 and fixes #46160.

This code simply traverses through assemblies in the application
domain. For each assembly (module) it realizes is it Ready To Run and is
it in the same bubble with (does it deliberately bubbling the) module
from which generic function originates. If so, it makes request for code
is Ready To Run and hopes there is some in the module. If the request
succeeds it proceeds with found pointer to the bare native code.

Generic instantiations like System.Collections.Concurrent.ConcurrentDictionary`2[System.Int32,System.__Canon] contained within same version bubble are either a bug in crossgen2 or should be taken care of by having PGO data.

Now such methods use their Ready To Run code.

We have got significant performance gain on startup: 7% average on our representative set on Tizen.

This PR worked out notices in PR below:
#47269
about linear search through the set of RangeSections. In this new PR we propose storing of RangeSections in sorted array (with number of optimizations inspired by prior linked list based solution).

We have extensively tested this PR on armel/Tizen platform so in this case we are confident in reliability and profitability of the solution.

Dear colleagues @jkotas @alpencolt @gbalykov @t-mustafin, please take a look.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Aug 12, 2021
@y-yamshchikov y-yamshchikov changed the title Partial fix 44948 Search native code in all R2R ni.dll in version bubble Aug 12, 2021
@y-yamshchikov y-yamshchikov force-pushed the partial-fix-44948 branch 7 times, most recently from c2d2f5d to 91e753a Compare August 13, 2021 17:50
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.h Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.h Outdated Show resolved Hide resolved
src/coreclr/vm/prestub.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.h Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.h Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.h Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
src/coreclr/vm/codeman.cpp Outdated Show resolved Hide resolved
@y-yamshchikov y-yamshchikov force-pushed the partial-fix-44948 branch 3 times, most recently from da2f1cd to 8dc0aa9 Compare September 3, 2021 12:50
@alpencolt
Copy link

@jkotas had you time to check PR?

@@ -1264,6 +1290,14 @@ class ExecutionManager
SIZE_T Size,
Module * pModule);

static int FindRangeSectionHandleHelper(TADDR addr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs comment on what this returns. It does not return a simple index...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can it be in the private section (before public:)?

enum
{
RangeSectionHandleArrayInitialSize = 100,
RangeSectionHandleArrayExpansionFactor = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In debug/checked builds, we should make the initial size small and make the array grow on element at a time to ensure that the array resize algorithm and the potential race conditions are exercised.

TADDR LowAddress;
TADDR HighAddress;
#ifndef DACCESS_COMPILE
Volatile<RangeSection *> pRS;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be volatile?

struct RangeSectionHandle
{
TADDR LowAddress;
TADDR HighAddress;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think that this needs to store HighAddress. It can just store the LowAddress and the method that is binary searching inside the array can just return the candidate RS index. We can then validate that the candidate RS index actually fits by comparing the address with RangeSection::HighAddress.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-effect of this change is going to be that the size of this struct will be power of two that will make the binary search faster.

FrameWithCookie<ExternalMethodFrame> frame(pTransitionBlock);
ExternalMethodFrame * pEMFrame = &frame;
pModule = ExecutionManager::FindReadyToRunModule((TADDR)(((BYTE*)pEMFrame->GetReturnAddress())-1));

BEGIN_PRESERVE_LAST_ERROR;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BEGIN_PRESERVE_LAST_ERROR should be before the newly added code.

@@ -1853,6 +1860,11 @@ extern "C" PCODE STDCALL PreStubWorker(TransitionBlock* pTransitionBlock, Method
{
PCODE pbRetVal = NULL;

Module* pModule = NULL;
FrameWithCookie<ExternalMethodFrame> frame(pTransitionBlock);
Copy link
Member

@jkotas jkotas Sep 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not look right to create a dummy ExternalMethodFrame just to get the return address.

The fixups from R2R modules should come via ExternalMethodFixupWorker. Can we just use the module that is computed there?

@jkotas
Copy link
Member

jkotas commented Sep 3, 2021

RangeSections now stored in sorted array

BTW: I do like this change - it should be general performance improvement for any scenario with large number of modules.

@y-yamshchikov y-yamshchikov force-pushed the partial-fix-44948 branch 5 times, most recently from 3d7cad1 to fe70768 Compare September 16, 2021 11:24
pLast = pCurr;
pCurr = pCurr->pnext;
}
ReaderLockHolder rlh(HostCallPreference::NoHostCalls);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is important for this path to be lock-free. Adding a lock here is very likely to regress GC stackwalking performance on machines with a lot of cores.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is so important, we will investigate another way to isolate readers from writers. We think we could find one with two interchanging arrays.

pCurr = pCurr->pnext;
}
ReaderLockHolder rlh(HostCallPreference::NoHostCalls);
while (!rlh.Acquired()); // rlh con did not yield and we have HOST_NOCALLS limitation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what this is trying to achieve, but it does not look right.

Copy link
Contributor Author

@y-yamshchikov y-yamshchikov Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are obliged to ask if the lock has been aquired after NoHostCalls constructor of ReaderLockHolder. NoHostCalls is necessary because of contract requirement from very above on call stack. We have encountered contract violation if we use default constructor (and the source of violation is not GetRangeSection, it is much higher).

@mangod9
Copy link
Member

mangod9 commented Nov 15, 2021

hi @y-yamshchikov, assume you are still working on this PR feedback?

@mangod9
Copy link
Member

mangod9 commented Jan 10, 2022

hi @y-yamshchikov, checking if this PR still needs to be kept open since it hasnt been updated in a couple of months. Thx

@y-yamshchikov
Copy link
Contributor Author

y-yamshchikov commented Jan 13, 2022

hi @y-yamshchikov, checking if this PR still needs to be kept open since it hasnt been updated in a couple of months. Thx

Dear @mangod9, we are working on new approach granting lockfree reading of RangeSections and still benefiting from binary search capabilities of sorted array, fighting all the contradictions dictating by the multiprocessor environment. We end up with new version of synchronization mechanics and for now working on comprehensive testing environment which grants extensive coverage of all possible synchronization issues. To be presented soon, both the code and the testing model.

@mangod9
Copy link
Member

mangod9 commented Jan 13, 2022

thanks for the update, should we close this PR for now and open a new one once ready? Thx!

@y-yamshchikov
Copy link
Contributor Author

thanks for the update, should we close this PR for now and open a new one once ready? Thx!

There is lot of useful feedback conversation still not outdated, I think the change is made in a successive way to prolong current PR.

@mangod9
Copy link
Member

mangod9 commented Feb 7, 2022

I would suggest we close the PR and reopen when more changes are added. This helps to keep the "Open PRs" list manageable.

@mangod9
Copy link
Member

mangod9 commented Mar 14, 2022

Closing for now so it doesnt get flagged in stale PRs. Please reopen when ready to merge again. Thx.

@mangod9 mangod9 closed this Mar 14, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Apr 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-crossgen2-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Version Bubble implementation flaw: some methods are forced to rejit
5 participants