Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

[R2R] add a virtualStubParamInfo as an argument #15910

Merged
merged 3 commits into from
Feb 2, 2018

Conversation

sandreenko
Copy link

@sandreenko sandreenko commented Jan 18, 2018

This PR fixes part of #15150 and #15552.

@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT arm Cross Checked r2r Build and Test
@dotnet-bot test Windows_NT arm Cross Checked jitstressregs1 Build and Test

@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT arm Cross Checked r2r_jitstress1 Build and Test
@dotnet-bot test Windows_NT arm Cross Checked r2r_jitstress2 Build and Test

@sandreenko
Copy link
Author

sandreenko commented Jan 18, 2018

arm r2r testing results:

Tests.lst=_speed_dbglcsbox.cmd_1771, , , Smrt00000001, , # JIT\Methodical\Arrays\lcs\_speed_dbglcsbox\_speed_dbglcsbox.cmd  CATS: EXPECTED_PASS
Tests.lst=straccess2_cs_ro.cmd_2232, , , Smrt00000001, , # JIT\Directed\StrAccess\straccess2_cs_ro\straccess2_cs_ro.cmd  CATS: EXPECTED_PASS
Tests.lst=_speed_rellcsbox.cmd_6513, , , Smrt00000001, , # JIT\Methodical\Arrays\lcs\_speed_rellcsbox\_speed_rellcsbox.cmd  CATS: EXPECTED_PASS
Tests.lst=fasta-1.cmd_11039, , , Smrt00000001, , # JIT\Performance\CodeQuality\BenchmarksGame\fasta\fasta-1\fasta-1.cmd  CATS: EXPECTED_PASS

were fixed.

Tests.lst=CscBench.cmd_3424, , , Smrt00000001, , # JIT\Performance\CodeQuality\Roslyn\CscBench\CscBench.cmd  CATS: EXPECTED_PASS
Tests.lst=Deserialize.cmd_8437, , , Smrt00000001, , # JIT\Performance\CodeQuality\Serialization\Deserialize\Deserialize.cmd  CATS: EXPECTED_PASS
Tests.lst=Serialize.cmd_9128, , , Smrt00000001, , # JIT\Performance\CodeQuality\Serialization\Serialize\Serialize.cmd  CATS: EXPECTED_PASS
Tests.lst=Compilation.cmd_11084, , , Smrt00000001, , # managed\Compilation\Compilation\Compilation.cmd  CATS: EXPECTED_PASS

continue to fail.

@sandreenko
Copy link
Author

sandreenko commented Jan 18, 2018

Some strange things about this morphing:

  1. does anybody remember where did this come from? It was introduced before tfs. (this code is special for arm).
    // Change the call type, so we can add the extra indirection here, rather than in codegen
    call->gtCallAddr = gtNewIconHandleNode(addr, GTF_ICON_FTN_ADDR);
    call->gtStubCallStubAddr = NULL;
    call->gtCallType = CT_INDIRECT;

    Found and deleted.

  2. For tail call VS stubAddrArg is not considered as nonStandardArgs, should it? Do we need to push it into nonStandardArgs for non-tail call case?

  3. In lower.cpp we have such code (since 2014):

            GenTree* cellAddr = AddrGen(addr);
            GenTree* indir    = Ind(cellAddr);
            // For arm64, we dispatch code same as VSD using X11 for indirection cell address,
            // which ZapIndirectHelperThunk expects.
            if (call->IsR2RRelativeIndir())
            {
                cellAddr->gtRegNum = REG_R2R_INDIRECT_PARAM;
                indir->gtRegNum    = REG_JUMP_THUNK_PARAM;
            }

where 'addr' is result of VM call, can we move it to morph? Looks like it is the same additional non-standard argument.

@sandreenko sandreenko force-pushed the armR2Rtest branch 2 times, most recently from e6c24a0 to 85742c2 Compare January 19, 2018 09:03
@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT arm Cross Checked r2r Build and Test
@dotnet-bot test Windows_NT arm Cross Checked r2r_jitstress1 Build and Test
@dotnet-bot test Windows_NT arm Cross Checked r2r_jitstress2 Build and Test

@sandreenko sandreenko changed the title [WIP][R2R] add a virtualStubParamInfo as an argument [R2R] add a virtualStubParamInfo as an argument Jan 19, 2018
@sandreenko
Copy link
Author

I need to add a function header and collect SPMI diffs, but we can start review.
PTAL @BruceForstall @dotnet/arm32-contrib, @dotnet/jit-contrib .

@sandreenko
Copy link
Author

There are many asm diffs for amd64, they will degrade our performance, but they are correct.
before

       mov      r11, 0xD1FFAB1E
       cmp      dword ptr [rcx], ecx
       call     gword ptr [r11]hackishModuleName:hackishMethodName():ref:this

after

       mov      r11, 0xD1FFAB1E
       mov      rax, 0xD1FFAB1E
       cmp      dword ptr [rcx], ecx
       call     gword ptr [rax]hackishModuleName:hackishMethodName():ref:this

so out jit can't fold it. Does anybody know how to fix that, @AndyAyersMS? I will try to use lclVar, but expect that it won't help.

@AndyAyersMS
Copy link
Member

Not really my area of expertise, but if everyone else is tied up I will try and take a look.

It's not clear to me from the description above or the linked issue what the problem is and how you are attempting to solve it -- can you say a bit more about this?

@sandreenko
Copy link
Author

It's not clear to me from the description above or the linked issue what the problem is and how you are attempting to solve it -- can you say a bit more about this?

So when we have virtual stub-dispatch calls we have a special calling convention, that says where we pass the address of the stub (doc). This special calling convention was not reflected in the jit properly, in some cases we did not add this special argument to the call.
It worked in the past, because there was hack in LowerVirtualStubCall, that set addr->gtRegNum = comp->virtualStubParamInfo->GetReg(); and hoped that nobody will rewrite comp->virtualStubParamInfo->GetReg() until the call.
So this assumption becomes wrong when we have, for example, a scheduler, that moves some code between them, or the call generates a null check, that decides to use comp->virtualStubParamInfo->GetReg(). The second case was exposed with arm r2r tests.

So my change:

  1. deletes this hack;
  2. add this non-standard argument to the call in all cases (and deletes code duplications);

but because I changed our trees from:

      /--stubAddr
   /--indirection
/--call control expression
call

to

/--stubAddr
|  /--stubAddr
|  |--indirection
|--call control expression
call

jit now generates additional move to load stubAddr that is call's argument and can't allocate both stubAddr on the same register (stubAddr is just a constant).

and we have

       mov      r11, 0xD1FFAB1E
       mov      rax, 0xD1FFAB1E

the question is how to force jit to load them to the same register and delete this extra move? In other words how to get SSA value in our SDSU model.

@mikedn
Copy link

mikedn commented Jan 19, 2018

Probably I'm missing something but can't we simply keep the original tree and just require LSRA to assign r11 to stubAddr?

@sandreenko
Copy link
Author

Probably I'm missing something but can't we simply keep the original tree and just require LSRA to assign r11 to stubAddr?

It is how iit was before, but then how can you guarantee that r11 won't be rewritten before the call? For example call's null check can do it. We have to show that stubAddr is call's argument.

@CarolEidt
Copy link

Would it be possible to mark the control expression and its indirection both contained for this case? Then the stubAddr is used by the call, and if it is marked as its src requiring it to be in r11, that register will be reserved at the call, even if stubAddr somehow doesn't get r11, or it gets spilled.

@mikedn
Copy link

mikedn commented Jan 19, 2018

It is how iit was before,

As far as I can tell before the lowering code is simply pulling R11 out of the hat and assigns it to gtRegNum. It doesn't seem to actually tell LSRA that R11 is used anywhere. There is code in TreeNodeInfoInitCall that does that by calling setSrcCandidates... only on x86.

@sandreenko
Copy link
Author

Would it be possible to mark the control expression and its indirection both contained for this case?

It won't work, because this instructions chain is not a atomic instr. The control expression generates real instruction and can't be marked as contained. I think there is an assert that prevents is:

        // A call target can not be a contained indirection
        assert(!target->isContainedIndir());

@sandreenko
Copy link
Author

sandreenko commented Jan 20, 2018

I tried to use lclVars and it did not help, because the jit replaces lclVar with constant.

@mikedn
Copy link

mikedn commented Jan 20, 2018

The control expression generates real instruction and can't be marked as contained.

What's this real instruction that it generates? For typical VSD calls I see that control expr is a contained indir:

Generating: N025 (  3, 10) [000023] ------------        t23 =    CNS_INT(h) long   0x7ffdb6070028 ftn REG r11
IN0007:        mov      r11, 0x7FFDB6070028
                                                             /--*  t23    long   
Generating: N027 (  5, 12) [000024] -c----------        t24 = *  IND       long   REG NA
                                                             /--*  t21    int    arg1 in rdx
                                                             +--*  t22    ref    this in rcx
                                                             +--*  t24    long   control expr
Generating: N029 ( 46, 22) [000004] --CXG-------              *  CALLV stub void   System.Collections.Generic.ICollection`1[Int32][System.Int32].Add $VN.Void
							GC regs: 00000002 {rcx} => 00000000 {}
IN0008:        cmp      dword ptr [rcx], ecx
							Call: GCvars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}
IN0009:        call     qword ptr [r11]System.Collections.Generic.ICollection`1[Int32][System.Int32]:Add(int):this

assert(!target->isContainedIndir());

I can't find such an assert. There's the opposite - assert(target->isContainedIndir()) and it's inside an x86 specific ifdef.

@sandreenko
Copy link
Author

@mikedn This assert is from armarch:

assert(!target->isContainedIndir());

@mikedn
Copy link

mikedn commented Jan 22, 2018

This assert is from armarch:

Right, I was only looking at x64 code. Of course, it's not contained on ARM because there's nothing like x64's call [mem]. On ARM it has to be something like ldr x, mem; bl x;.

But maybe it's possible to emulate memory operand "containedness" on ARM? Make the indir contained, request an internal register for the call node and emit the ldr during the codegen of the call node?

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to trigger some armlb R2R testing

@@ -3081,38 +3081,27 @@ GenTreeCall* Compiler::fgMorphArgs(GenTreeCall* call)

nonStandardArgs.Add(cns, REG_PINVOKE_COOKIE_PARAM);
}
else if (call->IsVirtualStub() && (call->gtCallType == CT_INDIRECT) && !call->IsTailCallViaHelper())
else if (call->IsVirtualStub())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to remove the check for CT_INDIRECT? So the non-CT_INDIRECT calls will also get a non-standard arg?

If so, couldn't you just keep the condition else if (call->IsVirtualStub() && !call->IsTailCallViaHelper()) to avoid nesting the ifs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was one of the problems, CT_DIRECT also needs this special argument.

If I merge nested ifs, then I won't have place for comment about 'call->IsTailCallViaHelper()` that I think is useful.

@@ -3081,38 +3081,27 @@ GenTreeCall* Compiler::fgMorphArgs(GenTreeCall* call)

nonStandardArgs.Add(cns, REG_PINVOKE_COOKIE_PARAM);
}
else if (call->IsVirtualStub() && (call->gtCallType == CT_INDIRECT) && !call->IsTailCallViaHelper())
else if (call->IsVirtualStub())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these changes going to work with the code generator for LEGACY_BACKEND?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I am not sure that we can test it in ci.

@@ -8072,6 +8051,25 @@ void Compiler::fgMorphTailCall(GenTreeCall* call)
DISPTREE(call);
}

GenTree* Compiler::fgGetStubAddrArg(GenTreeCall* call)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function needs a header comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}
else
{
assert(call->gtCallMoreFlags & GTF_CALL_M_VIRTSTUB_REL_INDIRECT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did we handle this case before? Or was that the bug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, it looks like this was extracted from the amd64 version of fgMorphTailCall(). But it's not the same logic as was previously in fgMorphArgs (now calling this). Is that ok/good?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the bug was in unimplemented case for CT_DIRECT.
Also the was a workaround in lower.cpp for legacy arm, that did not work write with non-legacy.

#endif
indir->gtFlags |= GTF_IND_REQ_ADDR_IN_REG;
#endif
result = indir;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOVE seeing all this get deleted!

GenTree* stubAddrArg = fgGetStubAddrArg(call);
// And push the stub address onto the list of arguments
call->gtCallArgs = gtNewListNode(stubAddrArg, call->gtCallArgs);
#else // !LEGACY_BACKEND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is backwards: remove the !. Better, put the LEGACY_BACKEND case first, to avoid double negative logic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks


#ifndef LEGACY_BACKEND
GenTree* stubAddrArg = fgGetStubAddrArg(call);
// And push the stub address onto the list of arguments
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you change stubAddrArg to just arg and then tail merge the next two lines of this #if with the same two lines of the #endif?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will delete the legacy soon, so I think it is better to keep this change that way (do not want to move GenTree* stubAddrArg out of ifdef or do call->gtCallArgs = gtNewListNode(stubAddrArg, call->gtCallArgs); that uses stubAddrArg that is defined twice (under different #ifdefs).

@sandreenko
Copy link
Author

So the failing tests are failing because of the same issue with incorrect liveness information for call->IsR2RRelativeIndir(). We create stubAddr argument too late to add it into call->gtArgsList.
I can't find a simple way to fix that, so I think to do it in a separate PR.

@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT arm Cross Checked r2r Build and Test
@dotnet-bot test Windows_NT arm Cross Checked jitstressregs1 Build and Test
@dotnet-bot test Windows_NT armlb Cross Checked jitstress1 Build and Test

@sandreenko
Copy link
Author

Make sure to trigger some armlb R2R testing

@BruceForstall, Do we have armlb r2r jobs in ci? I have not found them.

PR was updated. Please take another look.

@sandreenko
Copy link
Author

@dotnet/jit-contrib ping.

@CarolEidt
Copy link

This looks reasonable to me - but just to make sure I understand: this latest round of changes still cause the stub address to be loaded into two registers, right? Would it make sense to file an issue to eliminate that duplication?

@sandreenko
Copy link
Author

sandreenko commented Feb 1, 2018

This looks reasonable to me - but just to make sure I understand: this latest round of changes still cause the stub address to be loaded into two registers, right? Would it make sense to file an issue to eliminate that duplication?

Yes, you are right. The issue was filed (#16037). It shows a general problem with out constant values, because the issue existed before this PR.

@CarolEidt
Copy link

@sandreenko - thanks! I think it would be good to get sign-off from Bruce as well, but this LGTM.

Sergey Andreenko added 3 commits February 1, 2018 00:10
expect many failures for arm in lower because we delete: // Change the call type, so we can add the extra indirection here, rather than in codegen
@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT arm Cross Checked r2r Build and Test
@dotnet-bot test Windows_NT arm Cross Checked jitstressregs1 Build and Test
@dotnet-bot test Windows_NT armlb Cross Checked jitstress1 Build and Test

@mikedn
Copy link

mikedn commented Feb 1, 2018

Yes, you are right. The issue was filed (#16037). It shows a general problem with out constant values, because the issue existed before this PR.

So let's see if I understand this correctly. On x64 we'll generate worse code than before due to a problem that seems to be ARM specific and we hope that code gen may improve sometime in the future, if ever.

@sandreenko
Copy link
Author

So let's see if I understand this correctly. On x64 we'll generate worse code than before due to a problem that seems to be ARM specific and we hope that code gen may improve sometime in the future, if ever.

The worse code generation is not ARM specific. If you want I can create amd64 example.
The original issue is wrong liveness issue, this issue also is not target specific, but we hit its consequences only on arm.

I do not know a better way how to fix liveness problem without this CQ regression. Different contained hacks do not change liveness and do not fix the real issue here.

Copy link

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

@sandreenko sandreenko merged commit c4943e0 into dotnet:master Feb 2, 2018
@sandreenko sandreenko deleted the armR2Rtest branch February 2, 2018 06:59
@AndyAyersMS
Copy link
Member

For something like this I would have liked to have seen jit diffs results and may be some kind of performance impact assessment. Given our current policy of distributing framework code as R2R's assemblies we need to pay attention to R2R perf.

It sounds like what we ended up with is a bit of a compromise and in such cases it is good to get an idea of what we are paying for getting this bug fixed. As things stand it is hard to know if this change is something we can live with or whether we should go back and try and root out what is preventing us from getting to the code we'd like to emit.

@CarolEidt
Copy link

@AndyAyersMS - excellent points.

As things stand it is hard to know if this change is something we can live with or whether we should go back and try and root out what is preventing us from getting to the code we'd like to emit.

At a minimum, we should file an issue, with pointers to impacted test cases, so that we can investigate the options.

@RussKeldorph
Copy link

RussKeldorph commented Feb 2, 2018

@sandreenko Please add diffs and perf analysis.

@BruceForstall
Copy link
Member

We really need to implement CI-triggered all-plat asm diffs, as well as all-plat always-available SuperPMI collections that enable SuperPMI asm diffs.

@sandreenko sandreenko restored the armR2Rtest branch February 5, 2018 21:32
briansull added a commit to briansull/coreclr that referenced this pull request Feb 5, 2018
briansull added a commit that referenced this pull request Feb 5, 2018
Revert #15910 [R2R] add a virtualStubParamInfo as an argument
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* add a virtualStubParamInfo as an argument

* Revert legacy workaround from lower.


Commit migrated from dotnet/coreclr@c4943e0
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Revert dotnet/coreclr#15910 [R2R] add a virtualStubParamInfo as an argument

Commit migrated from dotnet/coreclr@1733bdd
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants