Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement System.Array functions used by MD arrays as intrinsics #60816

Merged
merged 2 commits into from
Oct 26, 2021

Conversation

BruceForstall
Copy link
Member

@BruceForstall BruceForstall commented Oct 24, 2021

Specifically, implement:

  • GetLength
  • GetLowerBound
  • GetUpperBound

These are implemented as intrinsics for MD arrays only, when given
a constant dimension argument that is less than the array rank (that is,
in range).

Thus, exceptional cases fall back to the IL code which throws an exception
as appropriate.

Example x64 codegen for 2-D int[,] array (instead of call or inlined call):

GetLength(1):

mov      r9d, dword ptr [rcx+20]

GetLowerBound(1):

mov      r9d, dword ptr [rcx+28]

GetUpperBound(1):

mov      r9d, dword ptr [rcx+28]
mov      r10d, dword ptr [rcx+20]
lea      r10d, [r9+r10-1]

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 24, 2021
@ghost
Copy link

ghost commented Oct 24, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Specifically, implement:

  • get_Rank
  • GetLength
  • GetLowerBound
  • GetUpperBound

get_Rank is implemented for both SZ and MD arrays, and replaces the result
with a constant. It falls back to IL for generic System.Array types.

The others are implemented as intrinsics for MD arrays only, when given
a constant dimension argument that is less than the array rank (that is,
in range).

Thus, exceptional cases fall back to the IL code which throws an exception
as appropriate.

Example x64 codegen for 2-D int[,] array (instead of call or inlined call):

GetLength(1):

mov      r9d, dword ptr [rcx+20]

GetLowerBound(1):

mov      r9d, dword ptr [rcx+28]

GetUpperBound(1):

mov      r9d, dword ptr [rcx+28]
mov      r10d, dword ptr [rcx+20]
lea      r10d, [r9+r10-1]
Author: BruceForstall
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@BruceForstall
Copy link
Member Author

The recently added MDGeneralArray benchmark shows 4x speedsups:

Method Job Toolchain Mean Error StdDev Median Min Max Ratio Allocated
MDGeneralArray Job-RYCWUF \runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe 55.47 ms 0.584 ms 0.546 ms 55.45 ms 54.67 ms 56.58 ms 1.00 8 KB
MDGeneralArray Job-CLIYYD \runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe 14.61 ms 0.106 ms 0.099 ms 14.62 ms 14.40 ms 14.79 ms 0.26 8 KB
MDGeneralArray2 Job-RYCWUF \runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe 55.30 ms 0.552 ms 0.516 ms 55.24 ms 54.64 ms 56.74 ms 1.00 9 KB
MDGeneralArray2 Job-CLIYYD \runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe 14.68 ms 0.145 ms 0.136 ms 14.63 ms 14.49 ms 15.00 ms 0.27 8 KB

This is no surprise, as the loops being measured look like:

for (int i = s.GetLowerBound(0); i <= s.GetUpperBound(0); i++) {
    for (int j = s.GetLowerBound(1); j <= s.GetUpperBound(1); j++) {
        for (int k = s.GetLowerBound(2); k <= s.GetUpperBound(2); k++) {
            if (s[i,j,k] != d[i,j,k]) {
                return false;
            }
        }
    }
}

var td = HandleToObject(cls) as ArrayType;
Debug.Assert(td != null);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like crossgen2 was too restrictive: the VM code passes returns zero if it's not a MD or SZ array.

@BruceForstall
Copy link
Member Author

@dotnet/jit-contrib PTAL
cc @jkotas

@BruceForstall
Copy link
Member Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@BruceForstall
Copy link
Member Author

I do not think Array.Rank is used on hot paths and/or with concrete array types. It may not be worth the trouble to treat it as JIT intrinsic.

Good point. The JIT inlines and optimizes the function call anyway for the general case. Will remove the intrinsic handling.

@BruceForstall
Copy link
Member Author

/azp run runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@EgorBo
Copy link
Member

EgorBo commented Oct 25, 2021

Good to see improvements in this area, I think I saw a couple of public benchmarks where .NET was in a very bad place because author decided to benchmark multi-dimension arrays 🙂 Any spmi diffs? just to make sure this area is well covered (tests)

@BruceForstall
Copy link
Member Author

Any spmi diffs? just to make sure this area is well covered (tests)

There are a few tests that cover MD arrays, such as my newly added MDGeneralArray. And I've been using a little test program as well. However, no SPMI diffs, but that's because the new NamedIntrinsic and library attributions changes the JIT-EE interface. There are jit-diff diffs on the tests that use these functions.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

One somewhat tangential question.

src/coreclr/jit/importer.cpp Outdated Show resolved Hide resolved
// See the comments at the definition of CORINFO_Array for a description of how arrays are laid out in memory.
//
// static
unsigned Compiler::eeGetArrayDataOffset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize the code below is pre-existing but does this work right when cross-targeting (say 64 bit hosted 32 bit targeted)?

Seems like it must, somehow.

Copy link
Member Author

@BruceForstall BruceForstall Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, OFFSETOF__CORINFO_Array__data is ifdef'ed on TARGET_64BIT but everything else is target/bitness-independent, and we build compile-time target-bitness-specific JITs, so seems like it's all set up to work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I missed that this is not offsetof(CORINFO_Array, i1Elems) or similar, but a separately defined thing.

Specifically, implement:
- get_Rank
- GetLength
- GetLowerBound
- GetUpperBound

get_Rank is implemented for both SZ and MD arrays, and replaces the result
with a constant. It falls back to IL for generic System.Array types.

The others are implemented as intrinsics for MD arrays only, when given
a constant dimension argument that is less than the array rank (that is,
in range).

Thus, exceptional cases fall back to the IL code which throws an exception
as appropriate.

Example x64 codegen for 2-D `int[,]` array (instead of call or inlined call):

GetLength(1):
```
mov      r9d, dword ptr [rcx+20]
```

GetLowerBound(1):
```
mov      r9d, dword ptr [rcx+28]
```

GetUpperBound(1):
```
mov      r9d, dword ptr [rcx+28]
mov      r10d, dword ptr [rcx+20]
lea      r10d, [r9+r10-1]
```
@BruceForstall BruceForstall merged commit 13d6458 into dotnet:main Oct 26, 2021
@BruceForstall BruceForstall deleted the MDFunctionIntrinsics branch October 26, 2021 01:42
@ghost ghost locked as resolved and limited conversation to collaborators Nov 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants