[hackathon] `ValueArray` - a compliment type to the `Span`, which owns its data without indirections. #60519

VSadov · 2021-10-17T15:53:29Z

ValueArray<T, R> is a data type that represents an indexable fixed-sized chunk of data containing N elements of type T.

The R type argument must be an object array type, that is not used for anything other than to convey the length of the value array declaratively - via the R's rank.

The only "magical" part of the ValueArray type is that the VM logically replicates the storage N times and adjusts the instance layout and GC tracking accordingly.

In all other ways this is a regular struct - it can be copied by value, embedded in another struct/class, passed in/out of methods, stored on the heap, boxed...

dotnet-issue-labeler · 2021-10-17T15:53:40Z

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

dotnet-issue-labeler · 2021-10-17T15:53:40Z

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

VSadov · 2021-10-17T15:56:48Z

@davidwrighton @stephentoub - please take a look. Thanks!

jkotas · 2021-10-17T22:45:47Z

The R type argument must be an object array type, that is not used for anything other than to convey the length of the value array declaratively - via the R's rank.

The maximum array rank is constrained by runtime implementation limitations (MAX_RANK - currently set to 32). Does this mean that the value arrays cannot have more than 32 elements?

VSadov · 2021-10-18T14:46:56Z

@jkotas - I have relaxed the dimension limit in this change.
It seems to be artificial. Neither Mono nor Roslyn have a limit. I could not find that in the spec too.

jkotas · 2021-10-18T14:58:31Z

This limit exists to avoid running in stackoverflows. Given how many of the type loader algorithms are implemented, it is not hard to hit stackoverflow for arrays with high rank.

jkotas · 2021-10-18T15:00:09Z

There are number of implementation limits like this. For example, there is an implementation limit on size of arguments or size of local variables. These implementation limits are not mentioned in the ECMA spec.

jkotas · 2021-10-18T15:09:59Z

I am wondering whether there are other hacky ways to encode the array that have log(N) complexity. For example,

ValueTypeArray<EncodedLength<EncodedLength<EncodedLength, byte>, object>, byte> can encode 5 (101 in binary).

VSadov · 2021-10-18T15:18:30Z

I just meant that we are not required by the spec to have a limit. It is an implementation detail that we can change, carefully of course.

I will check if we have any kind of recursion when processing the rank.
I think for the most part we just store the rank as a number and use in instance size computation when allocating.

tannergooding · 2021-10-18T15:45:51Z

Just wanted to note that there are a few examples in the Windows SDK that have "value arrays" going up to 1025 elements. The largest of these, that I'm aware of, consists of 12-byte structs (so you end up with a 12kb value type).

I think the biggest stack-allocation in the BCL (for floating-point parsing/formatting) is 768 elements or so (used to be char, but we switched this to byte a while back).

VSadov · 2021-10-18T15:47:21Z

There are certainly other ways to encode the length with different tradeofs.

I just picked array rank as the most straightforward. It seems to be fairly cheap -in terms of instantiations/types required, and in terms of implementation difficulty and changes to runtime.

Also it is easily expressible in C# - very little typing, if you look through tests. That is nice at prototyping/experimenting stage.
I can see how I could start using it in libraries and the code will be fairly readable.

VSadov · 2021-10-18T17:42:37Z

@tannergooding - there are some limits on how large variables on the stack can be. The limit will apply regardless how you declare the variable. I am not sure if 12K comes close to such limit.

I suspect somebody somewhere is already using such type by manually specifying a struct with 1025 fields.

tannergooding · 2021-10-18T17:55:58Z

there are some limits on how large a variable on the stack can be

Right. There is the general limit of the stack itself (1MB by default on Windows; 8MB by default on 64-bit Unix) and then some limit on max field count/size in the VM (MAX_SIZE_FOR_INTEROP=0x7ffffff0, MAX_SIZE_FOR_VALUECLASS_IN_ARRAY=0xffff, etc

I was just giving these as examples of real world limits we might want to account for.

Rank seems good for the prototype because its something that the JIT and C# already support. Outside of this, the JIT (but not C#) also supports T[size] and so you could imagine ValueArray<int[5]> representing an int-based value array of 5 elements, etc.

ArrayShape in general supports the following, so we have a number of options if we wanted to efficiently encode this information given the existing metadata support:

VSadov · 2021-10-18T18:04:42Z

On the Libraries we use hacks such as private struct FourStackStrings

runtime/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexReplacement.cs

Line 97 in 57bfe47

    
           private struct FourStackStrings // used to do the equivalent of: Span<string> strings = stackalloc string[4];

And use them unsafely :

runtime/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexReplacement.cs

Line 42 in 57bfe47

    
           var strings = new ValueListBuilder<string>(MemoryMarshal.CreateSpan(ref stackStrings.Item1!, 4));

As you notice there are a few short stackalloc-d buffers with constant size around that place as well. They all could just be value arrays.

stephentoub · 2021-10-18T18:09:01Z

On the Libraries we use hacks such as private struct FourStackStrings

Why wouldn't we just use Span<string> span = stackalloc string[4]; once that works? i.e. what's the benefit of ValueArray in such cases?

jkotas · 2021-10-18T18:52:25Z

That is nice at prototyping/experimenting stage.

I agree that the current hack is ok for experimentation. I have been thinking what this may look like if we were to productize this idea, and the hacks mentioned here look pretty ugly. We should be thinking about the clean way to represent it: either add support for parametrizing generics using constants or add a new ELEMENT_TYPE for the fixed array. There was actually ET value for fixed arrays in the early days of .NET https://github.com/search?q=ELEMENT_TYPE_VALUEARRAY_INTERNAL&type=code , but it did not make it.

VSadov · 2021-10-18T19:21:58Z

Why wouldn't we just use Span span = stackalloc string[4]; once that works

I would ask the question the other way - Why would you use stackalloc if a ValueArray is sufficient? - because ValueArray is conceptually simpler.

stackalloc is a powerful feature and when you need it, you really need it. You need it, for example, when allocation has a variable length or when allocation is large, but is rarely necessary.

There are downsides also:

stackalloc is more error prone. You may accidentally move it into a loop as a part of refactoring causing occasional SO.
We had such bugs. Maybe even in this same code that I referenced. It looks familiar.
stackalloc will limit your design choices.
You cannot put stackallock-d variable into a field if evolution of your code requires it. It is difficult to mix with async or lambdas.

If a ValueArray is available I'd ask myself every time I use stackalloc for the reasons to use it - vs. just using ValueArray.

VSadov · 2021-10-18T19:43:59Z

re ELEMENT_TYPE_VALUEARRAY_INTERNAL

yes, I saw it and figured that we probably had value arrays pre-1.0 in some form, but it did not ship. Possibly because before Span the usefulness would be limited.
We discussed this in our hackaphone Teams thing. It was a curious find.

Current assumption is that we are not changing metadata version for this feature, so we had to find ways that fit it into v2.0 spec.

stephentoub · 2021-10-18T20:10:54Z

I would ask the question the other way - Why would you use stackalloc if a ValueArray is sufficient? - because ValueArray is conceptually simpler.

To play devil's advocate, because it's one tool that suffices whether or not the length is variable and it's an existing feature that doesn't increase concept count.

VSadov · 2021-10-18T20:15:15Z

either add support for parametrizing generics using constants or add a new ELEMENT_TYPE for the fixed array

I think these are orthogonal things and both have their uses.
They would complement each other nicely, but one would still need to express how a constant type param flows into a value array length.

class C1<T, int L>
{
    T[L] Data;   // ELEMENT_TYPE_VALUEARRAY <T> <L> Data
}

The features do not require one another though, so value array does not need to wait for generic constants.

If there is a way to make ELEMENT_TYPE_VALUEARRAY happen, for starters, I'd vote with two hands.

VSadov · 2021-10-18T20:22:08Z

To play devil's advocate, because it's one tool that suffices whether or not the length is variable and it's an existing feature that doesn't increase concept count.

The range of usefulness of stackalloc is not a superset of value array. You can advocate the other way too :-)

stephentoub · 2021-10-18T20:27:16Z

The range of usefulness of stackalloc is not a superset of value array.

It's not, it's a Venn diagram. That's kind of my point. I think the argument for ValueArray<T> should be focused on the things outside of stackalloc's abilities. We shouldn't cite cases like the FourStackStrings one mentioned earlier as being the motivation, as such cases are entirely doable with stackalloc and I don't see a good reason to use something else. We should instead focus on finding good examples in dotnet/runtime where stackalloc couldn't be used, e.g. places where we'd embed ValueArray<T> in other types, where we'd use it in lambdas and async methods, etc.

tannergooding · 2021-10-18T20:41:58Z

That's kind of my point. I think the argument for ValueArray should be focused on the things outside of stackalloc's abilities

@stephentoub, I think one thing to note is that there is potential optimization opportunities and other static analysis that can be done for ValueArray that can't be done for stackalloc.

When the size is well known, ValueArray can potentially avoid stack cookies and can ensure that the size requirement is part of the API contract for a method.

This has downsides as well (its a breaking change to go from ValueArray<int, 5> to ValueArray<int, 4> or ValueArray<int, 6>) but for internal usage its likely something we'd want to actively consider and potentially ensure the JIT handles well.

Today the JIT can only make assumptions about Span<T> when inlining occurs and the constructor for the span was constant. When inlining can't occur, or where the surrounding Span logic is more complex, you get the safety in C#, but no real additional safety or benefits in IL (you still get stack cookies, you still get an actual stack allocation, you get unknown lengths and so only minimal bounds checking, etc).

tannergooding · 2021-10-18T20:45:03Z

So, for example, things like stackalloc string[4] are likely better handled via ValueArray<string, 4> IMO, particularly if the consuming API is not generally accepting larger inputs.

Likewise, the floating-point formatting/parsing logic always allocates a 768 byte buffer and tracks the used length separately, so its likely beneficial to switch to ValueArray here as well so the JIT can potentially treat it as "known length" rather than "unknown length".

jkotas · 2021-10-18T20:54:19Z

The features do not require one another though, so value array does not need to wait for generic constants.

The generic constants can be be used to represent the value array size like ValueArray<E, int Length> and ELEMENT_TYPE_VALUEARRAY is not needed at all.

stephentoub · 2021-10-18T20:57:07Z

So, for example, things like stackalloc string[4] are likely better handled via ValueArray<string, 4> IMO

The cited example was:
https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexReplacement.cs#L41-42

FourStackStrings stackStrings = default;
var strings = new ValueListBuilder<string>(MemoryMarshal.CreateSpan(ref stackStrings.Item1!, 4));

Once stackalloc string[length] is a thing, I would write that as:

var strings = new ValueListBuilder<string>(stackalloc string[4]);

Can you help me understand why this specific case would a) be better with ValueArray and b) not be something the JIT could optimize just as well either way? I'm not seeing it.

…ixed` statements.

VSadov · 2021-10-19T00:09:44Z

We should instead focus on finding good examples in dotnet/runtime where stackalloc couldn't be used, e.g. places where we'd embed ValueArray in other types, where we'd use it in lambdas and async methods, etc.

So, should we merge this then and start using it?
Is there anything blocking in the actual implementation?

tannergooding · 2021-10-19T00:15:55Z

Can you help me understand why this specific case would a) be better with ValueArray and b) not be something the JIT could optimize just as well either way? I'm not seeing it.

I'm not sure if this particular case is more beneficial as ValueArray because its going through an indirection via ValueListBuilder and so not being consumed directly.

In either case, I'd hope the syntax for using value array ends up being decently trivial itself.

VSadov · 2021-10-19T01:29:05Z

The syntax is a relatively easy part ( relatively ;-) - compared to the runtime part).
Roslyn has a long history mapping nice syntax to underlying types - nullable, tuples, arrays, delegates, even primitive types like int.

It is work in progress (dotnet/roslyn#57286)

It would allow syntax like:

The tooltips reveal the underlying type, but will show int[4] too, eventually....

stephentoub · 2021-10-19T02:26:26Z

So, should we merge this then and start using it?

I'll defer to @davidwrighton. From my perspective, this is an experimental branch and merging this into it is fine.

VSadov · 2021-10-20T15:50:03Z

BTW, I noticed that JIT is very good at eliding and hoisting range checks when working with value array.

I.E. Since the length is statically known, arr[4] will omit the range check if the Length is 5.
Loops like for(int i = 0, i < arr.Length; i++) {DoSomething(arr[i]); } will not have any range checks either.

I was thinking that these could be good optimizations, but it looks like JIT does that without any additional work.
You get both safety and optimal performance.

VSadov · 2021-10-21T16:52:48Z

Talked to @davidwrighton - and we think it is ok to merge this.

dotnet-issue-labeler bot added the new-api-needs-documentation label Oct 17, 2021

AaronRobinsonMSFT changed the title ~~ValueArray - a compliment type to the Span, which owns its data without indirections.~~ [hackathon] ValueArray - a compliment type to the Span, which owns its data without indirections. Oct 18, 2021

VSadov mentioned this pull request Oct 18, 2021

[hackathon] Support for ValueArray in C# escape analysis. dotnet/roslyn#57218

Merged

ValueArray<T> that has its field repeated 64 times

dffd749

VSadov added 13 commits October 18, 2021 15:37

Update ValueArray API, add range checks.

3501a7b

less hacky ValueArray detection

4b3aa52

Use MD array to encode Length

de2dfdb

Tweak ValueArray so that range checks could be inlined/omitted by JIT

4c75057

comments

f8bf0f0

Make the Element0 field private

6a24f6b

Make API more similar to Span.

0005493

add a proper test

ca84c39

Add a field test

55c5295

bug fixes

230bb8a

Value semantics should consider all elements.

84b784b

Add a test for use with pointers/fixed

c1d96a9

VSadov mentioned this pull request Oct 19, 2021

[hackathon] [testonly] ValueArray - a compliment type to the Span, which owns its data without indirections. #60608

Closed

VSadov added 2 commits October 20, 2021 11:13

A few test fixes to match the changes.

e5019b3

Disable ValueArray tests on mono until implemented.

b81da28

VSadov force-pushed the fixArrHacks branch from 199df4b to b81da28 Compare October 20, 2021 18:39

VSadov requested a review from MichalStrehovsky as a code owner October 20, 2021 18:39

VSadov requested review from davidwrighton and removed request for MichalStrehovsky October 20, 2021 18:40

VSadov merged commit 1bca35b into dotnet:feature/lowlevelhackathon Oct 21, 2021

VSadov deleted the fixArrHacks branch October 21, 2021 16:52

jnm2 mentioned this pull request Nov 1, 2021

[Proposal]: Collection expressions (VS 17.7, .NET 8) dotnet/csharplang#5354

Closed

4 tasks

ghost locked as resolved and limited conversation to collaborators Nov 20, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[hackathon] `ValueArray` - a compliment type to the `Span`, which owns its data without indirections. #60519

[hackathon] `ValueArray` - a compliment type to the `Span`, which owns its data without indirections. #60519

VSadov commented Oct 17, 2021

dotnet-issue-labeler bot commented Oct 17, 2021

dotnet-issue-labeler bot commented Oct 17, 2021

VSadov commented Oct 17, 2021

jkotas commented Oct 17, 2021

VSadov commented Oct 18, 2021

jkotas commented Oct 18, 2021

jkotas commented Oct 18, 2021

jkotas commented Oct 18, 2021

VSadov commented Oct 18, 2021

tannergooding commented Oct 18, 2021 •

edited

Loading

VSadov commented Oct 18, 2021

VSadov commented Oct 18, 2021 •

edited

Loading

tannergooding commented Oct 18, 2021

VSadov commented Oct 18, 2021

stephentoub commented Oct 18, 2021 •

edited

Loading

jkotas commented Oct 18, 2021

VSadov commented Oct 18, 2021 •

edited

Loading

VSadov commented Oct 18, 2021

stephentoub commented Oct 18, 2021

VSadov commented Oct 18, 2021

VSadov commented Oct 18, 2021

stephentoub commented Oct 18, 2021 •

edited

Loading

tannergooding commented Oct 18, 2021

tannergooding commented Oct 18, 2021

jkotas commented Oct 18, 2021

stephentoub commented Oct 18, 2021

VSadov commented Oct 19, 2021

tannergooding commented Oct 19, 2021

VSadov commented Oct 19, 2021 •

edited

Loading

stephentoub commented Oct 19, 2021

VSadov commented Oct 20, 2021 •

edited

Loading

VSadov commented Oct 21, 2021

[hackathon] ValueArray - a compliment type to the Span, which owns its data without indirections. #60519

[hackathon] ValueArray - a compliment type to the Span, which owns its data without indirections. #60519

Conversation

VSadov commented Oct 17, 2021

dotnet-issue-labeler bot commented Oct 17, 2021

dotnet-issue-labeler bot commented Oct 17, 2021

VSadov commented Oct 17, 2021

jkotas commented Oct 17, 2021

VSadov commented Oct 18, 2021

jkotas commented Oct 18, 2021

jkotas commented Oct 18, 2021

jkotas commented Oct 18, 2021

VSadov commented Oct 18, 2021

tannergooding commented Oct 18, 2021 • edited Loading

VSadov commented Oct 18, 2021

VSadov commented Oct 18, 2021 • edited Loading

tannergooding commented Oct 18, 2021

VSadov commented Oct 18, 2021

stephentoub commented Oct 18, 2021 • edited Loading

jkotas commented Oct 18, 2021

VSadov commented Oct 18, 2021 • edited Loading

VSadov commented Oct 18, 2021

stephentoub commented Oct 18, 2021

VSadov commented Oct 18, 2021

VSadov commented Oct 18, 2021

stephentoub commented Oct 18, 2021 • edited Loading

tannergooding commented Oct 18, 2021

tannergooding commented Oct 18, 2021

jkotas commented Oct 18, 2021

stephentoub commented Oct 18, 2021

VSadov commented Oct 19, 2021

tannergooding commented Oct 19, 2021

VSadov commented Oct 19, 2021 • edited Loading

stephentoub commented Oct 19, 2021

VSadov commented Oct 20, 2021 • edited Loading

VSadov commented Oct 21, 2021

[hackathon] `ValueArray` - a compliment type to the `Span`, which owns its data without indirections. #60519

[hackathon] `ValueArray` - a compliment type to the `Span`, which owns its data without indirections. #60519

tannergooding commented Oct 18, 2021 •

edited

Loading

VSadov commented Oct 18, 2021 •

edited

Loading

stephentoub commented Oct 18, 2021 •

edited

Loading

VSadov commented Oct 18, 2021 •

edited

Loading

stephentoub commented Oct 18, 2021 •

edited

Loading

VSadov commented Oct 19, 2021 •

edited

Loading

VSadov commented Oct 20, 2021 •

edited

Loading