Implement ICollection<T> on several LINQ iterators #88249

Merged
merged 1 commit into from
Jul 1, 2023

Conversation

stephentoub
Member

@stephentoub stephentoub commented Jun 30, 2023

This adds `ICollection<T>` implementations to several LINQ iterator types. This allows operations like `list.AddRange(enumerable)` to be significantly more efficient, as `AddRange` queries for `ICollection<T>` and then uses its `Count`/`CopyTo` methods to directly populate the backing array.
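
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of fast path described above. `ToBuffer` is a hypothetical helper written for illustration, not the actual `List<T>.AddRange` source; it only shows how a consumer can use `Count` and `CopyTo` when the source happens to implement `ICollection<T>`, and fall back to element-by-element enumeration otherwise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (an assumption for illustration, not List<T>'s code):
// copies any sequence into an array, preferring the ICollection<T> fast path.
static T[] ToBuffer<T>(IEnumerable<T> source)
{
    if (source is ICollection<T> collection)
    {
        // Fast path: the size is known up front, so allocate once
        // and let the source copy itself into the array.
        var buffer = new T[collection.Count];
        collection.CopyTo(buffer, 0);
        return buffer;
    }

    // Slow path: the size is unknown, so grow a list while enumerating.
    var list = new List<T>();
    foreach (T item in source)
    {
        list.Add(item);
    }
    return list.ToArray();
}

// An array implements ICollection<int>, so this call uses the fast path.
int[] fromArray = ToBuffer(new[] { 1, 2, 3 });

// A filtered sequence has no known count up front; whichever path is taken,
// the resulting contents are the same.
int[] fromIterator = ToBuffer(Enumerable.Range(0, 5).Where(i => i % 2 == 0));

Console.WriteLine(string.Join(",", fromArray));
Console.WriteLine(string.Join(",", fromIterator));
```

The fast path avoids repeated resizing and per-element interface calls, which is where the efficiency gain described above would come from.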

I think the first commit here is pure goodness. It handles `Enumerable.Empty`, `Enumerable.Repeat`, `Enumerable.Range`, and subsetting operations of `IList<T>`, e.g. `Take`/`Skip`.

The second commit is more valuable but may also need a little more scrutiny. It adds support for `Enumerable.Select` over arrays, lists, and ranges. This makes operations like `list.AddRange(array.Select(i => ...))` much more efficient. However, because `ICollection<T>` provides `Contains`, invoking `ICollection<T>.Contains` will also invoke the selector function for every element examined. That's no different from using `Enumerable.Contains`, but in theory someone might be doing something like `if (source is ICollection<T> c) { ... issue multiple c.Contains calls }`, in which case they might silently start running the selector multiple times where they weren't previously.
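
A small sketch of the repeated-selector behavior mentioned above, using only `Enumerable.Select` and `Enumerable.Contains`. The counter and the sample values are assumptions made for illustration; the point is just that every `Contains` call over a projected sequence re-runs the selector for each element it examines.

```csharp
using System;
using System.Linq;

int selectorCalls = 0;
int[] array = { 1, 2, 3 };

// The selector bumps a counter so that each invocation can be observed.
var projected = array.Select(i => { selectorCalls++; return i * 10; });

// Neither value is present in the projection, so each Contains call
// examines every element and runs the selector once per element.
bool first = projected.Contains(-1);
bool second = projected.Contains(-2);

Console.WriteLine($"selector ran {selectorCalls} times");
```

With three elements and two lookups for absent values, the selector runs six times. Code that type-tests for `ICollection<T>` and then calls `Contains` repeatedly would see the same multiplication if a `Select` iterator started satisfying that test.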

@stephentoub stephentoub added area-System.Linq tenet-performance Performance related issue labels Jun 30, 2023
@stephentoub stephentoub added this to the 8.0.0 milestone Jun 30, 2023
@ghost ghost assigned stephentoub Jun 30, 2023
@ghost

ghost commented Jun 30, 2023

Tagging subscribers to this area: @dotnet/area-system-linq
See info in area-owners.md if you want to be subscribed.

Member

@eiriktsarpalis eiriktsarpalis left a comment

Thanks

`ICollection<T>` provides both a Count and a CopyTo, and `IList<T>` an indexer, all of which can make various consumption mechanisms more efficient. We only implement the interfaces when the underlying collection has a fixed size and all of the interface implementations are side-effect free (in particular, while appealing to do so, we don't implement them on various Select iterators).
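
To make the "fixed size and side-effect free" criterion concrete, here is a hedged sketch of a sequence type that satisfies it. `RepeatCollection<T>` is a hypothetical type invented for this illustration; it is not the iterator that `Enumerable.Repeat` actually returns. Its count never changes, and `Contains` and `CopyTo` run no user code, so exposing `ICollection<T>` cannot alter observable behavior.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration only (not System.Linq's implementation).
public sealed class RepeatCollection<T> : ICollection<T>
{
    private readonly T _value;
    private readonly int _count;

    public RepeatCollection(T value, int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        _value = value;
        _count = count;
    }

    // Fixed size: known without enumerating.
    public int Count => _count;

    public bool IsReadOnly => true;

    // Side-effect free: a single equality check, no user delegates invoked.
    public bool Contains(T item) =>
        _count > 0 && EqualityComparer<T>.Default.Equals(_value, item);

    // Bulk copy: Array.Fill validates the array and range arguments.
    public void CopyTo(T[] array, int arrayIndex) =>
        Array.Fill(array, _value, arrayIndex, _count);

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; i < _count; i++)
        {
            yield return _value;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // The sequence is read-only, so mutating members are unsupported.
    public void Add(T item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Remove(T item) => throw new NotSupportedException();
}
```

Passing `new RepeatCollection<int>(7, 3)` to `List<int>.AddRange` would let the list read `Count` and call `CopyTo` instead of enumerating. A `Select`-style iterator fails the criterion because its `Contains` and `CopyTo` would have to invoke the caller's selector.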

Some of the serialization tests need to be fixed as a result. The state of Queue's array is a bit different based on how it's initialized, and such private details show up in BinaryFormatter output. Rather than special-casing the output per .NET Framework and .NET Core, I've just changed the test itself to ensure Queue can't see the size of the input collection.
@stephentoub stephentoub merged commit 0d77cf0 into dotnet:main Jul 1, 2023
@stephentoub stephentoub deleted the linqcollection branch July 1, 2023 22:14
@ghost ghost locked as resolved and limited conversation to collaborators Aug 1, 2023
Labels
area-System.Linq tenet-performance Performance related issue
2 participants