Implement ForEachAsync in System.Collections.Concurrent #14088

Closed
davidfowl opened this issue Feb 9, 2015 · 12 comments
Labels: api-needs-work (API needs work before it is approved; it is NOT ready for implementation), area-System.Collections

@davidfowl
Member

@stephentoub has some great blog posts about higher-level operations that can be implemented entirely with public APIs that exist today. Some of these would be great to see in the library itself:

http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx

In particular

public static Task ForEachAsync<T>(this IEnumerable<T> source, Func<T, Task> body) 
{ 
    return Task.WhenAll( 
        from item in source 
        select Task.Run(() => body(item))); 
}

And

public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body) 
{ 
    return Task.WhenAll( 
        from partition in Partitioner.Create(source).GetPartitions(dop) 
        select Task.Run(async delegate { 
            using (partition) 
                while (partition.MoveNext()) 
                    await body(partition.Current); 
        })); 
}
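
A minimal usage sketch for the `dop` overload above, assuming the body is asynchronous I/O-style work. The `Demo` class and `MeasurePeakConcurrencyAsync` are hypothetical names introduced only for illustration; the extension method is repeated from the snippet above so the example is self-contained. Because each worker awaits `body` before moving to its next item, the number of bodies in flight should not exceed `dop`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ForEachAsyncExtensions
{
    // Repeated from the snippet above: one worker task per partition.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }
}

public static class Demo
{
    // Hypothetical helper: runs 20 simulated I/O calls and returns the
    // highest number of bodies that were in flight at the same time.
    public static async Task<int> MeasurePeakConcurrencyAsync(int dop)
    {
        var gate = new object();
        int current = 0, peak = 0;

        await Enumerable.Range(0, 20).ForEachAsync(dop, async item =>
        {
            lock (gate) { current++; peak = Math.Max(peak, current); }
            await Task.Delay(20); // stand-in for real asynchronous work
            lock (gate) { current--; }
        });

        return peak;
    }
}
```

Calling `await Demo.MeasurePeakConcurrencyAsync(4)` should return a value between 1 and 4; the exact number depends on scheduling.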
@ellismg self-assigned this Feb 9, 2015
@ellismg
Contributor

ellismg commented Feb 13, 2015

@stephentoub I'm interested to get your take here. Are there things we can do that do not pull too much policy into the runtime (or places where we can hit the 99% case with some default choices for policy)?

@stephentoub
Member

I think it'd be a reasonable addition, though I think it'd make sense in a separate library of some kind, along with other more specialized combinators over Tasks. I've been thinking of starting such a project on https://github.com/dotnet/corefxlab. There is a fair amount of policy that would be included, e.g. what to do in the face of exceptions, how much parallelism to employ, whether that's a statement about the number of activities allowed to run concurrently or a statement on how much CPU-bound code to execute concurrently, etc. But I also think we could have some reasonable defaults.
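
One of the policy questions mentioned here, what to do in the face of exceptions, can be seen in the partition-based helper from the top of the thread. The sketch below is only an illustration: `ExceptionPolicyDemo` and `RunAsync` are hypothetical names, and the extension method is repeated so the block compiles on its own. In this variant a failing body faults only its own worker; the remaining workers keep draining the source, and `Task.WhenAll` reports the failure at the end.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncExtensions
{
    // Repeated from the first comment in the thread.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }
}

public static class ExceptionPolicyDemo
{
    // Hypothetical demo: item 0 fails, every other item is counted.
    public static async Task<(int processed, int failures)> RunAsync()
    {
        int processed = 0;

        Task all = Enumerable.Range(0, 100).ForEachAsync(4, async item =>
        {
            await Task.Yield();
            if (item == 0) throw new InvalidOperationException("item 0 failed");
            Interlocked.Increment(ref processed);
        });

        try { await all; }
        catch (InvalidOperationException) { /* expected: rethrown by await */ }

        // The combined task is faulted, yet the other workers kept going.
        return (processed, all.Exception.InnerExceptions.Count);
    }
}
```

Whether "keep going" or "stop everything on the first failure" is the right default is exactly the kind of choice a library version would have to make.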

@terrajobst assigned @stephentoub and unassigned @ellismg Sep 15, 2015
@terrajobst
Member

@stephentoub, what should be the next steps here?

@stephentoub
Member

> @stephentoub, what should be the next steps here?

There are a ton of "1-to-6-liners" like these that can be valuable. The trouble is, different folks need different things with slightly different behaviors here and there. The fact that there are multiple examples earlier in this thread (and there were more in the original blog post(s), plus more that could be written to address some of the considerations I called out in my previous response) highlights that there are lots of different variations folks could want.

In such cases, I'm inclined to put the examples out there and let folks piece the primitives together in whatever manner fits their needs best. As I noted, a corefxlab project could be built (with a NuGet package distributed for it) that contains a bunch of these, and folks can use that NuGet package as they see fit if the included implementations meet their needs. At some point we could promote some of those implementations into the core libraries if we found them to be particularly widely desired and used. I just don't know right now which would actually be important to include in the core libraries, and I'd hate to get it wrong and lead folks down the wrong path.

@davidfowl, are you using any of these as-is? Can you elaborate on which and in what scenarios? Is ASP.NET using them?

@davidfowl
Member Author

@stephentoub We use some of these today in DNU to do things like restore the specified projects in parallel (as an example).

@stephentoub
Member

> We use some of these

Which ones?

@clrjunkie

@stephentoub

Your blog posts on tasks and async/await, including the discussions in the comments, are indispensable.
A corefxlab repo containing your documented samples, summarizing the motivation and limitations, would be super valuable!! :)

@GSPP

GSPP commented May 8, 2017

I have used this primitive in a more fleshed-out form for years now. It should be possible to pick sensible policies and make this into a stable API. There are not that many policies to pick at all. When I wrote this, I found the design to be almost forced.

For example, the DOP (degree of parallelism) must be specified by the caller, since no good heuristics are possible for IO work. Even hill climbing would not work, because it might overwhelm the resource you are using or calling. A human must pick the DOP for IO.

If any exception occurs, abort the loop. Users who don't want that are free to catch exceptions themselves. Parallel.* already behaves this way.

There's also a need for cancellation and TaskScheduler support.
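
The policies listed so far (a caller-chosen DOP, abort on the first exception, cancellation) can be sketched roughly as follows. This is an assumption-laden illustration, not the implementation described in this comment: `ForEachAsyncSketch` and its signature are hypothetical, and `TaskScheduler` support is left out to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncSketch
{
    // Hypothetical signature; not an API from this thread or from the BCL.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> source,
        int dop,
        Func<T, CancellationToken, Task> body,
        CancellationToken cancellationToken = default)
    {
        if (dop < 1) throw new ArgumentOutOfRangeException(nameof(dop));

        // Linked source: cancelled by the caller's token or by the first failure.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        using IEnumerator<T> enumerator = source.GetEnumerator();
        var gate = new object();

        async Task WorkerAsync()
        {
            try
            {
                while (!cts.IsCancellationRequested)
                {
                    T item;
                    lock (gate) // the enumerator is shared, so guard MoveNext/Current
                    {
                        if (!enumerator.MoveNext()) return;
                        item = enumerator.Current;
                    }
                    await body(item, cts.Token).ConfigureAwait(false);
                }
            }
            catch
            {
                cts.Cancel(); // abort policy: a failure stops the other workers
                throw;
            }
        }

        // Exactly `dop` workers pull from the shared enumerator.
        Task[] workers = Enumerable.Range(0, dop)
            .Select(_ => Task.Run(() => WorkerAsync()))
            .ToArray();

        await Task.WhenAll(workers).ConfigureAwait(false);

        // Workers exit quietly when the caller cancels; surface that here.
        cancellationToken.ThrowIfCancellationRequested();
    }
}
```

Each worker starts its next item as soon as the previous one completes, so while items remain the number of bodies in flight stays at `dop` rather than dropping at chunk boundaries.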

Many users are using AsParallel or Parallel.* for this but all of those are problematic. For IO it is necessary to control the DOP exactly (a max or a min would not do).

Some users split their input into chunks and then process each chunk with Parallel.*. This fails to keep the DOP steady at runtime: it oscillates between the max and 1.

I think the community would benefit from a high-quality, well-done implementation of this. I see this need all the time on Stack Overflow.

The most awesome thing of all would be an open sourcing of the old ParallelExtensionExtras! @stephentoub It could give a home to this helper.

@stephentoub
Member

> The most awesome thing of all would be an open sourcing of the old ParallelExtensionExtras

Yeah, I really need to make time to do that. I'm adding it to my list.

@svick
Contributor

svick commented May 8, 2017

> The most awesome thing of all would be an open sourcing of the old ParallelExtensionExtras!

I think the way to go about doing that is:

  1. Open source the existing ParallelExtensionExtras code, either in a dedicated repo or in something like corefxlab.
  2. Update it to .NET Standard.
  3. Move worthwhile parts to corefx, going through the normal API review process.

Step 1 has to be done by someone from MS (because ParallelExtensionExtras is currently licensed under the Microsoft Limited Public License, which is not open source). The community could help with steps 2 and 3.

Also, if packages from step 2 are published to NuGet, then step 3 becomes less important.

@msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan added the untriaged label Feb 23, 2020
@eiriktsarpalis removed the untriaged label Mar 25, 2020
@eiriktsarpalis modified the milestones: 5.0, Future Mar 25, 2020
@eiriktsarpalis
Member

Echoing @stephentoub's comments, I think it might make sense to encourage users to author their own variations of the above helper methods. I'm going to close this issue.

@stephentoub
Member

This is a dup anyway of #1946, where much more discussion has happened.
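
For readers arriving at this thread later: .NET 6 and later include `Parallel.ForEachAsync`, which covers much of what was asked for here. The sketch below is a hedged usage example, not part of the original discussion; `ParallelForEachAsyncDemo` and `SumAsync` are hypothetical names, and the delay stands in for real asynchronous work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachAsyncDemo
{
    // Hypothetical demo: sums 1..10 with at most `maxDop` bodies in flight.
    public static async Task<int> SumAsync(int maxDop, CancellationToken cancellationToken = default)
    {
        int sum = 0;
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxDop,
            CancellationToken = cancellationToken
        };

        await Parallel.ForEachAsync(Enumerable.Range(1, 10), options, async (item, ct) =>
        {
            await Task.Delay(10, ct); // stand-in for real asynchronous work
            Interlocked.Add(ref sum, item);
        });

        return sum;
    }
}
```

Note that `MaxDegreeOfParallelism` is an upper bound on concurrency, which is one of the policy choices debated above.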

@ghost locked as resolved and limited conversation to collaborators Jan 7, 2021
10 participants