-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Scoped threads, take 2 #1084
Conversation
Note that this is just a sketch: the use of a linked list here, and `RefCell` | ||
internally, could both be switched for something more interesting. | ||
|
||
# Drawbacks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More than the ergonomics, the fundamental drawback of this approach is that it is an inversion of the normal RAII flow: functions acquiring some resource return a guard (representing the resource itself) which can be dropped to release the resource. If you acquire an RAII resource internally, you can provide the same API by forwarding the guard.
With APIs like the above, this is reversed - to use an RAII resource, you accept a guard as an argument rather than returning it, and you cannot acquire the resource internally and return it, you must have it passed to you.
This is not necessarily a death blow, just a significant drawback in terms of consistency. For instance, the original thread::scoped
API meant that thread::spawn
was really just a special case of scoped
where 'a
is 'static
and the default behavior of the guard is different. This API is fundamentally different, and encourages totally different patterns.
I think we should consider this even if |
This pattern seems more general than just "thread::scoped": there's no reason why the "scope" functionality should be part of "thread". What if there was a general purpose "scope" method, and then a "thread::spawn_scoped" method which took the scope as the first parameter. That way you can reuse the functionality for other APIs/attach your own callbacks to run at the end of the scope. |
|
||
# Unresolved questions | ||
|
||
Is there any reason that `Scope` should be `Send` or `Sync`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recursion without defining new scopes and unnecessarily tying up system resources by keeping old threads around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see any reason for Send
(we never get an owned Scope
so we can't possibly send it anywhere) but we definitely want Sync
, so we can pass the Scope
to a fork-join thread in case it needs to conditionally spawn more threads (or otherwise register "finalizers" that work on shared stack data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add that the correct question should be "is there a reason not to make Scope
Send
and/or Sync
." Rust should deliver maximum flexibility for concurrency, so long as it can do so safely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pythonesque Fair point, although if making it Send
and/or Sync
requires using more expensive operations (e.g. it's internally mutable through a &
-reference so any mutations would need to be thread-safe), then it's a valid question to ask.
I don't like being stuck on an ugly API like this one forever just because versioon 1.0 didn't provide a way to have a good API. Imagine in five years someone asking "why is thread::scoped so weird?" and having to answer that it's because of some language features were missing at that time. |
Languages evolve. There are enormous portions of the current standard library that would look different if they had been designed in the presence of future language features. That doesn't mean we're going to delete e.g. We already have many "destined to be deprecated" APIs like |
A general 👍 to this. I hope someday we can recover the more "straightforward" RAII style API, but this provides a good solution in the short term. |
I really like how the same guard can be reused by any library which needs hard RAII requirements. |
I really like the design, just have two small comments: |
I prefer this proposal to other two. It might be because coming from Haskell, such a pattern is very common in API which manage resources with a limited lifetime. Haskell’s syntax makes that pattern very viable too, but I digress. Several notes in response to both comments and RFC: While the pattern is itself general, generalising I also don’t think this is an ugly/unergonomic API. It is just a pattern which is uncommon in the Rust standard library (at the moment). Also, I’d want that // this would make some sense from API standpoint, not sure about implementation
thread::Builder::new().stack_size(1024).scoped(|s|{
for i in 0..10 {
s.name(format!("worker-{}", i)).spawn(||{ … });
}
}); |
I think being able to use more-natural RAII patterns rather than |
Additionally to feeling more natural and simpler (but that might be subjective), RAII style scoped threads have one another advantage – it's really straightforward to return values from from threads, check their exit status, etc. I guess it would be a lot more bloated with this approach. And I'm pretty sure we'll find some major limitations to the closure based approach as we found in internal iterators. |
this is a much more complex, non-straightforward and uncommon api design, than the old |
This pattern is actually more similar to Haskell's ST monad than to the with-pattern, despite the actual use not really needing HRTB. |
The scope guard is only needed here to ensure the lifetime bound - you create a single scope guard, and can use it multiple times. |
Alternative is having neither.
“Forgetting” a |
What happens if I do this? use std::thread::{self, JoinHandle};
let wrong_handle: JoinHandle<&i32> = thread::scope(|s| {
let inner_handle: JoinHandle<&i32> = s.spawn(|| { 42 });
inner_handle
// the spawned thread should be joined here
};
// but I can still access the spawned thread here!
drop(wrong_handle);
// Oops. I detached a thread that is supposedly scoped.
// Or alternatively, this:
// wrong_handle.join().unwrap();
// Oops. I joined a thread twice. My suggestion is to
impl<'defer> Scope<'defer> {
pub fn spawn<'body, F, T>(&self, f: F) -> thread::JoinHandle<T, 'body> where
'defer: 'body, // anything borrowed by the body must outlive the Scope
F: FnOnce() -> T + Send + 'body,
T: Send;
} |
@nagisa the alternative is not neither, and it's a false dilemma to phrase the problem that way. |
@theemathas The whole point of this proposal is that you only get a handle, not a guard, so what you do with it isn't important. The scope ensures that the thread has joined by the time it exits, with no involvement from the handle. The handle is just a way to access the return value of the thread if you want. Whether that happens before the scope exits or after can hardly make a difference. The thread could've exited already once you access it either way. |
I agree with @kballard and others that having a distinct return type that does not use FWIW, I am also leaning toward @nikomatsakis's suggestion (made elsewhere) that we should probably not stabilize any of this soon, in any case -- we should iterate on it in crates.io before bringing it into |
@kballard Yes, that makes sense. I hadn't thought about it in terms of introducing panicking into the JoinHandle API. Having separate types did seem like the better option. I was trying to show that there might be a tradeoff involved, but that makes it more clear cut. |
Would the following be allowed by the proposed API? // using the proposed thread::scope API
fn new_trpl_example2() {
thread::scope(|s| {
s.spawn(|| {
s.spawn(|| { /* some code */ });
/* some more code */
});
}
})
} This is trying to use the "outer" scope within an "inner" thread. The result is that the innermost thread could potentially outlive the "middle" thread. The reason I am asking is that if this is not allowed, then it seems like the scoped-spawn API restricts concurrency to "series-parallel" concurrency (which, I believe, the old scoped API did). That would have some very nice theoretical benefits and simply some analysis (e.g., https://cims.nyu.edu/~ejk/papers/crd.pdf). |
@RalfJung That code would be invalid if |
Which is a question that the RFC leaves unanswered. So I guess we'll have to wait till that point settles. |
@RalfJung That's why I said earlier that it should be |
I might have an application of this API in a single threaded async case. Imagine you're writing bindings to a foreign library (e.g. GTK) that supports callbacks like this: extern fn connect(instance: *mut Object,
callback: extern fn(receiver: *mut Object, data: *mut ()),
data: *mut (),
destroy_data: extern fn(data: *mut())) -> usize;
extern fn disconnect(handle: usize); You could support using a To support |
I would like to propose an amendment to this RFC in order to reduce the difference between
impl<'a> Scope<'a> {
pub fn spawn<F, T>(&self, f: F) -> thread::ScopedJoinHandle<'a, T> where
F: FnOnce() -> T + Send + 'a,
T: Send + 'a;
}
impl<T: Send + 'static> ScopedJoinHandle<'static, T> {
pub fn detach(self) -> JoinHandle<T> {...}
}
impl<'a> Scope<'a> {
pub fn attach<T: Send + 'static>(&self, handle: JoinHandle<T>) -> ScopedJoinHandle<'static, T> {...}
}
|
Oh, I hadn't considered that limitation. Adding a |
@theemathas what would attaching a JoinHandle to a scope mean? That it's automatically joined once the scope exits? @aturon I'm quite happy with the API, apart from the part where we require allocation. Would it be possible to add an implementation of FixedScope that can only spawn a limited number of threads (possibly giving up with Note that it may even be possible to create a |
@llogiq: Alternatively, there could be a library that allows caching the threads and allocations, maybe thats good enough? |
@Kimundi: Probably yes. FixedScope would be incompatible to Scope anyway, so we're not losing any extensibility (that is, code using Scope would have to be changed to use FixedScope wherever needed). I don't expect Scope to be used much in |
@llogiq I don't immediately buy that the allocation is a major impact. Not compared to creating new OS threads. But it does bring to mind -- to what extent does this RFC support scoped thread pools, how can they be built upon this? |
@bluss: Good point (though it may somewhat depend on the OS). A ScopedThreadPool would limit the number of concurrently spawned threads by construction (and thus make the limitation of a theoretical FixedScope a moot point). |
I actually have a prototype implementation of scoped with a thread pool: https://github.com/Kimundi/scoped_thread_pool_cache/blob/master/src/lib.rs. Right now its memory unsafe though due to a bug mentioned further up in the comments that makes the scoped API as proposed here not work on todays Rust. |
If you can't afford allocation, |
I don't understand why you build struct DtorChain {
jgs: Vec<JoinGuard>
} |
Because statically sizing that Vec may not be feasible in all Casey; you'd need to prove an upper bound at compile time. For simple cases, it could be quite useful though, and there is a design space in conjunction with a thread pool that people currently explore out of tree if I'm correctly informed. |
@llogiq, notice that i don't put Am i missing something obvious? |
Yes. The JoinGuards may be statically sized. But a Vec usually isn't. To avoid heap allocation, you'd need a statically sized Array instead. |
True. I should have phrased it better. I was talking about the excess of scattered allocations caused by |
I'm going to close this RFC for the time being. We need to solve the soundness issue pointed out by @kballard at the language level first, and I'd also like to prototype this in an external crate before landing in std. |
The borrowck bug making this unsound seems to have been fixed by rust-lang/rust#27641. @kballard's prototype now fails to compile with the expected error:
|
```rust | ||
pub struct Scope<'a> { | ||
dtors: RefCell<Option<DtorChain<'a>>> | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's point out that Scope<'a>
must be invariant in 'a
(like it is in this sketch, since it is nested inside a Cell).
This RFC proposes an alternative to the
thread::scoped
API that does not relyon RAII, and is therefore memory safe even if destructors can be leaked.
This RFC was inspired by ideas from @nikomatsakis and @arielb1.
Rendered