Operational Semantics for Generators #367
Comments
Although this is certainly very far from "pain free," I was surprised that this seems to be possible while requiring relatively few changes to the rest of our semantics. Definitely at least a little jank though.
Also see the Zulip discussion here. Does your proposal summarize that discussion or is it a separate proposal? |
To my knowledge, that discussion did not lead to any ideas that were free of unsolved blocking issues. The idea above does not have any that I know about.
Interesting thought I just had: This does not only open the door for an opsem for That would in turn open the door to pre-generator-lowering inlining of async blocks in cases like this:
async {
    async { foo }.await
}
Which could yield massive wins for compile-time and run-time performance.
One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB (e.g. someone just writing into the middle of a Also I don't quite understand how this solves the problem of storing locals in the machine-visible generator state. The document discusses this before discussing what all the state even looks like (which is confusing), and also somehow associates this with stack frames? I don't see how this is a stack frame thing, shouldn't that |
Eh, sorry, I realize I didn't end up actually writing this anywhere, but I intended that when we pop a protector off a borrow stack, we check the frames in the generator table as well as those currently on the call stack. This will make all such writes and reads UB if the generator has not been polled to completion.
Indeed,
It does not, it just sidesteps the problem. We 1) prevent the generator state from being accessed while the generator is suspended and 2) allow the allocations for locals to have addresses that overlap that state. Storing the locals in the generator state is then a sound "optimization" |
If we don't make generators movable, I think we can just make all accesses to the given region via the "outer" AllocId insta-UB, with no Stacked Borrows shenanigans required. The actual stack frame will use a different AllocId (one per local), and only those are allowed to access this memory range.
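To make the "outer AllocId is insta-UB" idea a bit more concrete, here is a minimal toy model. It is only a sketch under assumptions: `AllocId`, `Verdict`, `GeneratorRegion`, and `access` are hypothetical names invented for illustration, not anything from Miri or an existing spec, and the model ignores provenance, retags, and suspension state entirely.

```rust
use std::ops::Range;

// Toy model only: all names here are hypothetical and exist just for this sketch.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct AllocId(u32);

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Allowed,
    Ub,
}

struct GeneratorRegion {
    // The allocation the outside world sees (the generator value itself).
    outer: AllocId,
    // One allocation per local, each overlapping part of the generator state.
    locals: Vec<(AllocId, Range<usize>)>,
}

impl GeneratorRegion {
    // Classify an access to byte `offset` of the state made through `via`.
    fn access(&self, via: AllocId, offset: usize) -> Verdict {
        // Any access through the outer allocation is immediately UB.
        if via == self.outer {
            return Verdict::Ub;
        }
        // A local's allocation may only touch the bytes assigned to that local.
        let in_bounds = self
            .locals
            .iter()
            .any(|(id, range)| *id == via && range.contains(&offset));
        if in_bounds {
            Verdict::Allowed
        } else {
            Verdict::Ub
        }
    }
}

fn example() -> GeneratorRegion {
    GeneratorRegion {
        outer: AllocId(0),
        locals: vec![(AllocId(1), 0..8), (AllocId(2), 8..12)],
    }
}

fn main() {
    let g = example();
    println!("outer write at 4:   {:?}", g.access(AllocId(0), 4));
    println!("local 1 write at 4: {:?}", g.access(AllocId(1), 4));
}
```

The point of the sketch is only that the check depends on which allocation the access goes through, not on the address, which is what lets the locals overlap the generator state.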
Maybe, but I'd want to be very careful about inventing a new way that things can be UB. On the one hand, we can't lock the memory down completely, since I think we still want to allow retags of the |
We can allow reads and make them return
The way I am imagining the aliasing model will change, a pointer will only really become 'unique' once it is written through. (This is 2-phase borrows.) Any write to this special region of memory via the wrong AllocId is UB. Ergo, there cannot be an "activated" unique pointer for this region of memory. The problem with protectors is: which reference's tag should be used for that? We had better make sure that tag (or any tag derived from it) is never leaked to unknown code, because if that ever happens, that code would be allowed to read and write this memory region.
On the first |
Also, thought about this more, I'm not so sure this suffices. Presumably we still want to activate 2PB references upon being passed as an argument. Maybe having
The more I think about it, I think the main contribution of this proposal is that it demonstrates a way for us to never have to admit that we store locals within the generator state while handling all the shenanigans around address observability. The question of how we ban writes to the generator state then becomes secondary (details aside, SB or some other mechanism are all in principle fine). Now that I've said this out loud, I realize I actually also have an idea for how we can support |
I am actually not sure about that -- also see the Zulip discussion here. Not activating does allow a lot more code and only loses optimizations whose benefit is rather speculative.
I don't understand this proposal, so I'll wait until you have fleshed it out a bit more. ;) The key challenge is preventing the user from reverting the generator back to an old state (by memcpying a previous state over the current state), and from mixing multiple states (by copying half of a previous state over the current state). Solutions have been proposed on Zulip that achieve this, but they are not pretty.
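For readers who want to see what "reverting to an old state" means in practice, here is a small sketch. It is an assumption-laden model, not real generator code: `Counter` and `resume` are hypothetical names for a hand-written state machine, and the state is made `Copy` so that the byte-wise copy can be shown in safe code (a real generator would need an unsafe memcpy to do the same thing).

```rust
// Hypothetical hand-written generator state; names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Counter {
    Step(u32),
    Done,
}

impl Counter {
    // Yields 0, 1, 2 and then finishes.
    fn resume(&mut self) -> Option<u32> {
        match *self {
            Counter::Step(n) if n < 3 => {
                *self = Counter::Step(n + 1);
                Some(n)
            }
            Counter::Step(_) => {
                *self = Counter::Done;
                None
            }
            Counter::Done => None,
        }
    }
}

fn rewind_demo() -> Vec<Option<u32>> {
    let mut g = Counter::Step(0);
    let mut out = vec![g.resume()]; // first resume
    let snapshot = g;               // copy of the suspended state
    out.push(g.resume());           // second resume
    g = snapshot;                   // overwrite the current state with the old one
    out.push(g.resume());           // replays the second resume
    out
}

fn main() {
    println!("{:?}", rewind_demo());
}
```

In this model the third resume repeats the value of the second one, which is exactly the kind of replay an opsem for generators has to rule out (or declare UB) if locals are supposed to behave like ordinary locals.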
Have they? Do you have a link? (To be clear, I'm aware there were proposals; I'm not aware of any that work.)
See the discussion starting here, in particular around here. It's just a sketch but I think it can be made to work. |
Rereading the conversation, it is interesting to see some of the ideas I had forgotten about. That being said, I don't see how anything suggested there can be made to work. First, there is the address equality issue for the newly created locals. I was able to avoid this here, but especially with But even then, I am fairly sure that any strategy which allocates future locals normally cannot be made to work with this. Specifically, you need to justify this being a data race:
let fut = async {
    let mut x = 0;
    let p = addr_of_mut!(x);
    thread::spawn(|| loop { *p = 1; });
    yield;
};
// Skip some pins and pointer casts
fut.poll();
forget(ptr::read(addr_of!(fut)));
loop {}
You can't make the I've read through that thread again and if there were any proposals for memory that behaves weirdly enough to make this work, I did not understand them.
Oh yeah, we did not consider other threads, I think. So basically, the locals that have their address assigned inside the future itself also have to have their data race semantics applied there.
So one thing to note about this proposal is that it makes the One option that we have for
let fut = async {
    let mut x = 0;
    let p = addr_of_mut!(x);
    thread::spawn(|| loop { *p = 1; });
    yield;
};
Because
I don't quite think so? We already have language UB if you make the type |
Fair, although I was operating under the assumption that that was eventually going to go away in favor of |
As an async note before the meeting, lccc's ABI already specs one optimization with regard to external captures:
let x = 5_u8;
async {
    use_value(x);
}
would presently store the value 5 in its value representation (it wouldn't elide the 5, notably, but it's storing it by-move, despite it being captured by-ref). I would personally like for at least this optimization to be permitted (storing copies instead of references when the address is not material). (As a note, this optimization is spec'd on closures, but it defines generator and future layout in terms of enums of closures presently; some work is needed to ensure local variable stability, and it reuses the capture-type rules from that section.)
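As a rough illustration of why storing a copy instead of a reference matters for layout, the sketch below compares two closures that use the same `u8`. This uses closures rather than async blocks, and it relies on assumptions: closure layout is unspecified, so the sizes are only what current rustc happens to produce on typical targets, and `use_value` and `capture_sizes` are hypothetical helpers written for this example. It says nothing about how lccc implements the optimization.

```rust
use std::mem::size_of_val;

// Hypothetical helper standing in for the `use_value` in the snippet above.
fn use_value(v: u8) -> u8 {
    v
}

// Returns (size of the by-ref closure, size of the by-move closure).
fn capture_sizes() -> (usize, usize) {
    let x = 5_u8;
    // Non-`move` closure: a `Copy` value used by value is captured by shared reference.
    let by_ref = || use_value(x);
    // `move` closure: the `u8` itself is stored in the closure.
    let by_move = move || use_value(x);
    // Both closures still compute the same result.
    assert_eq!(by_ref(), by_move());
    (size_of_val(&by_ref), size_of_val(&by_move))
}

fn main() {
    let (by_ref, by_move) = capture_sizes();
    println!("by-ref capture: {by_ref} bytes, by-move capture: {by_move} bytes");
}
```

When the address of `x` is never observed, the two closures are indistinguishable apart from their size, which is the situation where replacing the reference with a copy would be desirable.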
We need to spec what generators are operationally. One option is to simply include the current generator lowering that rustc does in the spec. This has a number of downsides though, the biggest one being that it significantly inhibits optimizations. Specifically, for an async block like this:
We would like to justify two optimizations:
1) Const-propagating 5 into the dbg! invocation.
2) The Future that results from this async block having size less than size_of::<u64>().
If our spec simply says that async blocks are subjected to the standard generator lowering, neither of these optimizations is correct. The &mut Future that the caller of poll has when the .await point is hit might be used to modify x, so we cannot const-prop, and x is clearly alive across an await point, and so we must allocate space for it in the future.
Specifically, we are looking to prove two theorems:
_ret = yield(_arg) [resume => bbA, drop => bbB] as a _ret = call UNKNOWN_FN(_arg) [return => bbA, unwind => bbB]. Here UNKNOWN_FN is some unknown function that is unique to this yield point.
cc @tmandry and @saethlin, with whom I talked about this issue at some length during RustConf.
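The problem with speccing the standard lowering can be sketched with a hand-written state machine. This is only a model under assumptions: `Lowered`, `resume`, `tamper`, and `run` are hypothetical names, the enum merely stands in for what a lowering might produce for a block that binds `x = 5`, suspends once, and then reads `x`, and it is not rustc's actual output.

```rust
// Hypothetical stand-in for a lowered async block; not rustc's real lowering.
enum Lowered {
    Start,
    // `x` is live across the suspension point, so it lives in the state.
    Suspended { x: u64 },
    Done,
}

impl Lowered {
    // Returns `Some(x)` once the body has run to completion.
    fn resume(&mut self) -> Option<u64> {
        match *self {
            Lowered::Start => {
                *self = Lowered::Suspended { x: 5 };
                None
            }
            Lowered::Suspended { x } => {
                *self = Lowered::Done;
                Some(x) // reads whatever is stored, not the constant 5
            }
            Lowered::Done => panic!("resumed after completion"),
        }
    }
}

// A caller holding `&mut Lowered` between resumes can rewrite the saved local.
fn tamper(state: &mut Lowered) {
    if let Lowered::Suspended { x } = state {
        *x = 7;
    }
}

fn run(tampering: bool) -> u64 {
    let mut g = Lowered::Start;
    assert!(g.resume().is_none());
    if tampering {
        tamper(&mut g);
    }
    g.resume().expect("second resume completes")
}

fn main() {
    println!("untampered: {}", run(false));
    println!("tampered:   {}", run(true));
}
```

If the lowering itself is the spec, the tampered run is a legal program, so replacing the read of `x` with the constant would change observable behavior, and the state cannot be smaller than a `u64`. Treating the yield as a call to an unknown function is meant to recover the reasoning one would use for an ordinary local.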