Operational Semantics for Generators #367

JakobDegen · 2022-10-11T02:16:26Z

We need to spec what generators are operationally. One option is to simply include the current generator lowering that rustc does in the spec. This has a number of downsides though, the biggest one being that it significantly inhibits optimizations. Specifically, for an async block like this:

async {
    let x = 5_u64;
    something.await;
    dbg!(x);
}

We would like to justify two optimizations:

Const propagating the 5 into the dbg! invocation.
The Future that results from this async block having size less than size_of::<u64>().

If our spec simply says that async blocks are subjected to the standard generator lowering, neither of these optimizations is correct. The &mut Future that the caller of poll has when the .await point is hit might be used to modify x, so we cannot const-prop, and x is clearly alive across an await point, and so we must allocate space for it in the future.

Specifically, we are looking to prove two theorems:

An optimization can soundly treat a _ret = yield(_arg) [resume => bbA, drop => bbB] as a _ret = call UNKNOWN_FN(_arg) [return => bbA, unwind => bbB]. Here UNKNOWN_FN is some unknown function that is unique to this yield point.
The generator lowering pass in MIR today is sound.

cc @tmandry and @saethlin with whom I talked about this issue at some length during rustconf

The text was updated successfully, but these errors were encountered:

JakobDegen · 2022-10-11T02:18:31Z

Proposal.

Although this is certainly very far from "pain free," I was surprised that this seems to be possible while requiring relatively few changes to the rest of our semantics to support. Definitely at least a little jank though.

RalfJung · 2022-10-11T08:03:17Z

Also see the Zulip discussion here. Does your proposal summarize that discussion or is it a separate proposal?

JakobDegen · 2022-10-11T14:30:18Z

To my knowledge that discussion did not lead to any ideas that did not have unsolved blocking issues. The idea above does not have any that I know about

JakobDegen · 2022-10-12T00:32:00Z

Interesting thought I just had: This does not only open the door for an opsem for yield, but also for an opsem for .await. I don't think that has any spec uses, but it might be desirable for rustc to preserve .awaits in MIR since they are much easier to analyze than the loop they turn into.

That would in turn open the door to pre-generator lowering inlining of async blocks in cases like this:

async {
    async { foo }.await
}

Which could yield massive wins for compile-time and run-time performance.

RalfJung · 2022-10-31T11:11:19Z

To my knowledge that discussion did not lead to any ideas that did not have unsolved blocking issues. The idea above does not have any that I know about

One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB (e.g. someone just writing into the middle of a struct Generator using unsafe code), but accesses to locals of the generator to that same memory are fine. After all this is "overlapping address space" so we better make sure only one of these is actually allowed to perform accesses. But if we ever want to support Unpin generators we do have to allow doing a full copy of the generator state to some other location. Some schemes were proposed on Zulip to achieve this but they got very complicated. This is where most of the discussion time on Zulip was spent IIRC so I am surprised to see the issue not even mentioned here.

Also I don't quite understand how this solves the problem of storing locals in the machine-visible generator state. The document discusses this before discussing what all the state even looks like (which is confusing), and also somehow associates this with stack frames? I don't see how this is a stack frame thing, shouldn't that generator_info be part of GeneratorSuspendState (except that I expected some state to be kept around also for the running generators).

JakobDegen · 2022-10-31T17:16:48Z

One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB

Eh, sorry, I realize I didn't end up actually writing this anywhere, but I intended that when we pop a protector off a borrow stacks, we check the frames in the generator table as well as those currently on the call stack. This will make all such writes and reads UB if the generator has not been polled to completion.

But if we ever want to support Unpin generators we do have to allow doing a full copy of the generator state to some other location.

Indeed, Unpin generators are unsupported by this and are going to be an interesting challenge.

Also I don't quite understand how this solves the problem of storing locals in the machine-visible generator state.

It does not, it just sidesteps the problem. We 1) prevent the generator state from being accessed while the generator is suspended and 2) allow the allocations for locals to have addresses that overlap that state. Storing the locals in the generator state is then a sound "optimization"

RalfJung · 2022-10-31T18:21:07Z

Indeed, Unpin generators are unsupported by this and are going to be an interesting challenge.

If we don't make generators movable, I think we can just make all accesses to the given region via the "outer" AllocId insta-UB -- no Stacked Borrows shenanigans required. The actual Stack Frame will use a different Alloc Id (one per local) and only those are allowed to access this memory range.

JakobDegen · 2022-10-31T18:59:36Z

I think we can just make all accesses to the given region via the "outer" AllocId insta-UB

Maybe, but I'd want to be very careful about inventing a new way that things can be UB. On the one hand, we can't lock the memory down completely, since I think we still want to allow retags of the Pin<&mut Future> as SRW (keeping in mind that the Future has an AliasCell). On the other hand, we need to disallow retags of that pointer as Unique since otherwise we lose some important theorems about what Unique means. This feels to me like it's going to need a SB solution anyway.

RalfJung · 2022-10-31T19:05:37Z

On the one hand, we can't lock the memory down completely, since I think we still want to allow retags of the Pin<&mut Future> as SRW (keeping in mind that the Future has an AliasCell).

We can allow reads and make them return Uninit.

On the other hand, we need to disallow retags of that pointer as Unique since otherwise we lose some important theorems about what Unique means. This feels to me like it's going to need a SB solution anyway.

The way I am imagining the aliasing model will change, a pointer will only really become 'unique' once it is written through. (This is 2phase-borrows.) Any write to this special region of memory via the wrong AllocId is UB. Ergo There cannot be an "activated" unique pointer for this region of memory.

The problem with protectors is, which reference's tag should be used for that? We better make sure that tag (or any tag derived from it) is never leaked to unknown code, because if that ever happens that code would be allowed to read and write this memory region.

JakobDegen · 2022-10-31T20:46:20Z

The problem with protectors is, which reference's tag should be used for that? We better make sure that tag (or any tag derived from it) is never leaked to unknown code, because if that ever happens that code would be allowed to read and write this memory region.

On the first poll call we retag the argument as Unique and throw out the resulting tag for precisely this purpose. We don't ever need it again later, so there's no danger of accidentally leaking it

JakobDegen · 2022-11-01T01:38:49Z

The way I am imagining the aliasing model will change, a pointer will only really become 'unique' once it is written through. (This is 2phase-borrows.) Any write to this special region of memory via the wrong AllocId is UB. Ergo There cannot be an "activated" unique pointer for this region of memory.

Also, thought about this more, I'm not so sure this suffices. Presumably we still want to activate 2PB references upon being passed as an argument. Maybe having AliasCell is enough though?

One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB (e.g. someone just writing into the middle of a struct Generator using unsafe code), but accesses to locals of the generator to that same memory are fine. After all this is "overlapping address space" so we better make sure only one of these is actually allowed to perform accesses. But if we ever want to support Unpin generators we do have to allow doing a full copy of the generator state to some other location. Some schemes were proposed on Zulip to achieve this but they got very complicated. This is where most of the discussion time on Zulip was spent IIRC so I am surprised to see the issue not even mentioned here.

The more I think about it, I think the main contribution of this proposal is that it demonstrates a way for us to never have to admit that we store locals within the generator state while handling all the shenanigans around address observability. The question of how we ban writes to the generator state then becomes secondary (details aside, SB or some other mechanism are all in principle fine).

Now that I've said this out loud, I realize I actually also have an idea for how we can support Unpin generators (this is going to be vague, so bear with me): The generator sidetable is currently "keyed" by the pointer of the future. For Unpin generators, we can instead store the key within the generator. Probably this can be an AllocId within a single pointer byte or something like that. Then we don't add the protector for those generators and we're done. I'm trying to find a nice way to unify this, but it's not hitting me just yet. Will update if it does.

RalfJung · 2022-11-01T07:38:45Z

Presumably we still want to activate 2PB references upon being passed as an argument.

I am actually not sure about that -- also see the Zulip discussion here. Not activating does allow a lot more code and only loses optimizations whose benefit is rather speculative.

Now that I've said this out loud, I realize I actually also have an idea for how we can support Unpin generators

I don't understand this proposal, so I'll wait until you fleshed it out a bit more. ;) The key challenge is preventing the user from reverting the generator back to an old state (by memcpying a previous state over the current state), and from mixing multiple state (by copying half of a previous state over the current state). Solutions have been proposed on Zulip that achieve this, but they are not pretty.

JakobDegen · 2022-11-01T07:44:23Z

Solutions have been proposed on Zulip that achieve this, but they are not pretty.

Have they? Do you have a link? (To be clear, I'm aware there were proposals, I'm not aware of any that work)

RalfJung · 2022-11-01T08:09:00Z

See the discussion starting here, in particular around here. It's just a sketch but I think it can be made to work.

JakobDegen · 2022-11-01T08:43:51Z

Rereading the conversation it is interesting to see some of the ideas I had forgotten about. That being said, I don't see how anything suggested there can be made to work.

First, there is the address equality issue for the newly created locals. I was able to avoid this here, but especially with Unpin futures that comes back to bite.

But even then, I am fairly sure that any strategy which allocates future locals normally cannot be made to work with this. Specifically, you need to justify this being a data race:

let fut = async {
    let mut x = 0;
    let p = addr_of_mut!(x);
    thread::spawn(|| loop { *p = 1; });
    yield;
};

// Skip some pins and pointer casts
fut.poll();
forget(ptr::read(addr_of!(fut)));

loop {}

You can't make the read unconditionally wrong, because - since the generator is Unpin - it would be ok if the loop were to be stopped first by some external synchronization.

I've read through that thread again and if there were any proposals for memory that behaves weirdly enough to make this work, I did not understand them

RalfJung · 2022-11-05T16:20:40Z

Oh yeah we did not consider other threads I think.

So basically, the locals that have their address assigned inside the future itself, also have to have their data race semantics applied there.

JakobDegen · 2023-02-19T23:53:44Z

So one thing to note about this proposal is that it makes the Pin-ness of async blocks a validity issue, instead of just a safety issue. Normally, violating the Pin requirements is only unsound, and can be done without UB if the the pinned type is a known type. Under the proposed semantics here, even if the future is known not to hold any references across await points, violating the pin requirements is still UB.

One option that we have for Unpin futures is to go in a direction that feels a bit similar to that. Specifically, imagine that for an Unpin generator, we always reallocate all local variables upon hitting a yield point. That would make the first poll call on this future immediately UB:

let fut = async {
    let mut x = 0;
    let p = addr_of_mut!(x);
    thread::spawn(|| loop { *p = 1; });
    yield;
};

Because x gets reallocated as soon as yield gets hit. This seems to get us out of the above problem without having to do more complicated things. It's also trivially correct as long as we are ok with making an incorrect Unpin impl on a generator lead to immediate UB, even when the generator is not actually moved

RalfJung · 2023-03-12T16:54:10Z

So one thing to note about this proposal is that it makes the Pin-ness of async blocks a validity issue, instead of just a safety issue.

I don't quite think so? We already have language UB if you make the type Unpin when you shouldn't, due to aliasing rules. So nothing fundamentally new happens here I think?

JakobDegen · 2023-03-12T23:44:50Z

I don't quite think so? We already have language UB if you make the type Unpin when you shouldn't, due to aliasing rules. So nothing fundamentally new happens here I think?

Fair, although I was operating under the assumption that that was eventually going to go away in favor of MaybeDangling

chorman0773 · 2023-05-02T15:38:45Z

As an async note before the meeting, lccc's abi already spec's one optimization with regards to external captures:

let x = 5_u8;
async {
    use_value(x);
}

would presently store the value 5 in it's value representation (it would't elide the 5, notably, but it's storing it by-move, despite it being captured by-ref).

I would personally like for at least this optimization to be permitted (storing copies instead of references when address is not material).

(As a note, this optimization is spec'd on closures, but it defines generator and future layout in terms of enums of closures presently - some work is needed to ensure local variable stability, and reuses the capture-type rules from that section)

b-naber mentioned this issue Apr 17, 2023

Kill moved locals in borrowed locals analysis rust-lang/rust#110420

Closed

JakobDegen mentioned this issue Apr 18, 2023

Meeting Proposal: Generator optimizations, specifically size optimizations rust-lang/opsem-team#3

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Operational Semantics for Generators #367

Operational Semantics for Generators #367

JakobDegen commented Oct 11, 2022

JakobDegen commented Oct 11, 2022

RalfJung commented Oct 11, 2022

JakobDegen commented Oct 11, 2022 •

edited

Loading

JakobDegen commented Oct 12, 2022

RalfJung commented Oct 31, 2022 •

edited

Loading

JakobDegen commented Oct 31, 2022

RalfJung commented Oct 31, 2022

JakobDegen commented Oct 31, 2022

RalfJung commented Oct 31, 2022

JakobDegen commented Oct 31, 2022

JakobDegen commented Nov 1, 2022 •

edited

Loading

RalfJung commented Nov 1, 2022 •

edited

Loading

JakobDegen commented Nov 1, 2022 •

edited

Loading

RalfJung commented Nov 1, 2022

JakobDegen commented Nov 1, 2022 •

edited

Loading

RalfJung commented Nov 5, 2022

JakobDegen commented Feb 19, 2023 •

edited

Loading

RalfJung commented Mar 12, 2023

JakobDegen commented Mar 12, 2023

chorman0773 commented May 2, 2023 •

edited

Loading

Operational Semantics for Generators #367

Operational Semantics for Generators #367

Comments

JakobDegen commented Oct 11, 2022

JakobDegen commented Oct 11, 2022

RalfJung commented Oct 11, 2022

JakobDegen commented Oct 11, 2022 • edited Loading

JakobDegen commented Oct 12, 2022

RalfJung commented Oct 31, 2022 • edited Loading

JakobDegen commented Oct 31, 2022

RalfJung commented Oct 31, 2022

JakobDegen commented Oct 31, 2022

RalfJung commented Oct 31, 2022

JakobDegen commented Oct 31, 2022

JakobDegen commented Nov 1, 2022 • edited Loading

RalfJung commented Nov 1, 2022 • edited Loading

JakobDegen commented Nov 1, 2022 • edited Loading

RalfJung commented Nov 1, 2022

JakobDegen commented Nov 1, 2022 • edited Loading

RalfJung commented Nov 5, 2022

JakobDegen commented Feb 19, 2023 • edited Loading

RalfJung commented Mar 12, 2023

JakobDegen commented Mar 12, 2023

chorman0773 commented May 2, 2023 • edited Loading

JakobDegen commented Oct 11, 2022 •

edited

Loading

RalfJung commented Oct 31, 2022 •

edited

Loading

JakobDegen commented Nov 1, 2022 •

edited

Loading

RalfJung commented Nov 1, 2022 •

edited

Loading

JakobDegen commented Nov 1, 2022 •

edited

Loading

JakobDegen commented Nov 1, 2022 •

edited

Loading

JakobDegen commented Feb 19, 2023 •

edited

Loading

chorman0773 commented May 2, 2023 •

edited

Loading