Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operational Semantics for Generators #367

Open
JakobDegen opened this issue Oct 11, 2022 · 20 comments
Open

Operational Semantics for Generators #367

JakobDegen opened this issue Oct 11, 2022 · 20 comments

Comments

@JakobDegen
Copy link
Contributor

We need to spec what generators are operationally. One option is to simply include the current generator lowering that rustc does in the spec. This has a number of downsides though, the biggest one being that it significantly inhibits optimizations. Specifically, for an async block like this:

async {
    let x = 5_u64;
    something.await;
    dbg!(x);
}

We would like to justify two optimizations:

  1. Const propagating the 5 into the dbg! invocation.
  2. The Future that results from this async block having size less than size_of::<u64>().

If our spec simply says that async blocks are subjected to the standard generator lowering, neither of these optimizations is correct. The &mut Future that the caller of poll has when the .await point is hit might be used to modify x, so we cannot const-prop, and x is clearly alive across an await point, and so we must allocate space for it in the future.

Specifically, we are looking to prove two theorems:

  1. An optimization can soundly treat a _ret = yield(_arg) [resume => bbA, drop => bbB] as a _ret = call UNKNOWN_FN(_arg) [return => bbA, unwind => bbB]. Here UNKNOWN_FN is some unknown function that is unique to this yield point.
  2. The generator lowering pass in MIR today is sound.

cc @tmandry and @saethlin with whom I talked about this issue at some length during rustconf

@JakobDegen
Copy link
Contributor Author

Proposal.

Although this is certainly very far from "pain free," I was surprised that this seems to be possible while requiring relatively few changes to the rest of our semantics to support. Definitely at least a little jank though.

@RalfJung
Copy link
Member

Also see the Zulip discussion here. Does your proposal summarize that discussion or is it a separate proposal?

@JakobDegen
Copy link
Contributor Author

JakobDegen commented Oct 11, 2022

To my knowledge that discussion did not lead to any ideas that did not have unsolved blocking issues. The idea above does not have any that I know about

@JakobDegen
Copy link
Contributor Author

Interesting thought I just had: This does not only open the door for an opsem for yield, but also for an opsem for .await. I don't think that has any spec uses, but it might be desirable for rustc to preserve .awaits in MIR since they are much easier to analyze than the loop they turn into.

That would in turn open the door to pre-generator lowering inlining of async blocks in cases like this:

async {
    async { foo }.await
}

Which could yield massive wins for compile-time and run-time performance.

@RalfJung
Copy link
Member

RalfJung commented Oct 31, 2022

To my knowledge that discussion did not lead to any ideas that did not have unsolved blocking issues. The idea above does not have any that I know about

One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB (e.g. someone just writing into the middle of a struct Generator using unsafe code), but accesses to locals of the generator to that same memory are fine. After all this is "overlapping address space" so we better make sure only one of these is actually allowed to perform accesses. But if we ever want to support Unpin generators we do have to allow doing a full copy of the generator state to some other location. Some schemes were proposed on Zulip to achieve this but they got very complicated. This is where most of the discussion time on Zulip was spent IIRC so I am surprised to see the issue not even mentioned here.

Also I don't quite understand how this solves the problem of storing locals in the machine-visible generator state. The document discusses this before discussing what all the state even looks like (which is confusing), and also somehow associates this with stack frames? I don't see how this is a stack frame thing, shouldn't that generator_info be part of GeneratorSuspendState (except that I expected some state to be kept around also for the running generators).

@JakobDegen
Copy link
Contributor Author

One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB

Eh, sorry, I realize I didn't end up actually writing this anywhere, but I intended that when we pop a protector off a borrow stacks, we check the frames in the generator table as well as those currently on the call stack. This will make all such writes and reads UB if the generator has not been polled to completion.

But if we ever want to support Unpin generators we do have to allow doing a full copy of the generator state to some other location.

Indeed, Unpin generators are unsupported by this and are going to be an interesting challenge.

Also I don't quite understand how this solves the problem of storing locals in the machine-visible generator state.

It does not, it just sidesteps the problem. We 1) prevent the generator state from being accessed while the generator is suspended and 2) allow the allocations for locals to have addresses that overlap that state. Storing the locals in the generator state is then a sound "optimization"

@RalfJung
Copy link
Member

Indeed, Unpin generators are unsupported by this and are going to be an interesting challenge.

If we don't make generators movable, I think we can just make all accesses to the given region via the "outer" AllocId insta-UB -- no Stacked Borrows shenanigans required. The actual Stack Frame will use a different Alloc Id (one per local) and only those are allowed to access this memory range.

@JakobDegen
Copy link
Contributor Author

I think we can just make all accesses to the given region via the "outer" AllocId insta-UB

Maybe, but I'd want to be very careful about inventing a new way that things can be UB. On the one hand, we can't lock the memory down completely, since I think we still want to allow retags of the Pin<&mut Future> as SRW (keeping in mind that the Future has an AliasCell). On the other hand, we need to disallow retags of that pointer as Unique since otherwise we lose some important theorems about what Unique means. This feels to me like it's going to need a SB solution anyway.

@RalfJung
Copy link
Member

On the one hand, we can't lock the memory down completely, since I think we still want to allow retags of the Pin<&mut Future> as SRW (keeping in mind that the Future has an AliasCell).

We can allow reads and make them return Uninit.

On the other hand, we need to disallow retags of that pointer as Unique since otherwise we lose some important theorems about what Unique means. This feels to me like it's going to need a SB solution anyway.

The way I am imagining the aliasing model will change, a pointer will only really become 'unique' once it is written through. (This is 2phase-borrows.) Any write to this special region of memory via the wrong AllocId is UB. Ergo There cannot be an "activated" unique pointer for this region of memory.

The problem with protectors is, which reference's tag should be used for that? We better make sure that tag (or any tag derived from it) is never leaked to unknown code, because if that ever happens that code would be allowed to read and write this memory region.

@JakobDegen
Copy link
Contributor Author

The problem with protectors is, which reference's tag should be used for that? We better make sure that tag (or any tag derived from it) is never leaked to unknown code, because if that ever happens that code would be allowed to read and write this memory region.

On the first poll call we retag the argument as Unique and throw out the resulting tag for precisely this purpose. We don't ever need it again later, so there's no danger of accidentally leaking it

@JakobDegen
Copy link
Contributor Author

JakobDegen commented Nov 1, 2022

The way I am imagining the aliasing model will change, a pointer will only really become 'unique' once it is written through. (This is 2phase-borrows.) Any write to this special region of memory via the wrong AllocId is UB. Ergo There cannot be an "activated" unique pointer for this region of memory.

Also, thought about this more, I'm not so sure this suffices. Presumably we still want to activate 2PB references upon being passed as an argument. Maybe having AliasCell is enough though?

One issue this does not seem to resolve is how to ensure that other, non-machine-native accesses to the memory where a generator might store its state are UB (e.g. someone just writing into the middle of a struct Generator using unsafe code), but accesses to locals of the generator to that same memory are fine. After all this is "overlapping address space" so we better make sure only one of these is actually allowed to perform accesses. But if we ever want to support Unpin generators we do have to allow doing a full copy of the generator state to some other location. Some schemes were proposed on Zulip to achieve this but they got very complicated. This is where most of the discussion time on Zulip was spent IIRC so I am surprised to see the issue not even mentioned here.

The more I think about it, I think the main contribution of this proposal is that it demonstrates a way for us to never have to admit that we store locals within the generator state while handling all the shenanigans around address observability. The question of how we ban writes to the generator state then becomes secondary (details aside, SB or some other mechanism are all in principle fine).

Now that I've said this out loud, I realize I actually also have an idea for how we can support Unpin generators (this is going to be vague, so bear with me): The generator sidetable is currently "keyed" by the pointer of the future. For Unpin generators, we can instead store the key within the generator. Probably this can be an AllocId within a single pointer byte or something like that. Then we don't add the protector for those generators and we're done. I'm trying to find a nice way to unify this, but it's not hitting me just yet. Will update if it does.

@RalfJung
Copy link
Member

RalfJung commented Nov 1, 2022

Presumably we still want to activate 2PB references upon being passed as an argument.

I am actually not sure about that -- also see the Zulip discussion here. Not activating does allow a lot more code and only loses optimizations whose benefit is rather speculative.

Now that I've said this out loud, I realize I actually also have an idea for how we can support Unpin generators

I don't understand this proposal, so I'll wait until you fleshed it out a bit more. ;) The key challenge is preventing the user from reverting the generator back to an old state (by memcpying a previous state over the current state), and from mixing multiple state (by copying half of a previous state over the current state). Solutions have been proposed on Zulip that achieve this, but they are not pretty.

@JakobDegen
Copy link
Contributor Author

JakobDegen commented Nov 1, 2022

Solutions have been proposed on Zulip that achieve this, but they are not pretty.

Have they? Do you have a link? (To be clear, I'm aware there were proposals, I'm not aware of any that work)

@RalfJung
Copy link
Member

RalfJung commented Nov 1, 2022

See the discussion starting here, in particular around here. It's just a sketch but I think it can be made to work.

@JakobDegen
Copy link
Contributor Author

JakobDegen commented Nov 1, 2022

Rereading the conversation it is interesting to see some of the ideas I had forgotten about. That being said, I don't see how anything suggested there can be made to work.

First, there is the address equality issue for the newly created locals. I was able to avoid this here, but especially with Unpin futures that comes back to bite.

But even then, I am fairly sure that any strategy which allocates future locals normally cannot be made to work with this. Specifically, you need to justify this being a data race:

let fut = async {
    let mut x = 0;
    let p = addr_of_mut!(x);
    thread::spawn(|| loop { *p = 1; });
    yield;
};

// Skip some pins and pointer casts
fut.poll();
forget(ptr::read(addr_of!(fut)));

loop {}

You can't make the read unconditionally wrong, because - since the generator is Unpin - it would be ok if the loop were to be stopped first by some external synchronization.

I've read through that thread again and if there were any proposals for memory that behaves weirdly enough to make this work, I did not understand them

@RalfJung
Copy link
Member

RalfJung commented Nov 5, 2022

Oh yeah we did not consider other threads I think.

So basically, the locals that have their address assigned inside the future itself, also have to have their data race semantics applied there.

@JakobDegen
Copy link
Contributor Author

JakobDegen commented Feb 19, 2023

So one thing to note about this proposal is that it makes the Pin-ness of async blocks a validity issue, instead of just a safety issue. Normally, violating the Pin requirements is only unsound, and can be done without UB if the the pinned type is a known type. Under the proposed semantics here, even if the future is known not to hold any references across await points, violating the pin requirements is still UB.

One option that we have for Unpin futures is to go in a direction that feels a bit similar to that. Specifically, imagine that for an Unpin generator, we always reallocate all local variables upon hitting a yield point. That would make the first poll call on this future immediately UB:

let fut = async {
    let mut x = 0;
    let p = addr_of_mut!(x);
    thread::spawn(|| loop { *p = 1; });
    yield;
};

Because x gets reallocated as soon as yield gets hit. This seems to get us out of the above problem without having to do more complicated things. It's also trivially correct as long as we are ok with making an incorrect Unpin impl on a generator lead to immediate UB, even when the generator is not actually moved

@RalfJung
Copy link
Member

So one thing to note about this proposal is that it makes the Pin-ness of async blocks a validity issue, instead of just a safety issue.

I don't quite think so? We already have language UB if you make the type Unpin when you shouldn't, due to aliasing rules. So nothing fundamentally new happens here I think?

@JakobDegen
Copy link
Contributor Author

I don't quite think so? We already have language UB if you make the type Unpin when you shouldn't, due to aliasing rules. So nothing fundamentally new happens here I think?

Fair, although I was operating under the assumption that that was eventually going to go away in favor of MaybeDangling

@chorman0773
Copy link
Contributor

chorman0773 commented May 2, 2023

As an async note before the meeting, lccc's abi already spec's one optimization with regards to external captures:

let x = 5_u8;
async {
    use_value(x);
}

would presently store the value 5 in it's value representation (it would't elide the 5, notably, but it's storing it by-move, despite it being captured by-ref).

I would personally like for at least this optimization to be permitted (storing copies instead of references when address is not material).

(As a note, this optimization is spec'd on closures, but it defines generator and future layout in terms of enums of closures presently - some work is needed to ensure local variable stability, and reuses the capture-type rules from that section)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants