Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

every match arm reads the match's borrowed input #50783

Merged

Conversation

pnkfelix
Copy link
Member

@pnkfelix pnkfelix commented May 15, 2018

This PR changes the match codegen under NLL (and just NLL, at least for now) to make the following adjustments:

  • It adds a -Z disable-ast-check-for-mutation-in-guard which, as described, turns off the naive (conservative but also not 100% sound) check for mutation in guards of match arms.
  • We now borrow the match input at the outset and emit a special ReadForMatch statement (that, according to the static semantics, reads that borrowed match input) at the start of each match arm. The intent here is to catch cases where the match guard mutates the match input, either via an independent borrow or via ref mut borrows in that arm's pattern.
  • In order to ensure that ref mut borrows do not actually conflict with the emitted ReadForMatch statements, I expanded the two-phase-borrow system slightly, and also changed the MIR code gen so that under NLL, when there is a guard on a match arm, then each pattern variable ends up having three temporaries associated with it:
    1. The first temporary will hold the substructure being matched; this is what we will move the (substructural) value into if the guard succeeds.
    2. The second temporary also corresponds to the same value as the first, but we are just constructing this temporarily for use during the scope of the guard; it is unaliased and its sole referrer is the third temporary.
    3. The third temporary is a reference to the second temporary.

Fix #27282

This also provides a path towards resolving #24535 aka rust-lang/rfcs#1006, at least once the -Z disable-ast-check-for-mutation-in-guard is just turned on by default (under NLL, that is. It is not sound under AST-borrowck).

  • But I did not want to make #![feature(nll)] imply that as part of this PR; that seemed like too drastic a change to me.

@rust-highfive
Copy link
Collaborator

r? @estebank

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 15, 2018
@pnkfelix pnkfelix changed the title match borrows its input match borrows its input, and every arm reads that borrowed input May 15, 2018
@pnkfelix pnkfelix changed the title match borrows its input, and every arm reads that borrowed input every match arm reads the match's borrowed input May 15, 2018
@pnkfelix
Copy link
Member Author

(To be clear: hypothetically, it should be fine to make the changes to the MIR codegen happen even without #![feature(nll)]. And maybe that would be a better way to make sure this code gets exercised. But I also wanted to restrict the risk associated with this change to just NLL.)

Copy link
Contributor

@estebank estebank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks correct from a quick skim. As this is gated behind a nightly flag, I'm leaning towards merge as-is, although I need to read through it again before approving.

CC @nikomatsakis and @arielb1 for second and third pair of eyes that were involved in the original discussion.

--> $DIR/issue-27282-move-ref-mut-into-guard.rs:24:18
|
LL | if { (|| { let bar = foo; bar.take() })(); false } => {},
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot move out of borrowed content
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't dug too much on the desugaring code, but it seems to me that there could be a way to use foo's span for this error, which would be more appropriate for this and the following error.

| || -
| ||_____|
| |______borrow of `b` occurs here
| borrow later used here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, dig for other cases of diagnostics emitting the same span and rework the emitter to just use one span with multiple labels

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the AST based errors avoid emitting the two labels if they end up in the same span, NLL should do the same.

--> $DIR/issue-41962.rs:17:9
|
LL | if let Some(thing) = maybe {
| ^ ----- value moved here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned below for match errors, this span is suboptimal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, improve the multiline spans when multiple spans are involved

@nikomatsakis nikomatsakis self-assigned this May 16, 2018
@pnkfelix
Copy link
Member Author

I agree the diagnostics are suboptimal

My preference: land these changes, since they should only affect the diagnostics for NLL, and file a follow-up bug to address them.

@bors
Copy link
Contributor

bors commented May 17, 2018

☔ The latest upstream changes (presumably #50615) made this pull request unmergeable. Please resolve the merge conflicts.

Reservation(WriteKind::MutableBorrow(BorrowKind::Mut { .. }))
| Write(WriteKind::MutableBorrow(BorrowKind::Mut { .. })) => {
Reservation(WriteKind::MutableBorrow(borrow_kind @ BorrowKind::Unique))
| Reservation(WriteKind::MutableBorrow(borrow_kind @ BorrowKind::Mut { .. }))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't match the style of the surrounding code (| aligned to the above line).

@estebank
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented May 18, 2018

📌 Commit ea9afcb has been approved by estebank

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 18, 2018
// FIXME: is there some dynamic semantics we should attach to
// these? Or am I correct in thinking that the inerpreter
// is solely intended for borrowck'ed code?
ReadForMatch(..) => {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, no dynamic semantics.

@@ -2559,6 +2567,7 @@ enum ContextKind {
CallDest,
Assert,
Yield,
ReadForMatch,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may want to optimize this out when we do other lowerings...maybe as part of the transform/erase_regions.rs file (it only exists for regions, anyway?) probably there's a better spot

@@ -55,16 +66,34 @@ impl<'a, 'gcx, 'tcx> Builder<'a, 'gcx, 'tcx> {
// HACK(eddyb) Work around the above issue by adding a dummy inspection
// of `discriminant_place`, specifically by applying `Rvalue::Discriminant`
// (which will work regardless of type) and storing the result in a temp.
//
// FIXME: would just the borrow into `borrowed_input_temp`
// also achieve the desired effect here? TBD.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this still TBD?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the last time I tried removing eddyb's code (labelled "HACK"), it was with a version of my own code that was totally broken, and so I ended up drawing an unjustified conclusion about needing to keep eddyb's code.

So, I would like to revisit the question. My plan is this: I'll try removing eddyb's hack. If I discover that doing so breaks something, then I'll put it back and revise (or remove) this FIXME comment.

If I discover that removing eddyb's code does not cause any tests to break ... then I might revise the PR accordingly and request a follow-up review of the result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind: I cannot replace eddyb's hack with the borrowed input temp. @eddyb is just injecting a dummy temp that is a read of the discriminant. I'm trying to inject a borrow of the input, but I'm relying on NLL to deal with the fact that I plug in re_empty for the region.

So if we were to remove eddyb's hack, it would need to wait until after we turn NLL on.

/// Injects a borrow of `place`. The region is unknown at this point; we rely on NLL
/// inference to find an appropriate one. Therefore you can only call this when NLL
/// is turned on.
fn inject_borrow<'a, 'gcx, 'tcx>(tcx: ty::TyCtxt<'a, 'gcx, 'tcx>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Total nit: I would call this injected_borrow, since it just returns the borrow, it doesn't actually insert it into any data structure

// the variant for an enum while you are in
// the midst of matching on it.
//
// FIXME: I originally had put this at the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a FIXME? It is an interesting note though =)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fixme was for the follow-on bullet about fine tuning where the ReadForMatch belongs.

Luckily, @arielb1 seems to have already done the hard work of evaluating some seemingly simpler alternatives and then uncovering reasons why they won't work.

I think I'll just remove the FIXME at this point.

| Write(WriteKind::MutableBorrow(borrow_kind @ BorrowKind::Mut { .. })) =>
{
let is_local_mutation_allowed = match borrow_kind {
BorrowKind::Unique => LocalMutationIsAllowed::Yes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this flag blows my mind every time. I don't quite get what Yes makes sense here, in any case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find a better name for it. No means that this access is a "direct mutation", and therefore is illegal to do for a non-mut local, while Yes means that this is an "indirect mutation", and therefore is legal to do to a non-mut` local.

Ah - maybe that is the way to name it: MutationDirectness::Direct/MutationDirectness::Indirect?

@nikomatsakis
Copy link
Contributor

Urgh, I had those comments hanging around, apparently.

@estebank
Copy link
Contributor

@bors r-, to address @nikomatsakis' comments and the merge conflict

@@ -53,14 +53,25 @@ impl<'tcx> Index<BorrowIndex> for BorrowSet<'tcx> {
}
}

/// Every two-phase borrow has *exactly one* use (or else it is not a
/// proper two-phase borrow under our current definition. However, not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing end parentheses ( proper two-phase borrow under our current definition).)

// the midst of matching on it.
//
// FIXME: I originally had put this at the
// start of each elem of arm_blocks (see
Copy link
Contributor

@arielb1 arielb1 May 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The theoretically "right" place to put this is "after all arms", just before the unreachable of the FalseEdges, where we witness the exhaustiveness of the match. This is incorrect, we need the discriminant intact in more places

Note that this (as well as your code, and soundly) allows for mutating the discriminant in a guard that later diverges:

let mut foo = Some(4);
match foo {
    _ if { if maybe {
        // modify the guard, then *diverge*
        foo = None;
        panic!()
    } else { false } } => {},
    _ => {}
}
  1. The unreachable at the end of the false-edges chain. That's the "canonical" unreachable for borrowck purposes, but is not accessible to LLVM:
    https://github.com/pnkfelix/rust/blob/ea9afcb6a7c49752271652d40ca7d365057c82fc/src/librustc_mir/build/matches/mod.rs#L195
  2. The unreachable at the end of the real edges chain. That's the "canonical" unreachable for trans purposes:
    https://github.com/pnkfelix/rust/blob/ea9afcb6a7c49752271652d40ca7d365057c82fc/src/librustc_mir/build/matches/mod.rs#L215
  3. An unreachable at the end of the true edge of each guard that is in unreachable code
    https://github.com/pnkfelix/rust/blob/ea9afcb6a7c49752271652d40ca7d365057c82fc/src/librustc_mir/build/matches/mod.rs#L585

The unureachable at (3) can only be reached through a false edge, and I think we can replace it by going to the same place that the false edge goes - i.e. to candidate.next_candidate_pre_binding_block, and therefore eventualy to a (2) unreachable - to avoid adding "unnecessary" unreachable terminators.

At every point in "user code" the (2) or (3) unreachable can be reached, following the false edges will reach the (1) unreachable, so we can add the test there.

Copy link
Contributor

@arielb1 arielb1 May 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, we still need the check to make sure you are not "cutting the branch you are standing on" and then diverging:

This would be a good test:

#![feature(nll)]

struct ForceFnOnce;

fn main() {
    let mut x = &mut Some(&2);
    let force_fn_once = ForceFnOnce;
    match x {
        &mut Some(&_) if {
            // ForceFnOnce needed to exploit #27282
            (|| { *x = None; drop(force_fn_once); })();
            false
        } => {}
        &mut Some(&a) if {
            println!("{}", a);
            panic!()
        } => {}
        &mut _ => {}
    }
}

Copy link
Contributor

@arielb1 arielb1 May 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, if you still think you can put the check just before the guard:

#![feature(nll)]

struct ForceFnOnce;

fn main() {
    let mut x = &mut Some(&2);
    let force_fn_once = ForceFnOnce;
    match x {
        &mut Some(&_) if {
            // ForceFnOnce needed to exploit #27282
            (|| { *x = None; drop(force_fn_once); })();
            false
        } => {}
        &mut Some(&2) /* this segfaults if we corrupted the discriminant */ if {
            panic!()
        } => {}
        &mut _ => {}
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arielb1 I'm having a little trouble parsing your comment at the top here.

I think that when you wrote:

Note that this (as well as your code, and soundly) allows for mutating the discriminant in a guard that later diverges

the "this" in that sentence is referencing your original (now struckout) suggestion of

The theoretically "right" place to put this is "after all arms", just before the unreachable of the FalseEdges

(and then the rest of that text, including the enumerated list with three different unreachable's, was you working out a justification for the struckout suggestion, right?)

Is the above understanding correct?


And so the summary conclusion is ... we certainly cannot rely on just injecting the ReadForMatch one time at the end of the whole match (at one of the unreachable points), because we need to retain integrity of the data structure in the face of arms that diverge.

The examples in your most recent comments: Those are meant to be concrete tests showing that the currently implemented strategy (of injecting a ReadForMatch at the start of each arm) is *necessary.*, right? I've been trying to figure out if either of them are illustrating a flaw/oversight in the current strategy.

@pnkfelix
Copy link
Member Author

Hmm okay I'm going to need to incorporate @arielb1 's feedback into this PR.

@estebank
Copy link
Contributor

r? @arielb1

@pnkfelix pnkfelix added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 23, 2018
@pnkfelix pnkfelix force-pushed the issue-27282-match-borrows-its-input-take-three branch from ea9afcb to e0236d9 Compare May 24, 2018 11:14
@pnkfelix pnkfelix added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 24, 2018
pnkfelix added 13 commits May 29, 2018 23:01
(This is just the data structure changes and some boilerplate match
code that followed from it; the actual emission of these statements
comes in a follow-up commit.)
Also, turn on ReadForMatch emission by default (when using NLL).
…to my code.

In particular, I am adding an implicit injected borrow on the pattern
matches, and when we go around the loop, the compiler is reporting
that this injected borrow is conflicting with the move of the original
value when the match succeeds.
For some reason, allowing restricted mutation in match arms exposed an
obvious case where a unique borrow can indeed fail, namely something
like:

```rust
match b {
    ...
    ref mut r if { (|| { let bar = &mut *r; **bar = false; })(); false } => { &mut *r }
    //                             ~~~~~~~
    //                                |
    // This ends up holding a `&unique` borrow of `r`, but there ends up being an
    // implicit shared borrow in the guard thanks to rust-lang#49870
    ...
}
```
```rust
fn main() {
    fn reuse<X>(_: &mut X) {}
    let mut t = 2f64;
    match t {
        ref mut _b if { false } => { reuse(_b); }
        _ => {}
    }
}
```

Note: The way this is currently written is confusing; when `autoref`
is off, then the arm body bindings (introduced by
`bind_matched_candidate_for_arm_body`) are *also* used for the guard.
(Any attempt to fix this needs to still ensure that the bindings used
by the guard are introduced before the guard is evaluated.)

(Once we turn NLL on by default, we can presumably simplify all of
that.)
…post-NLL future.

Driveby: just inline the two-line `fn inject_borrow` into its one call
site and remove its definition.
@pnkfelix pnkfelix force-pushed the issue-27282-match-borrows-its-input-take-three branch from d3a931a to 9d5cdc9 Compare May 29, 2018 23:51
@pnkfelix
Copy link
Member Author

@bors r=nikomatsakis

@bors
Copy link
Contributor

bors commented May 29, 2018

📌 Commit 9d5cdc9 has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 29, 2018
@bors
Copy link
Contributor

bors commented May 30, 2018

⌛ Testing commit 9d5cdc9 with merge 8372e7b...

bors added a commit that referenced this pull request May 30, 2018
…ake-three, r=nikomatsakis

every match arm reads the match's borrowed input

This PR changes the `match` codegen under NLL (and just NLL, at least for now) to make the following adjustments:
 * It adds a `-Z disable-ast-check-for-mutation-in-guard` which, as described, turns off the naive (conservative but also not 100% sound) check for mutation in guards of match arms.
 * We now borrow the match input at the outset and emit a special `ReadForMatch` statement (that, according to the *static* semantics, reads that borrowed match input) at the start of each match arm. The intent here is to catch cases where the match guard mutates the match input, either via an independent borrow or via `ref mut` borrows in that arm's pattern.
 * In order to ensure that `ref mut` borrows do not actually conflict with the emitted `ReadForMatch` statements, I expanded the two-phase-borrow system slightly, and also changed the MIR code gen so that under NLL, when there is a guard on a match arm, then each pattern variable ends up having *three* temporaries associated with it:
   1. The first temporary will hold the substructure being matched; this is what we will move the (substructural) value into *if* the guard succeeds.
   2. The second temporary also corresponds to the same value as the first, but we are just constructing this temporarily for use during the scope of the guard; it is unaliased and its sole referrer is the third temporary.
   3. The third temporary is a reference to the second temporary.
   * (This sounds complicated, I know, but its actually *simpler* than what I was doing before and had checked into the repo, which was to sometimes construct the final value and then take a reference to it before evaluating the guard. See also PR #49870.)

Fix #27282

This also provides a path towards resolving #24535 aka rust-lang/rfcs#1006, at least once the `-Z disable-ast-check-for-mutation-in-guard` is just turned on by default (under NLL, that is. It is not sound under AST-borrowck).
 * But I did not want to make `#![feature(nll)]` imply that as part of this PR; that seemed like too drastic a change to me.
@bors
Copy link
Contributor

bors commented May 30, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 8372e7b to master...

@bors bors merged commit 9d5cdc9 into rust-lang:master May 30, 2018
kennytm added a commit to kennytm/rust that referenced this pull request Jun 3, 2018
Rebase of rust-lang#50783 has accidentally revived the flag (which should be
renamed to `-Z codegen-time-graph` by rust-lang#50615).
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 4, 2018
Remove the unused `-Z trans-time-graph` flag.

Rebase of rust-lang#50783 has accidentally revived the flag (which should be renamed to `-Z codegen-time-graph` by rust-lang#50615).
bors added a commit that referenced this pull request Jul 27, 2018
…ate-in-arm, r=nikomatsakis

[NLL] make temp for each candidate in `match` arm

In NLL, `ref mut` patterns leverage the two-phase borrow infrastructure to allow the shared borrows within a guard before the "activation" of the mutable borrow when we begin execution of the match arm's body. (There is further discussion of this on PR #50783.)

To accommodate the restrictions we impose on two-phase borrows (namely that there is a one-to-one mapping between each activation and the original initialization), this PR is making separate temps for each candidate pattern. So in an arm like this:
```rust
PatA(_, ref mut ident) |
PatB(ref mut ident) |
PatC(_, _, ref mut ident) |
PatD(ref mut ident) if guard_stuff(ident) => ...
```

instead of 3 temps (two for the guard and one for the arm body), we now have 4 + 2 temps associated with `ident`: one for each candidate plus the actual temp that the guard uses directly, and then the sixth is the temp used in the arm body.

Fix #51348
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants