Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nll] _ patterns should not count as borrows #53114

Closed
nikomatsakis opened this issue Aug 6, 2018 · 41 comments
Closed

[nll] _ patterns should not count as borrows #53114

nikomatsakis opened this issue Aug 6, 2018 · 41 comments
Assignees
Labels
A-NLL Area: Non-lexical lifetimes (NLL) NLL-complete Working towards the "valid code works" goal P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-nll Working group: Non-lexical lifetimes

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Aug 6, 2018

Historically, we have considered let _ = foo to be a no-op. That is, it does not read or "access" foo in any way. This is why the following code compiles normally. However, it does NOT compile with NLL, because we have an "artificial read" of the matched value (or so it seems):

#![feature(nll)]
#![allow(unused_variables)]

struct Vi<'a> {
    ed: Editor<'a>,
}

struct Editor<'a> {
    ctx: &'a mut Context,
}

struct Context {
    data: bool,
}

impl Context {
    fn read_line(&mut self) {
        let ed = Editor { ctx: self };
        
        match self.data {
            _ => {
                let vi = Vi { ed };
            }
        }
    }
}

fn main() {
}

Found in liner-0.4.4.

I believe that we added this artificial read in order to fix #47412, which had to do with enum reads and so forth. It seems like that fix was a bit too strong (cc @eddyb).

UPDATE: Current status as of 2018-10-02

Thing AST MIR Want Example Issue
let _ = <unsafe-field> 💚 💚 playground #54003
match <unsafe_field> { _ => () } playground #54003
let _ = <moved> 💚 💚 💚 playground 💯
match <moved> { _ => () } 💚 playground
let _ = <borrowed> 💚 💚 💚 playground 💯
match <borrowed> { _ => () } 💚 💚 💚 playground 💯
@nikomatsakis nikomatsakis added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-compiler-nll NLL-complete Working towards the "valid code works" goal labels Aug 6, 2018
@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Aug 6, 2018

OK, the code is sort of inconsistent right now with respect to what let _ = foo permits. Here is my analysis of the current state:

Thing AST MIR Example
let _ = <unsafe-field> 💚 💚 playground
match <unsafe_field> { _ => () } playground
let _ = <moved> 💚 💚 playground
match <moved> { _ => () } playground
let _ = <borrowed> 💚 💚 playground
match <borrowed> { _ => () } 💚 playground

The explanations for what is going on here in the code are:

  • let _ = ... processes the pattern in a simplified, irrefutable mode during MIR processing. This mode does not insert the dummy accesses that the match mode does (because there is no need, generally speaking). This means that let _ = some.path is always a no-op and just doesn't appear in the MIR at all.
  • match <place> { _ => () } in contrast uses the "match mode" which can handle any number of arms.
  • Unsafe code checking happens on MIR, which I now think is mildly wrong (HAIR seems correct to me)

I'm not 100% sure why the old borrow check acepts match <borrowed> { _ => () }; the EUV does seem to issue a "match discriminant" borrow for match <..>. I'd have to dig in more.

@nikomatsakis
Copy link
Contributor Author

Given the chart above, it seems like we could call the current behavior of NLL a kind of "bug fix" when it comes to match, though I personally find the inconsistency with let _ = <borrowed> disquieting.

@nikomatsakis
Copy link
Contributor Author

cc @rust-lang/lang -- I'd like opinions on what behavior we think should happen around _ patterns. In general, my mental model has been that _ patterns are "no-ops". They do not "access or touch" the value that they are matching against. And that is certainly true some of the time. For example, we accept this code:

fn main() {
    let foo = (Box::new(22), Box::new(44));
    match foo { (_, y) => () }
    drop(foo.0); // ok, foo.0 was not moved
}

However, we seem to require that you can only match on things are that are "fully present" and not (e.g.) partially moved. Hence it is an error to swap the drop and match in that example.

This requirement that the match discriminant be valid is actually consistent with NLL's view on matching: NLL views the match desugaring as first borrowing the value that we are going to match upon (with a shared borrow). This borrow persists until an arm is chosen. This prevents match guards from doing weird stuff like mutating the value we are matching on. Requiring the match discriminant to be valid is also consistent with some of the discussions we've had about how to think about exhaustiveness and matches with zero arms like match x { }.

That said I also think that let _ = ... and match <block> { _ => () } should be equivalent. So perhaps we should "fix" the behavior of let_ = ...? Or do we special-case match expressions with a single arm to say that they do not have a "tracking borrow" (in which case we would accept the examples above).

I would also -- personally -- potentially draw a distinction between the "unsafe check" and the checks for what data is moved. I don't think of the unsafe check as a "flow sensitive" check but rather something very simple -- if you access the field, even in dead code, you must be in an unsafe block. Hence I am not too thrilled that let _ = <unsafe fields> would type-check, and I consider that a regression from the MIR-based borrow check. In fact, this is basically why I don't think that MIR is a good fit for unsafe checking (HAIR would be better).

@scottmcm
Copy link
Member

scottmcm commented Aug 6, 2018

While I've learned that _ is a complete no-op, I've always found it weird that

let _ = mutex.lock().unwrap();

and

let _guard = mutex.lock().unwrap();

do fundamentally different things.

I expected _ to behave as if the compiler gave it a name, like how argument-position '_ works.

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Aug 8, 2018

@scottmcm understood — but I think too late to change that. =)

Update: (It would also mean that there is no way to extract some fields of a struct without moving the rest... unless we had some other pattern for "don't touch")

@nikomatsakis
Copy link
Contributor Author

This is what I personally think the table should look like:

Thing AST MIR Example
let _ = <unsafe-field> playground
match <unsafe_field> { _ => () } playground
let _ = <moved> 💚 💚 playground
match <moved> { _ => () } 💚 💚 playground
let _ = <borrowed> 💚 💚 playground
match <borrowed> { _ => () } 💚 💚 playground

My reasoning is this:

  • I think "unsafe checks" are not "flow-sensitive" checks -- they are just based on what you do and where, not whether the code is reachable etc.
  • On the other hand, initialization and borrow checkers are flow-sensitive, and hence let _ = is known not to touch the thing that is being matched against (same with match with one _ pattern).
    • This is concordant with the "most likely" way to understand exhaustiveness, I think, which is that match foo { _ => () } basically is a non-access and hence cannot be UB.

I think the way I would implement this in MIR desugaring is:

  • Move unsafe checking to operate on HAIR, I guess, or else add some kind of "unsafe_check()" instruction that hangs around just so that the unsafe checker can see it.
  • If you are lowering a match with one arm, then use the "irrefutable" logic for handling it (since the pattern must be irrefutable) -- this will, in turn, mean that match foo { _ => () } doesn't add anything extra
    • In particular, it doesn't add the implicit borrows or other accesses that would otherwise result

This is not entirely "backwards compatible", but accepting let _ = <unsafe field> is a regression in any case, one which was incompletely fixed. In general the move to doing unsafe checking on MIR led to a number of complications around dead code, which is why I would prefer to move away from that.

@eddyb
Copy link
Member

eddyb commented Aug 18, 2018

@nikomatsakis so I think the packed field access checking needs to be on MIR (to have access to the final place operations), but raw pointer dereferences and union field accesses could trivially be on HAIR (and we should maybe also move some linting from HIR to HAIR).

@nikomatsakis nikomatsakis added this to the Rust 2018 RC milestone Aug 21, 2018
@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Aug 21, 2018

so I think the packed field access checking needs to be on MIR (to have access to the final place operations)

Hmm, that's surprising to me. Maybe I'm forgetting which bits of desugaring are done on HAIR vs MIR -- I would think that the field accesses would be quite easy to spot. @eddyb can you give me an example of where MIR would be better?

@eddyb
Copy link
Member

eddyb commented Aug 22, 2018

@nikomatsakis Patterns are probably the biggest thing that still exists in HAIR but not MIR. But if we have types in every node then it's probably plausible to handle it.

@nikomatsakis
Copy link
Contributor Author

@eddyb ah, I see, you mean that we'd have to check in two places basically? (e.g., struct patterns)

Seems true, but yes since the patterns themselves are fully explicit and have types also like it's not that hard to handle.

@nikomatsakis nikomatsakis added A-NLL Area: Non-lexical lifetimes (NLL) and removed WG-compiler-nll labels Aug 27, 2018
@pnkfelix
Copy link
Member

Or do we special-case match expressions with a single arm to say that they do not have a "tracking borrow" (in which case we would accept the examples above).

oh my, what a glorious hack... I'm halfway tempted to give it a whirl...

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Sep 5, 2018

@pnkfelix

oh my, what a glorious hack... I'm halfway tempted to give it a whirl...

Another option: special case matches with a single arm to use the "irrefutable code path":

pub fn place_into_pattern(&mut self,
mut block: BasicBlock,
irrefutable_pat: Pattern<'tcx>,
initializer: &Place<'tcx>,
set_match_place: bool)
-> BlockAnd<()> {

At least I think that's the irrefutable code path...

@arielb1
Copy link
Contributor

arielb1 commented Sep 6, 2018

I would also -- personally -- potentially draw a distinction between the "unsafe check" and the checks for what data is moved. I don't think of the unsafe check as a "flow sensitive" check but rather something very simple -- if you access the field, even in dead code, you must be in an unsafe block. Hence I am not too thrilled that let _ = would type-check, and I consider that a regression from the MIR-based borrow check. In fact, this is basically why I don't think that MIR is a good fit for unsafe checking (HAIR would be better).

The reason there's an unsafety check in MIR is because of the packed-deref check, because I didn't want to track the packed-status of lvalues (e.g., in patterns) in HAIR.

EDIT: saw @eddyb said that already.

Another option: special case matches with a single arm to use the "irrefutable code path":

That (having whether the "irrefutable code path" is used be determined by the number of arms, rather than by let vs. match) looks like a not that terrible idea.

@nikomatsakis
Copy link
Contributor Author

I think I'll open a separate issue to discuss the unsafefy checker, it seems orthogonal.

@nikomatsakis
Copy link
Contributor Author

I'm assigning to @matthewjasper since I think that handling this correctly will fall out from the work they've been doing to improve how we formulate the "borrow match values". The TL;DR of that work is that find the set of "place paths" that the match examines:

  • This includes constants but also discriminants that get switched

For each place path, we will ensure that the place path does not change. This is something of a new concept: in particular, if you have e.g. a place path like (*x).y where x: &u32, we still need to prevent you from re-assigning x, since we will "re-evaluate" x as we try out other match arms. The mechanism we had discussed for doing this is introducing the idea of a "shallow borrow" (needs a better name). A shallow borrow borrows and restricts a particular Place but not extensions of it: so if you shallow borrow e.g. x, then you forbid mutations of x, but not x.0. To freeze a place path P, we shallow borrow each prefix of P.

bors added a commit that referenced this issue Sep 24, 2018
[NLL] Be more permissive when checking access due to Match

Partially addresses #53114. notably, we should now have parity with AST borrowck. Matching on uninitialized values is still forbidden.

* ~~Give fake borrows for match their own `BorrowKind`~~
* ~~Allow borrows with this kind to happen on values that are already mutably borrowed.~~
* ~~Track borrows with this type even behind shared reference dereferences and consider all accesses to be deep when checking for conflicts with this borrow type. See [src/test/ui/issues/issue-27282-mutate-before-diverging-arm-3.rs](cb5c989#diff-a2126cd3263a1f5342e2ecd5e699fbc6) for an example soundness issue this fixes (a case of #27282 that wasn't handled correctly).~~
* Create a new `BorrowKind`: `Shallow` (name can be bike-shed)
* `Shallow` borrows differ from shared borrows in that
  * When we check for access we treat them as a `Shallow(Some(_))` read
  * When we check for conflicts with them, if the borrow place is a strict prefix of the access place then we don't consider that a conflict.
    * For example, a `Shallow` borrow of `x` does not conflict with any access or borrow of `x.0` or `*x`
* Remove the current fake borrow in matches.
* When building matches, we take a `Shallow` borrow of any `Place` that we switch on or bind in a match, and any prefix of those places. (There are some optimizations where we do fewer borrows, but this shouldn't change semantics)
  * `match x { &Some(1) => (),  _ => (), }` would `Shallow` borrow `x`, `*x` and `(*x as Some).0` (the `*x` borrow is unnecessary, but I'm not sure how easy it would be to remove.)
* Replace the fake discriminant read with a `ReadForMatch`.
* Change ReadForMatch to only check for initializedness (to prevent `let x: !; match x {}`), but not conflicting borrows. It is still considered a use for liveness and `unsafe` checking.
* Give special cased error messages for this kind of borrow.

Table from the above issue after this PR

| Thing | AST | MIR | Want | Example |
| --- | --- | --- | --- |---|
| `let _ = <unsafe-field>` | 💚  | 💚  | ❌ |  [playground](https://play.rust-lang.org/?gist=bb7843e42fa5318c1043d04bd72abfe4&version=nightly&mode=debug&edition=2015) |
| `match <unsafe_field> { _ => () }` | ❌  | ❌ | ❌ | [playground](https://play.rust-lang.org/?gist=3e3af05fbf1fae28fab2aaf9412fb2ea&version=nightly&mode=debug&edition=2015) |
| `let _ = <moved>` | 💚  | 💚 | 💚 | [playground](https://play.rust-lang.org/?gist=91a6efde8288558e584aaeee0a50558b&version=nightly&mode=debug&edition=2015) |
| `match <moved> { _ => () }` | ❌ | ❌  | 💚 | [playground](https://play.rust-lang.org/?gist=804f8185040b2fe131f2c4a64b3048ca&version=nightly&mode=debug&edition=2015) |
| `let _ = <borrowed>` | 💚  | 💚 | 💚 | [playground](https://play.rust-lang.org/?gist=0e487c2893b89cb772ec2f2b7c5da876&version=nightly&mode=debug&edition=2015) |
| `match <borrowed> { _ => () }` | 💚  | 💚 | 💚 | [playground](https://play.rust-lang.org/?gist=0e487c2893b89cb772ec2f2b7c5da876&version=nightly&mode=debug&edition=2015) |

r? @nikomatsakis
@pnkfelix pnkfelix self-assigned this Jan 23, 2020
@pnkfelix pnkfelix added P-high High priority and removed I-nominated P-medium Medium priority labels Jan 23, 2020
@pnkfelix
Copy link
Member

Assigning self and @Centril for work on testing.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 10, 2020
…entril

tests encoding current behavior for various cases of "binding" to _.

The `_` binding form is special, in that it encodes a "no-op": nothing is actually bound, and thus nothing is moved or borrowed in this scenario. Usually we do the "right" thing in all such cases. The exceptions are explicitly pointed out in this test case, so that we keep track of whether they are eventually fixed.

Cc rust-lang#53114.

(This does not close the aforementioned issue; it just adds the tests encoding the current behavior, which we hope to eventually fix.)
Centril added a commit to Centril/rust that referenced this issue Apr 10, 2020
…entril

tests encoding current behavior for various cases of "binding" to _.

The `_` binding form is special, in that it encodes a "no-op": nothing is actually bound, and thus nothing is moved or borrowed in this scenario. Usually we do the "right" thing in all such cases. The exceptions are explicitly pointed out in this test case, so that we keep track of whether they are eventually fixed.

Cc rust-lang#53114.

(This does not close the aforementioned issue; it just adds the tests encoding the current behavior, which we hope to eventually fix.)
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 11, 2020
…tril

tests encoding current behavior for various cases of "binding" to _.

The `_` binding form is special, in that it encodes a "no-op": nothing is actually bound, and thus nothing is moved or borrowed in this scenario. Usually we do the "right" thing in all such cases. The exceptions are explicitly pointed out in this test case, so that we keep track of whether they are eventually fixed.

Cc rust-lang#53114.

(This does not close the aforementioned issue; it just adds the tests encoding the current behavior, which we hope to eventually fix.)
@osa1
Copy link
Contributor

osa1 commented Feb 1, 2021

Can this be closed? The original reproducer works now, and it seems like the commits referenced above add tests.

@jackh726
Copy link
Member

The tests added still contains code that should compile but doesn't, e.g.

match m { _ => { } } // #53114: should eventually be accepted too

@jackh726
Copy link
Member

That being said, I think this can be changed to P-medium

@jackh726 jackh726 added the WG-nll Working group: Non-lexical lifetimes label Feb 1, 2022
@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2022

@rustbot prioritize

Lets have a Zulip thread to talk about why to downgrade this to P-medium.

@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 1, 2022
@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2022

discussed with @Centril during pre-triage.

At this point, I think we should put P-high priority on ensuring the tests are present (or add them); its embarrassing that sat so long.

as for the behavior change for match .... I probably figure that's still P-medium.

Marking P-high to reflect the above.

Ah, look at this, I'm the one who marked it as P-high and explicitly said that the behavior change for match itself is P-medium

@rustbot label: -E-needs-test

@rustbot rustbot removed the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Feb 1, 2022
@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2022

As noted in the table in the issue description, there are essentially two tasks that remain here:

  • change let _ = <unsafe-field> to be rejected instead of accepted. This has its own dedicated issue in let _ = <access to unsafe field> currently type-checks #54003
  • change match <moved> { _ => () } to be accepted instead of rejected. This does not have its own dedicated issue, but its a pretty niche feature request and seems easy to categorize as P-medium.

@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2022

@rustbot label: +P-medium -P-high

@rustbot rustbot added P-medium Medium priority and removed P-high High priority labels Feb 1, 2022
@pnkfelix
Copy link
Member

pnkfelix commented Feb 1, 2022

@rustbot label: -I-prioritize

@rustbot rustbot removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 1, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 27, 2023
…RalfJung

Allow partially moved values in match

This PR attempts to unify the behaviour between `let _ = PLACE`, `let _: TY = PLACE;` and `match PLACE { _ => {} }`.
The logical conclusion is that the `match` version should not check for uninitialised places nor check that borrows are still live.

The `match PLACE {}` case is handled by keeping a `FakeRead` in the unreachable fallback case to verify that `PLACE` has a legal value.

Schematically, `match PLACE { arms }` in surface rust becomes in MIR:
```rust
PlaceMention(PLACE)
match PLACE {
  // Decision tree for the explicit arms
  arms,
  // An extra fallback arm
  _ => {
    FakeRead(ForMatchedPlace, PLACE);
    unreachable
  }
}
```

`match *borrow { _ => {} }` continues to check that `*borrow` is live, but does not read the value.
`match *borrow {}` both checks that `*borrow` is live, and fake-reads the value.

Continuation of ~rust-lang#102256 ~rust-lang#104844

Fixes rust-lang#99180 rust-lang#53114
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Oct 28, 2023
Allow partially moved values in match

This PR attempts to unify the behaviour between `let _ = PLACE`, `let _: TY = PLACE;` and `match PLACE { _ => {} }`.
The logical conclusion is that the `match` version should not check for uninitialised places nor check that borrows are still live.

The `match PLACE {}` case is handled by keeping a `FakeRead` in the unreachable fallback case to verify that `PLACE` has a legal value.

Schematically, `match PLACE { arms }` in surface rust becomes in MIR:
```rust
PlaceMention(PLACE)
match PLACE {
  // Decision tree for the explicit arms
  arms,
  // An extra fallback arm
  _ => {
    FakeRead(ForMatchedPlace, PLACE);
    unreachable
  }
}
```

`match *borrow { _ => {} }` continues to check that `*borrow` is live, but does not read the value.
`match *borrow {}` both checks that `*borrow` is live, and fake-reads the value.

Continuation of ~rust-lang/rust#102256 ~rust-lang/rust#104844

Fixes rust-lang/rust#99180 rust-lang/rust#53114
@WaffleLapkin
Copy link
Member

Closing this since #103208 is merged. cc @cjgillot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-NLL Area: Non-lexical lifetimes (NLL) NLL-complete Working towards the "valid code works" goal P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-nll Working group: Non-lexical lifetimes
Projects
None yet
Development

No branches or pull requests