Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully implement or-pattern parsing #63693

Merged
merged 34 commits into from
Aug 27, 2019
Merged

Conversation

Centril
Copy link
Contributor

@Centril Centril commented Aug 18, 2019

Builds upon the initial parsing in #61708 to fully implement or-pattern (p | q) parsing as specified in the grammar section of RFC 2535.

Noteworthy:

  • We allow or-patterns in [p | q, ...].
  • We allow or-patterns in let statements and for expressions including with leading |.
  • We improve recovery for p || q (+ tests for that in multiple-pattern-typo.rs).
  • We improve recovery for | p | q in inner patterns (tests in or-patterns-syntactic-fail.rs).
  • We rigorously test or-pattern parsing (in or-patterns-syntactic-{pass,fail}.rs).
  • We harden the feature gating tests.
  • We do not change ast.rs. That is, ExprKind::Let.0 and Arm.pats still accept Vec<P<Pat>>.
    I was starting work on that but it would be cleaner to do this in a separate PR so this one has a narrower scope.

cc @dlrobertson
cc the tracking issue #54883.

r? @estebank

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 18, 2019
@Centril Centril added the F-or_patterns `#![feature(or_patterns)]` label Aug 18, 2019
@@ -1448,7 +1450,7 @@ impl<'a> Parser<'a> {

Ok(ast::Arm {
attrs,
pats,
pats: pat, // FIXME(or_patterns, Centril | dlrobertson): this should just be `pat,`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would this just be a singular pat? Foo(A | B) will now be a single pattern that contains an or-pattern, but woudn't we still want to parse Foo(A) | Bar(B) as multiple patterns?

Copy link
Contributor Author

@Centril Centril Aug 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having arm.pat: P<Pat> instead of arm.pats: Vec<P<Pat>> simplifies the overall structure of the compiler by using a uniform mechanism for or-patterns instead of having a distinction between or-patterns at the top-level and in nested positions. As an example, you can see that the parser pat.rs parser has, in this PR, a hack to unpack an or-pattern at the top level. There are many other places (e.g. in resolve, save-analysis, etc.) in which it would simplify (and reduce duplication of logic) to have one uniform mechanism. So we should represent p1 | p2 at the top of a match arm simply as PatKind::Or.

Edit: As a further consideration, using Vec<P<Pat>> pessimizes what I think is the most common case of an arm with just Foo => . Moreover, let now permits or-patterns so there's no good reason to have a top-level distinction sometimes and sometimes not. In terms of preventing bugs, it is also better to have one mechanism since you can ensure that way that the semantics at the top and inner levels do not diverge.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! This makes a lot of sense and I'd love to see this happen.The lowering in MIR will require quite a bit of work and thought in order to get to this point, but AFAIK unifying the two could also greatly simplify librustc_mir/build/matches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh absolutely. I think basically we want to work from both sides and meet in the middle. That is, we fix the AST and add a hack to lowering. Then we fix lowering + HIR and add a hack to HAIR. Eventually we get down to MIR and the hack is no more.

src/libsyntax/parse/parser/pat.rs Outdated Show resolved Hide resolved
src/libsyntax/parse/parser/pat.rs Outdated Show resolved Hide resolved

// If the next token is not a `|`,
// this is not an or-pattern and we should exit here.
if !self.check(&token::BinOp(token::Or)) && self.token != token::OrOr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see this becoming a diagnostics problem in a similar vein to { foo() } * bar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you can look-ahead for the absence of various tokens like , ] } ) and try to recover? I'd prefer to do more diagnostics polish in a follow up tho.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, it's just that I noticed how hard it was to deal on expression parsing recovery because of similar things so I wanted to bring it up in case you might have some ideas on how to restrict it. Not critical for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads-up. :)

A cheap solution code-wise would be to reuse the generic list parsing infra and use | as the separator. Then you also need to check that trailing == false and error otherwise. (that's actually a good opportunity for recovery in case of a trailing |). The drawback to this is that it will unconditionally allocate a Vec<P<Pat>> which you will need to then convert back to P<Pat> in the case of a single element. That might pessimize parsing perf but it is probably negligible in the grand scheme of things... But let's revisit in a different PR. ^^

src/libsyntax/parse/parser/pat.rs Outdated Show resolved Hide resolved
src/libsyntax/parse/parser/pat.rs Show resolved Hide resolved
src/test/ui/or-patterns/or-patterns-syntactic-fail.stderr Outdated Show resolved Hide resolved
src/test/ui/parser/mut-patterns.stderr Show resolved Hide resolved
@Centril
Copy link
Contributor Author

Centril commented Aug 25, 2019

@estebank I've addressed the comments (see #63693 (comment) for moving things to diagnostics.rs which I think isn't the best idea).

I'm not super happy with the introduction of fn parse_fn_param_pat and that the pattern parser now has 4 entry points (well 3 after refactoring...). I think it should probably just be replaced with a call to parse_top_pat and accepting fn foo(A | B: typ) since there are no technical drawbacks nor readability ambiguities in that... but let's leave that for a different PR.

@Centril
Copy link
Contributor Author

Centril commented Aug 26, 2019

@estebank Added a commit just now to fix an issue (that I just noticed) with a wrong span.

--> $DIR/or-patterns-syntactic-fail.rs:50:17
|
LL | let NS { f: || A | B };
| ^^ help: remove the `||`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these two last at the end be parsed as a closure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This is in a pattern context so $field_name: $pat and there is no way to parse a closure expression in a pattern context.

@estebank
Copy link
Contributor

@Centril I think it looks good. I'm only slightly worried about the S { f: || A | B } case, where I would expect it to be identified as a malformed closure as opposed to an or-pattern, particularly because I would expect S { f: || A } to be an S with a field f that holds an Fn() -> A, and I'm not 100% sure that is the case.

Beyond that, r=me 🎉

@Centril
Copy link
Contributor Author

Centril commented Aug 26, 2019

Thanks for the review. :)

I think we can always revisit S { f: || A } but imo it is too much of a special case to deserve special logic (which I think will be invasive to the overall parsing logic here) and you'd also need to mix up expression and pattern contexts to confuse let S { f: || A } for a closure.

@bors r=estebank

@bors
Copy link
Contributor

bors commented Aug 26, 2019

📌 Commit 2bd27fb has been approved by estebank

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 26, 2019
Centril added a commit to Centril/rust that referenced this pull request Aug 26, 2019
…ebank

Fully implement or-pattern parsing

Builds upon the initial parsing in rust-lang#61708 to fully implement or-pattern (`p | q`) parsing as specified in [the grammar section of RFC 2535](https://github.com/rust-lang/rfcs/blob/master/text/2535-or-patterns.md#grammar).

Noteworthy:

- We allow or-patterns in `[p | q, ...]`.
- We allow or-patterns in `let` statements and `for` expressions including with leading `|`.
- We improve recovery for `p || q` (+ tests for that in `multiple-pattern-typo.rs`).
- We improve recovery for `| p | q` in inner patterns (tests in `or-patterns-syntactic-fail.rs`).
- We rigorously test or-pattern parsing (in `or-patterns-syntactic-{pass,fail}.rs`).
- We harden the feature gating tests.
- We do **_not_** change `ast.rs`. That is, `ExprKind::Let.0` and `Arm.pats` still accept `Vec<P<Pat>>`.
   I was starting work on that but it would be cleaner to do this in a separate PR so this one has a narrower scope.

cc @dlrobertson
cc the tracking issue rust-lang#54883.

r? @estebank
bors added a commit that referenced this pull request Aug 27, 2019
Rollup of 6 pull requests

Successful merges:

 - #63317 (Do not complain about unused code when used in `impl` `Self` type)
 - #63693 (Fully implement or-pattern parsing)
 - #63836 (VxWorks does not provide a way to set the task name except at creation time)
 - #63845 (Removed a confusing FnOnce example)
 - #63855 (Refactor feature gates)
 - #63921 (add link to FileCheck docs)

Failed merges:

r? @ghost
@bors bors merged commit 2bd27fb into rust-lang:master Aug 27, 2019
@Centril Centril deleted the polish-parse-or-pats branch August 27, 2019 03:56
Centril added a commit to Centril/rust that referenced this pull request Sep 5, 2019
…ochenkov

or-patterns: Uniformly use `PatKind::Or` in AST & Fix/Cleanup resolve

Following up on work in rust-lang#63693 and rust-lang#61708, in this PR we:

- Uniformly use `PatKind::Or(...)` in AST:

   - Change `ast::Arm.pats: Vec<P<Pat>>` => `ast::Arm.pat: P<Pat>`

   - Change `ast::ExprKind::Let.0: Vec<P<Pat>>` => `ast::ExprKind::Let.0: P<Pat>`

- Adjust `librustc_resolve/late.rs` to correctly handle or-patterns at any level of nesting as a result.

  In particular, the already-bound check which rejects e.g. `let (a, a);` now accounts for or-patterns. The consistency checking (ensures no missing bindings and binding mode consistency) also now accounts for or-patterns. In the process, a bug was found in the current compiler which allowed:

   ```rust
   enum E<T> { A(T, T), B(T) }
   use E::*;
   fn foo() {
       match A(0, 1) {
           B(mut a) | A(mut a, mut a) => {}
       }
   }
   ```

   The new algorithms took a few iterations to get right. I tried several clever schemes but ultimately a version based on a stack of hashsets and recording product/sum contexts was chosen since it is more clearly correct.

- Clean up `librustc_resolve/late.rs` by, among other things, using a new `with_rib` function to better ensure stack dicipline.

- Do not push the change in AST to HIR for now to avoid doing too much in this PR. To cope with  this, we introduce a temporary hack in `rustc::hir::lowering` (clearly marked in the diff).

cc rust-lang#54883
cc @dlrobertson @matthewjasper
r? @petrochenkov
Centril added a commit to Centril/rust that referenced this pull request Sep 5, 2019
…ochenkov

or-patterns: Uniformly use `PatKind::Or` in AST & Fix/Cleanup resolve

Following up on work in rust-lang#63693 and rust-lang#61708, in this PR we:

- Uniformly use `PatKind::Or(...)` in AST:

   - Change `ast::Arm.pats: Vec<P<Pat>>` => `ast::Arm.pat: P<Pat>`

   - Change `ast::ExprKind::Let.0: Vec<P<Pat>>` => `ast::ExprKind::Let.0: P<Pat>`

- Adjust `librustc_resolve/late.rs` to correctly handle or-patterns at any level of nesting as a result.

  In particular, the already-bound check which rejects e.g. `let (a, a);` now accounts for or-patterns. The consistency checking (ensures no missing bindings and binding mode consistency) also now accounts for or-patterns. In the process, a bug was found in the current compiler which allowed:

   ```rust
   enum E<T> { A(T, T), B(T) }
   use E::*;
   fn foo() {
       match A(0, 1) {
           B(mut a) | A(mut a, mut a) => {}
       }
   }
   ```

   The new algorithms took a few iterations to get right. I tried several clever schemes but ultimately a version based on a stack of hashsets and recording product/sum contexts was chosen since it is more clearly correct.

- Clean up `librustc_resolve/late.rs` by, among other things, using a new `with_rib` function to better ensure stack dicipline.

- Do not push the change in AST to HIR for now to avoid doing too much in this PR. To cope with  this, we introduce a temporary hack in `rustc::hir::lowering` (clearly marked in the diff).

cc rust-lang#54883
cc @dlrobertson @matthewjasper
r? @petrochenkov
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Sep 6, 2019
…ochenkov

or-patterns: Uniformly use `PatKind::Or` in AST & Fix/Cleanup resolve

Following up on work in rust-lang#63693 and rust-lang#61708, in this PR we:

- Uniformly use `PatKind::Or(...)` in AST:

   - Change `ast::Arm.pats: Vec<P<Pat>>` => `ast::Arm.pat: P<Pat>`

   - Change `ast::ExprKind::Let.0: Vec<P<Pat>>` => `ast::ExprKind::Let.0: P<Pat>`

- Adjust `librustc_resolve/late.rs` to correctly handle or-patterns at any level of nesting as a result.

  In particular, the already-bound check which rejects e.g. `let (a, a);` now accounts for or-patterns. The consistency checking (ensures no missing bindings and binding mode consistency) also now accounts for or-patterns. In the process, a bug was found in the current compiler which allowed:

   ```rust
   enum E<T> { A(T, T), B(T) }
   use E::*;
   fn foo() {
       match A(0, 1) {
           B(mut a) | A(mut a, mut a) => {}
       }
   }
   ```

   The new algorithms took a few iterations to get right. I tried several clever schemes but ultimately a version based on a stack of hashsets and recording product/sum contexts was chosen since it is more clearly correct.

- Clean up `librustc_resolve/late.rs` by, among other things, using a new `with_rib` function to better ensure stack dicipline.

- Do not push the change in AST to HIR for now to avoid doing too much in this PR. To cope with  this, we introduce a temporary hack in `rustc::hir::lowering` (clearly marked in the diff).

cc rust-lang#54883
cc @dlrobertson @matthewjasper
r? @petrochenkov
Centril added a commit to Centril/rust that referenced this pull request Sep 6, 2019
…ochenkov

or-patterns: Uniformly use `PatKind::Or` in AST & Fix/Cleanup resolve

Following up on work in rust-lang#63693 and rust-lang#61708, in this PR we:

- Uniformly use `PatKind::Or(...)` in AST:

   - Change `ast::Arm.pats: Vec<P<Pat>>` => `ast::Arm.pat: P<Pat>`

   - Change `ast::ExprKind::Let.0: Vec<P<Pat>>` => `ast::ExprKind::Let.0: P<Pat>`

- Adjust `librustc_resolve/late.rs` to correctly handle or-patterns at any level of nesting as a result.

  In particular, the already-bound check which rejects e.g. `let (a, a);` now accounts for or-patterns. The consistency checking (ensures no missing bindings and binding mode consistency) also now accounts for or-patterns. In the process, a bug was found in the current compiler which allowed:

   ```rust
   enum E<T> { A(T, T), B(T) }
   use E::*;
   fn foo() {
       match A(0, 1) {
           B(mut a) | A(mut a, mut a) => {}
       }
   }
   ```

   The new algorithms took a few iterations to get right. I tried several clever schemes but ultimately a version based on a stack of hashsets and recording product/sum contexts was chosen since it is more clearly correct.

- Clean up `librustc_resolve/late.rs` by, among other things, using a new `with_rib` function to better ensure stack dicipline.

- Do not push the change in AST to HIR for now to avoid doing too much in this PR. To cope with  this, we introduce a temporary hack in `rustc::hir::lowering` (clearly marked in the diff).

cc rust-lang#54883
cc @dlrobertson @matthewjasper
r? @petrochenkov
harpocrates added a commit to harpocrates/language-rust that referenced this pull request Sep 18, 2019
Rust got full-blown support for or-patterns (see [RFC 2535][0]). This
means a couple changes:

  * `OrP` is a new variant of `Pat`
  * `WhileLet`, `IfLet`, `Arm` now just take a `Pat` (instead of a list)
  * in the parser, or-patterns are not allowed everywhere that regular
    patterns are!

Tests cases were heavily inspired by [the PR that implemented the RFC][1].

[0]: https://github.com/rust-lang/rfcs/blob/master/text/2535-or-patterns.md#grammar
[1]: rust-lang/rust#63693
Centril added a commit to Centril/rust that referenced this pull request Sep 22, 2019
or-patterns: Push `PatKind/PatternKind::Or` at top level to HIR & HAIR

Following up on work in rust-lang#64111, rust-lang#63693, and rust-lang#61708, in this PR:

- We change `hair::Arm.patterns: Vec<Pattern<'_>>` into `hir::Arm.pattern: Pattern<'_>`.

   - `fn hair::Arm::top_pats_hack` is introduced as a temporary crutch in MIR building to avoid more changes.

- We change `hir::Arm.pats: HirVec<P<Pat>>` into `hir::Arm.pat: P<Pat>`.

   - The hacks in `rustc::hir::lowering` are removed since the representation hack is no longer necessary.

   - In some places, `fn hir::Arm::top_pats_hack` is introduced to leave some things as future work.

   - Misc changes: HIR pretty printing is adjusted to behave uniformly wrt. top/inner levels, rvalue promotion is adjusted, regionck, and dead_code is also.

   - Type checking is adjusted to uniformly handle or-patterns at top/inner levels.
      To make things compile, `p_0 | ... | p_n` is redefined as a "reference pattern" in [`fn is_non_ref_pat`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_typeck/check/struct.FnCtxt.html#method.is_non_ref_pat) for now. This is done so that reference types are not eagerly stripped from the `expected: Ty<'tcx>`.

    - Liveness is adjusted wrt. the `unused_variables` and `unused_assignments` lints to handle top/inner levels uniformly and the handling of `fn` parameters, `let` locals, and `match` arms are unified in this respect. This is not tested for now as exhaustiveness checks are reachable and will ICE.

    - In `check_match`, checking `@` and by-move bindings is adjusted. However, exhaustiveness checking is not adjusted the moment and is handled by @dlrobertson in rust-lang#63688.

    - AST borrowck (`construct.rs`) is not adjusted as AST borrowck will be removed soon.

r? @matthewjasper
cc @dlrobertson @varkor @oli-obk
bors added a commit that referenced this pull request Sep 23, 2019
or-patterns: Push `PatKind/PatternKind::Or` at top level to HIR & HAIR

Following up on work in #64111, #63693, and #61708, in this PR:

- We change `hair::Arm.patterns: Vec<Pattern<'_>>` into `hir::Arm.pattern: Pattern<'_>`.

   - `fn hair::Arm::top_pats_hack` is introduced as a temporary crutch in MIR building to avoid more changes.

- We change `hir::Arm.pats: HirVec<P<Pat>>` into `hir::Arm.pat: P<Pat>`.

   - The hacks in `rustc::hir::lowering` are removed since the representation hack is no longer necessary.

   - In some places, `fn hir::Arm::top_pats_hack` is introduced to leave some things as future work.

   - Misc changes: HIR pretty printing is adjusted to behave uniformly wrt. top/inner levels, rvalue promotion is adjusted, regionck, and dead_code is also.

   - Type checking is adjusted to uniformly handle or-patterns at top/inner levels.
      To make things compile, `p_0 | ... | p_n` is redefined as a "reference pattern" in [`fn is_non_ref_pat`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_typeck/check/struct.FnCtxt.html#method.is_non_ref_pat) for now. This is done so that reference types are not eagerly stripped from the `expected: Ty<'tcx>`.

    - Liveness is adjusted wrt. the `unused_variables` and `unused_assignments` lints to handle top/inner levels uniformly and the handling of `fn` parameters, `let` locals, and `match` arms are unified in this respect. This is not tested for now as exhaustiveness checks are reachable and will ICE.

    - In `check_match`, checking `@` and by-move bindings is adjusted. However, exhaustiveness checking is not adjusted the moment and is handled by @dlrobertson in #63688.

    - AST borrowck (`construct.rs`) is not adjusted as AST borrowck will be removed soon.

r? @matthewjasper
cc @dlrobertson @varkor @oli-obk
Centril added a commit to Centril/rust that referenced this pull request Sep 25, 2019
or-patterns: Push `PatKind/PatternKind::Or` at top level to HIR & HAIR

Following up on work in rust-lang#64111, rust-lang#63693, and rust-lang#61708, in this PR:

- We change `hair::Arm.patterns: Vec<Pattern<'_>>` into `hir::Arm.pattern: Pattern<'_>`.

   - `fn hair::Arm::top_pats_hack` is introduced as a temporary crutch in MIR building to avoid more changes.

- We change `hir::Arm.pats: HirVec<P<Pat>>` into `hir::Arm.pat: P<Pat>`.

   - The hacks in `rustc::hir::lowering` are removed since the representation hack is no longer necessary.

   - In some places, `fn hir::Arm::top_pats_hack` is introduced to leave some things as future work.

   - Misc changes: HIR pretty printing is adjusted to behave uniformly wrt. top/inner levels, rvalue promotion is adjusted, regionck, and dead_code is also.

   - Type checking is adjusted to uniformly handle or-patterns at top/inner levels.
      To make things compile, `p_0 | ... | p_n` is redefined as a "reference pattern" in [`fn is_non_ref_pat`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_typeck/check/struct.FnCtxt.html#method.is_non_ref_pat) for now. This is done so that reference types are not eagerly stripped from the `expected: Ty<'tcx>`.

    - Liveness is adjusted wrt. the `unused_variables` and `unused_assignments` lints to handle top/inner levels uniformly and the handling of `fn` parameters, `let` locals, and `match` arms are unified in this respect. This is not tested for now as exhaustiveness checks are reachable and will ICE.

    - In `check_match`, checking `@` and by-move bindings is adjusted. However, exhaustiveness checking is not adjusted the moment and is handled by @dlrobertson in rust-lang#63688.

    - AST borrowck (`construct.rs`) is not adjusted as AST borrowck will be removed soon.

r? @matthewjasper
cc @dlrobertson @varkor @oli-obk
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F-or_patterns `#![feature(or_patterns)]` S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants