Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full pattern matching #1820

Merged
merged 9 commits into from
Feb 19, 2024
Merged

Full pattern matching #1820

merged 9 commits into from
Feb 19, 2024

Conversation

yannham
Copy link
Member

@yannham yannham commented Feb 14, 2024

Depends on #1819.

This PR finally brings the work done around patterns and ADTs together by making match expressions accept arbitrary patterns on the left hand side of the => sign. This results in both full pattern matching, and the ability to deconstruct ADTs in a type-safe way.

Beside some administrative changes (changing the parsing rule of match, switching to the pattern compilation instead of the current ad-hoc evaluation of match expressions for enum tags, etc.), the bulk of the new work of this PR is the typechecking of general patterns.

Typechecking

For simplicity, I reproduce here what I wrote as a comment inside the typechecker:

We can associate a type to each pattern of each case of the match expression. This the type of a valid argument for this particular pattern. From there, the type of a valid argument for the whole match expression is ideally the union of all the pattern types.

For record types, we don't have any good way to express union: for example, what could be the type of something that is either {x} or {y}? In the Nickel type system, no type can represent that. Thus the match expression should be morally of type Void -> _, which itself doesn't actually exist as a Nickel type, so we reject it. In the case of record types, we just take the intersection of the types, which amounts to unify all pattern types together. While it might fail most of the time (including for {x} and {y}) and throws a type error, it can typecheck interesting expressions when the goal is rather to match on a subfield, for example:

x |> match {
{foo, bar: 'Baz} => <branch1>
{foo, bar: 'Qux} => <branch2>
}

We can definitely find a type for x: {foo: a, bar: [| 'Baz, 'Qux |]}.

For enum types, we can express unions: for example, the union of [|'Foo, 'Bar|] and [|'Bar, 'Baz|] is [|'Foo, 'Bar, 'Baz|]. We can even turn this into a unification problem: "open" the initial row types as [| 'Foo, 'Bar; ?a |] and [|'Bar, 'Baz; ?b |], unify them together, and close the result (unify the tail with an empty row tail). The advantage of this open-unify-close approach is that unification takes care of descending into record types and sub-patterns to perform this operation, and we're back to the same procedure (almost) than for record patterns: unify all pattern types. Although we have additional bookkeeping to perform (closing the right row types at the end of the procedure).

This bookkeeping is performed at least partially by the typecheck::pattern::PatternTypes::pattern_types function.

In consequence, the current approach to typecheck a pattern is:

  • in typecheck::pattern, the pattern_types function which elaborates the type of a pattern as well as its variables always elaborates open-ended enum rows when it encounters an enum pattern. It saves the open tail in a vector, together with a pattern path, which localizes the enum pattern inside the wider pattern
  • in the typechecker, we unify all the pattern types together, with the open enum rows described above. Then, depending on the presence of a default case, we close either all tails, or we close all but the tails appearing at the top-level of the pattern. Indeed, match { 'Foo => .., 'Bar => .., _ => } : forall rows. [| 'Foo, 'Bar; rows |]. The other paths are currently not leveraged, but we can imagine in the near future having wildcard patterns, in which case we would also let open the tails corresponding to each path where there is a wildcard pattern: typically, we would have match { {foo = 'Foo} => ..., {foo = 'Bar} => ..., {foo = _} => ...} : forall rows . { foo : [| 'Foo, 'Bar; rows |] }

Content

Besides the typechecking implementation, the rest is mostly obvious consequences of generalizing match expressions, as well as bug fixing in pattern compilations and pattern type checking discovered by adding more comprehensive tests.

Follow-up (ADTs + patterns)

  • Contracts specified within patterns are currently ignored by the compilation. This shouldn't be the case (and isn't very hard to restore)
  • Finally set the right syntax for ADTs (using normal application)
  • Decide if we want 'Foo to be matched by 'Foo x or not, adding %unit%: %Unit%.
  • Fix the contract generated for enum variants in enum types
  • Documentation

@github-actions github-actions bot temporarily deployed to pull request February 14, 2024 20:59 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 15, 2024 13:10 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 15, 2024 18:31 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 15, 2024 18:42 Inactive
@yannham yannham marked this pull request as ready for review February 15, 2024 19:02
Comment on lines +188 to +190
// if final_bindings_id == null then
// null
// else
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interesting part of this change is here: it was missing a null check which could lead at runtime to %record_remove% complaining that its argument is null on valid (albeit unmatching) patterns. The rest of the diff in this file is just the implementation of this change.

Base automatically changed from refactor/pattern-positions to master February 16, 2024 08:55
Copy link

dpulls bot commented Feb 16, 2024

🎉 All dependencies have been resolved !

This commit brings together many previous chunks of work to finally make
match expressions accept generic patterns, and not just enum tags.

Pattern compilation was already implemented. Besides changing the
parsing rule for "match" and change the implementation to compile the
pattern and applies the result to the argument, the bulk of the work has
been to correctly typecheck arbitrary structural patterns.

Refer to the comments for the "Match" case in the typechecker for more
details about the algorithm.
The compilation of record pattern was missing a null check which would
trigger a dynamic type error when some matches aren't exhaustive. This
commit fixes the issue by adding the missing null check.
@github-actions github-actions bot temporarily deployed to pull request February 16, 2024 09:05 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 16, 2024 11:34 Inactive
@yannham yannham mentioned this pull request Feb 16, 2024
core/src/eval/tests.rs Outdated Show resolved Hide resolved
core/tests/integration/inputs/pattern-matching/basics.ncl Outdated Show resolved Hide resolved
Co-authored-by: Viktor Kleen <viktor.kleen@tweag.io>
@github-actions github-actions bot temporarily deployed to pull request February 19, 2024 16:28 Inactive
@yannham yannham added this pull request to the merge queue Feb 19, 2024
Merged via the queue into master with commit 6f38f16 Feb 19, 2024
5 checks passed
@yannham yannham deleted the feat/full-pattern-matching branch February 19, 2024 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants