Type-based completion in the new completer #1577
Conversation
(By the way, no particular need to get this into 1.2. I'm happy to hold off on merging until we get that out)
I must say I'm not sure I understand the overall strategy very well, and in particular why we need multiple linearizers and to linearize several times. Also, it seems like we do completion either using the term definition or using the type when there's only a type, but not both. If this is indeed the case, I think it doesn't follow the philosophy laid out in the notes about the semantics of completion. Completion should take information from every possible source: if there is a defined value, a type annotation, and multiple contract annotations including a dict annotation with a record as a sub-contract (think `{foo | {_ : {bar : Number, ..}} = {baz = {other = 2}}}`), then we should compute completion information from all of those sources at the same time, and propose both `bar` and `other` when trying to complete `foo.baz.`. Dictionaries might be trickier, but this is not really the point - we can do that later.
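To make that merging behaviour concrete, here is a small self-contained sketch (the types `Source`, `FieldInfo` and the function `completion_candidates` are toy stand-ins for illustration, not the actual nls code) of completion candidates being drawn from the value, the type annotation and the contracts at the same time:

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for anything that carries fields (a record value, a
/// record type, the record contract inside a dict annotation, ...).
struct Source {
    field_names: Vec<String>,
}

/// Hypothetical stand-in for a field: an optional value plus its annotations.
struct FieldInfo {
    value: Option<Source>,
    type_annotation: Option<Source>,
    contracts: Vec<Source>,
}

/// Completion proposes the union of the fields of *all* sources, not just one.
fn completion_candidates(field: &FieldInfo) -> BTreeSet<String> {
    field
        .value
        .iter()
        .chain(field.type_annotation.iter())
        .chain(field.contracts.iter())
        .flat_map(|src| src.field_names.iter().cloned())
        .collect()
}

fn main() {
    // Models `foo.baz` in `{foo | {_ : {bar : Number, ..}} = {baz = {other = 2}}}`:
    // the value contributes `other`, and the record contract coming from the
    // dict annotation contributes `bar`.
    let foo_baz = FieldInfo {
        value: Some(Source { field_names: vec!["other".into()] }),
        type_annotation: None,
        contracts: vec![Source { field_names: vec!["bar".into()] }],
    };
    let candidates: Vec<String> = completion_candidates(&foo_baz).into_iter().collect();
    assert_eq!(candidates, vec!["bar".to_string(), "other".to_string()]);
}
```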
lsp/nls/src/field_walker.rs
Outdated
/// `{ foo: Number }`
/// ^-----------------------------------------------------------|
/// then in the expression `x.foo`, the definition of `foo` is here | and the
/// value of its definition is `Number`.
I might understand as I go more into the code, but this part tickles me. If you have `{foo : Number}`, then `x.foo` is undefined, it's not `Number` (but the LSP knows that it has the type `Number`). Here it seems like we conflate `{foo : Number}` with `{foo = Number}` or something? And how is `{foo : Number = 1}` represented? What if you have something like `{foo | {bar, baz, ..} = {blorg}}`, where the completion information should be `bar`, `baz`, `blorg`?
The idea here is that `Value` (maybe not well named) is a thing that might have fields for recursing into when resolving paths. It could be a term (like `{ foo = 1 }`) or a type (like `{ foo : Number }`). For completion (and eventually goto definition) we need to be able to recurse through both of these cases. So for example, if `x` has type `{ foo : { bar : { baz : Number }}}` and we ask for the fields on `x.foo.bar`, then in the first call to `FieldDefs::resolve` we'll find that `x` has a field `foo` with "value" `{ bar : { baz : Number }}`, and then on the next call to `FieldDefs::resolve` we'll find the subfield `bar` with "value" `{ baz : Number }`.

The recursion can look at both terms and types/contracts. For example, if you have `let x | { bar, baz, .. } = { blorg } in x.` then in the `Term::Annotated` branch we collect the completions from both the annotation and the contract. We also accumulate fields from inferred types in `FieldDefs::resolve`, and combine those fields with the ones from the term.

Thanks to this question, I realized that I'd forgotten to accumulate fields from contracts and types in record field metadata; this is fixed in the latest commit. With the fix, if you call `resolve_path` on `{ foo : { bar : Number } = { bar = 1 }}` with the path `foo`:

- on the first iteration, we'll see that the overall expression has an ident `foo`, and its `Def` has value `{ bar = 1 }` and metadata `{ bar : Number }`.
- on the second iteration, we'll recurse into the previous `Def` for `foo`. We'll visit the value, which has a field `bar` (whose `Def` has value `1` and no metadata). And we'll visit the metadata, which has a field `bar` (whose `Def` has value `Number` and no metadata). So we'll return two `Def`s, one for each occurrence of `bar`.
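To make the two iterations concrete, here is a toy, self-contained model of this resolution process (the names `Def`, `Record`, `resolve_step` and `resolve_path` are simplified stand-ins, not the actual nls types): at each path element we look for the field both in the definition's value and in its metadata.

```rust
#[derive(Clone, Debug)]
struct Def {
    ident: String,
    value: Option<Record>,
    metadata: Option<Record>,
}

#[derive(Clone, Debug)]
struct Record {
    fields: Vec<Def>,
}

/// One resolution step: every definition of `name` reachable from `defs`,
/// looking both in each definition's value and in its metadata.
fn resolve_step(defs: &[Def], name: &str) -> Vec<Def> {
    defs.iter()
        .flat_map(|def| def.value.iter().chain(def.metadata.iter()))
        .flat_map(|record| record.fields.iter())
        .filter(|field| field.ident == name)
        .cloned()
        .collect()
}

/// Resolving a whole path is just iterating the step.
fn resolve_path(root: Def, path: &[&str]) -> Vec<Def> {
    path.iter()
        .fold(vec![root], |defs, name| resolve_step(&defs, name))
}

fn main() {
    // `{ foo : { bar : Number } = { bar = 1 } }`, with the leaf values elided.
    let leaf = |ident: &str| Def { ident: ident.into(), value: None, metadata: None };
    let foo = Def {
        ident: "foo".into(),
        // from `= { bar = 1 }`
        value: Some(Record { fields: vec![leaf("bar")] }),
        // from `: { bar : Number }`
        metadata: Some(Record { fields: vec![leaf("bar")] }),
    };
    let root = Def {
        ident: "<root>".into(),
        value: Some(Record { fields: vec![foo] }),
        metadata: None,
    };
    // Two `Def`s come back for `bar`: one found through the value, one through the metadata.
    assert_eq!(resolve_path(root, &["foo", "bar"]).len(), 2);
}
```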
I see, thanks for the explanation. I guess it's about naming and code organization rather than semantics, but what I find strange is that metadata (a type annotation or a contract annotation) at the field level is treated differently depending on whether the field has a definite value. Correct me if I'm wrong, but:

# case 1
{ foo | {..1..} = {..2..}}
# case 2
{ foo : {..1..}}

In both cases `foo` has the same contract attached. But in the first case, the contract will go in the "metadata" for the completer, and the definition will be `{..2..}`, while, in the other case, the definition will be `{..1..}` (and then, will `foo` have "metadata" in the sense of the completer?).

It seems, from a distance, that we shouldn't actually make a distinction between what's on the right of the equals sign, what's in `metadata.annotation.contracts` and what's in `metadata.annotation.typ`. All of them are equal sources of completion information, and we might have to recurse into all of them (in fact, for something whose definition is a record, having this record on the right of the equals sign or as a contract annotation doesn't make any difference for the completer, I believe). That's why I find the notion of definition confusing: it seems to arbitrarily pick one source of completion as the "definition", when there are several. You say that the completer uses all of the information sources, so it's fine operationally, but I'm wondering about the modeling. For example, would it make sense to (very schematically, there might be other considerations):

- rename `Defn` -> `CompletionSource`
- make it possible to attach multiple completion sources to each symbol/field/whatever, instead of only one definition
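For illustration only, a very rough sketch of that alternative modelling could look like the following (all the types here are hypothetical stubs, not actual nls or nickel-lang code):

```rust
// Stand-ins for the real nickel-lang types, only so the sketch compiles on its own.
pub struct RichTerm;
pub struct Type;
pub type Ident = String;

/// One place that completion information can come from. All variants are on
/// an equal footing; none of them is singled out as "the" definition.
pub enum CompletionSource {
    /// The term on the right-hand side of `=`.
    Value(RichTerm),
    /// A contract from `metadata.annotation.contracts`.
    Contract(Type),
    /// The type from `metadata.annotation.typ`.
    TypeAnnotation(Type),
}

/// A field carries zero or more completion sources instead of a single value.
pub struct FieldCompletion {
    pub ident: Ident,
    pub sources: Vec<CompletionSource>,
}
```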
Sorry, I used definition/`Defn` a lot, but I believe I'm rather talking about `Value`: each field should have one definition, but instead of having one value, it would make more sense to me for them to have multiple completion sources. Or, if you don't want to duplicate metadata and mimic the original field structure, keep the metadata separate and set the value to `None` for `{foo : {bar : Number}}`. I think what's confusing me is that we do something in between: when there's a definite value we keep the same structure as the original field (the value inside `value` and the rest in the metadata), but for some reason, when there's no definition, we put the type annotation in the value?
Ok thanks, I'll see if I can find a better model for this. A few considerations:

- the name `Def` was chosen because the same infrastructure is supposed to eventually drive "goto definition/goto declaration"
- we want to keep (at least part of) the metadata attached to fields: it's used for generating documentation, and not just for finding subfields

Maybe not relevant for much longer, but the current determinant for whether `Value` is a type or a term is that when visiting a type in `resolve_type`, the value is a type; when visiting a term in `resolve_term`, the value is a term.
Ok, I think I better see your point about "when resolving a path, we might end up on a subpart either of a type or of a term" now. Then do you use metadata only for the documentation? Can `Def.metadata` be set if `value` is a type, for example?

During resolution, what happens if you end up on a `value` that has a value definition, contract annotations and type annotations all at once (say on a field such as `{foo | {..1..} : {..2..} = {..3..}}`)? Does it give rise to as many other values that you explore (one for each item), with the path copied?

Sorry for all the questions 😅
Not at all, questions are great!

For your first question: we're currently using it only for documentation. There are potentially other metadata fields in the completion response that we could be filling out; I haven't looked in much detail yet. Right now `metadata` is only set for fields in a `Term::Record`, so I guess that means it's only paired with the value being a term.

For the second question, the answer depends on what "end up on a value" means:

- if you ask for all the fields on `{foo | {..1..} : {..2..} = {..3..}}` then you will get back a single `Def`, with identifier `foo`, value `{..3..}`, and the rest in `metadata`
- but if you take `{foo | {..1..} : {..2..} = {..3..}}` and ask "what are the fields on the `foo` field?" (i.e. you call `resolve_path` with path `[foo]`) then there will be three recursive calls to `FieldDefs::resolve`: one for the value, one for the contract, and one for the type. The results of those will be merged together.
Ok, I think this modelling issue might be resolved by a refactor that I was thinking of doing anyway: making `resolve` one step lazier than it currently is. That is, currently `resolve` returns a collection of `Ident`s and `Def`s. Instead, I want `resolve` to return a collection of things that have fields. So basically, the old way can be obtained from the new way by flat-mapping over the things that have fields and getting all their fields.

"Things that have fields" are

#[derive(Clone, Debug, PartialEq)]
pub enum FieldHaver { // I'd welcome a better name suggestion!
    RecordTerm(RecordData),
    Dict(Type),
    RecordType(RecordRows),
}
There are three advantages of this:
- efficiency: when resolving a long path, we aren't interested in most fields of the intermediate path elements, and so it was wasteful to collect them all
- it avoids some special-casing to handle the annoyance that we can't collect all fields of a `Dict`
- it avoids the need to have a `Def` type describing the "value" of a field
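As a rough, self-contained illustration of the flat-mapping point (the stub types and the `fields` accessor below are assumptions made for the sketch, not the actual nls API), the old eager interface can be recovered from the lazy one like this:

```rust
// Toy stand-ins: the real variants hold RecordData, Type and RecordRows.
#[derive(Clone, Debug, PartialEq)]
struct Ident(String);

enum FieldHaver {
    RecordTerm(Vec<Ident>),
    RecordType(Vec<Ident>),
    Dict,
}

impl FieldHaver {
    /// All field names this thing exposes. A `Dict` can't enumerate its
    /// fields, which is exactly the special case the lazy interface sidesteps.
    fn fields(&self) -> Vec<Ident> {
        match self {
            FieldHaver::RecordTerm(fields) | FieldHaver::RecordType(fields) => fields.clone(),
            FieldHaver::Dict => Vec::new(),
        }
    }
}

/// The old, eager behaviour, recovered by flat-mapping over the lazy results.
fn all_fields(havers: &[FieldHaver]) -> Vec<Ident> {
    havers.iter().flat_map(|h| h.fields()).collect()
}

fn main() {
    let havers = vec![
        FieldHaver::RecordTerm(vec![Ident("bar".into())]),
        FieldHaver::RecordType(vec![Ident("baz".into())]),
        FieldHaver::Dict,
    ];
    assert_eq!(
        all_fields(&havers),
        vec![Ident("bar".into()), Ident("baz".into())]
    );
}
```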
I have a draft of this and it seems to work ok. If it sounds reasonable, I'll finish it up on Monday.
FWIW, from a conceptual point of view, I do find the `FieldHaver` modeling/naming idea simpler and more natural, indeed.
pub struct Uninit;

/// Marker trait for possible states of the linearization
pub trait LinearizationState {}
I think the idea here was to avoid having nonsensical things like `Linearization<Ident>`. Of course you could still do it, but you'd need the extra step of `impl LinearizationState for Ident`. But I'm not convinced this was really useful, so it's probably fine to remove.
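For readers unfamiliar with the pattern, here is a minimal sketch of the state-marker setup being discussed; the generic parameter and its bound are assumptions inferred from the snippet above, not a copy of the actual code:

```rust
/// Marker trait for possible states of the linearization.
pub trait LinearizationState {}

/// A state type that explicitly opts in to being a linearization state.
pub struct Uninit;
impl LinearizationState for Uninit {}

/// With this bound, `Linearization<Ident>` is rejected by the compiler unless
/// someone also writes `impl LinearizationState for Ident {}` (the "extra step"
/// mentioned above).
pub struct Linearization<S: LinearizationState> {
    pub state: S,
}
```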
We discussed this synchronously, but just so there's a record of it: I've replaced the double linearization pass by a single linearization pass that records two things. The eventual goal is to replace the […]

As for accumulating both the type and term information, I believe this is done correctly now. There are three cases here: […] (This last one was missing until just recently.)
The plan discussed in the comments is compelling, and this PR is totally functional, so it might be easier reviewing-wise to merge this first and improve afterwards.
Ah sorry, while you were typing that I finished and pushed the refactor...
It's not very important 🙂 just ping me when it's ready to review
@yannham This has been rebased and it's ready for another look. The main changes compared to last time are in […]
Didn't look into all the details again, but I found the `field_walker` part easier to follow, and much less confusing 👍
///
/// So, for `term.x.y.z`, this will be `vec!["x", "y", "z"]`.
pub path: Vec<Ident>,

pub enum FieldHaver {
`FieldHolder`? `FieldContainer`? `Recordoid` (ok, maybe not this one)? I'm just randomly spitting suggestions, feel free to ignore.
🤷 I stuck with `FieldHaver` for now; we can always rename it later.
lsp/nls/src/field_walker.rs
Outdated
pub value: Option<RichTerm>,
/// Field metadata.
pub metadata: Option<FieldMetadata>,

enum FieldValue {
What about `FieldContent` or `FieldData` (or something else)? All in all I believe value is generally a reasonable name for it, but it's already quite overloaded in Nickel (values in the sense of the lambda calculus, and `Field` already has a member named `value`, which is rather a `RichTerm`).
`FieldContent` it is.
lsp/nls/src/field_walker.rs
Outdated
pub fn ident(&self) -> LocIdent {
    self.ident
}
What is the expected usage of `self.ident()` instead of just doing `self.ident`?
I think I had a reason for it originally, but I can't figure out what 😕 It's gone now.
* Collect types into a table
* Type-based completion. This also adds a feature flag for turning off the old completer.
* Remove bogus file
* Explicitly visit type annotations
* Add a test for annotated types
* Only do one linearization traversal
* Fix docs, and avoid some clones
* Combine, and resolve metadata
* Add a test
* Slight improvements to the lsp spec
* Refactor
* Remove commented-out code
* Remove the confusing initial-env thing
* Complete the refactor, and document a little better
* Fix some rebase issues
* Clean up some unused code
* Remove obsolete doc
* Review comments
Updates the new completer to get completions from types also.
I also added a (hopefully short-lived) feature flag to turn off the old completer. This makes it easier to test the new completer.