Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved (but possibly expensive) reference-finding #1800

Merged
merged 4 commits into from
Feb 8, 2024
Merged

Conversation

jneem
Copy link
Member

@jneem jneem commented Feb 3, 2024

The backstory here is that I attempted to implement renaming support in LSP but ran into some limitations with our support for finding references to record fields. Specifically, in an example like

let x = { foo = 1 } in
let y = x in
y.foo

we could tell that the definition of y.foo is the foo in { foo = 1 }, but we couldn't go the other way around. There are two difficulties in implementing the other direction:

  • it's awkward to write because the analysis mostly goes "up" the AST (whereas the direction we already implemented mostly goes "down" the AST)
  • we want it to be more-or-less an inverse of the direction we've already implemented, but it's tricky to get right and will be a pain to keep in sync.

This PR implements reference-finding in a way that's possibly inefficient but easy to keep in sync: to find references too the first foo in the example above, we find all places with a .foo static access, run the definition-finding direction that we've already implemented, and check whether the foo we started with is one of the definitions that was found.

I definitely need to do some more performance measurements here, but at least its fast on the biggest examples I have easily available (some 2000-line json-schema-to-nickel outputs, and the organist source tree).

@jneem jneem requested review from yannham and vkleen February 3, 2024 00:26
@github-actions github-actions bot temporarily deployed to pull request February 3, 2024 00:31 Inactive
Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit wary of a an analysis that needs to traverse the whole AST, although it might not be too horrible in practice, things like that tend to add up over time. It seems what we want is to have an undirected edge between a usage and a definition. Would that be too hard to just store the reverse edge alongside the definition node each time we link from a usage site to a definition site? I think the AST isn't mutated once it's built, so it shouldn't be too error-prone, as there should be only one place where we create such links (if we had to handle edits that would remove such edges it might start to be a bit more annoying).

@jneem
Copy link
Member Author

jneem commented Feb 5, 2024

So we do store a link between usage and definition sites, but in a way that turns out not to be useful enough. Specifically, "definition" here means a let binding, function parameter, or field in a record, and "usage" means a Term::Var.

The problem is that in order to have a nice LSP experience, we really want "usage" to also include Term::Op1(UnaryOp::StaticAccess), and I don't know of a good way to record all those links in a single AST pass. Right now, we have a reasonably efficient way to resolve a single static access to all its definition sites (which we use for answering "goto definition" queries), but only for one usage site at a time. I'll think about whether this can be extended to a single pass for all usage sites...

@yannham
Copy link
Member

yannham commented Feb 5, 2024

I see. Maybe I misunderstood then: the additional traversal of the AST to find all static accesses is done only once, and then is accessed through a Hashmap? In this case I believe it's reasonable (whether it can be improved or not), as it's just an additional pass. I initially thought we would have to traverse the whole AST for each and every request/renaming, which is something different really.

@jneem
Copy link
Member Author

jneem commented Feb 5, 2024

Right: the full traversal to find the static accesses is done once. On each request, we look up all the potentially related static accesses in a hashmap and do some partial traversals for each one. But if you have a lot of fields with the same name, it could still be a lot of processing per request.

Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think it might be worth adding a short summary of what is discussed in this PR to #1484 (comment), for future reference.

@@ -212,6 +227,7 @@ pub struct Analysis {
pub usage_lookup: UsageLookup,
pub parent_lookup: ParentLookup,
pub type_lookup: CollectedTypes<Type>,
pub static_accesses: HashMap<Ident, Vec<RichTerm>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth a little description of what it is

@github-actions github-actions bot temporarily deployed to pull request February 8, 2024 15:32 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 8, 2024 16:01 Inactive
@jneem jneem enabled auto-merge February 8, 2024 16:01
@jneem jneem added this pull request to the merge queue Feb 8, 2024
Merged via the queue into master with commit ec776e5 Feb 8, 2024
5 checks passed
@jneem jneem deleted the improved-usage branch February 8, 2024 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants