Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache file parsing in the LSP server #2327

Closed
keyboardDrummer opened this issue Jun 30, 2022 · 5 comments
Closed

Cache file parsing in the LSP server #2327

keyboardDrummer opened this issue Jun 30, 2022 · 5 comments
Assignees
Labels
kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: language server Support for LSP in Dafny (server part; client is in ide-vscode repo) status: designed Issues that have a complete story on how to implement this feature and why we want it

Comments

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Jun 30, 2022

Change

  • When a file has to be parsed by the LSP server that was previously parsed and has not changed since, and was used in the previous compilation, do not parse it again but instead use a cached result.

Not in scope

  • Incremental parsing, where potentially the parse of each AST node is cached individually
  • Parsing files in parallel
  • Caching resolution of modules

Prerequisites

None

Enables

Implementation hints

  • Use file last modified timestamps to determine if they've been changed or not.
  • We need to clone the AST after storing it in the cache, because it's mutated by the rest of the compilation
  • Parse cache entries can be removed if they were not used during a compilation

Subtasks:

  • Make Parser.Parse only depend on the contents and the origin of the file:
    • Remove IncludeToken
    • Remove IsToBeVerified and IsToBeCompiled from ModuleDefinition
    • Remove the ModuleDecl argument from Parser.Parse and use additional return values instead, such as DefaultModuleDefinition.
    • Remove the ErrorReporter argument from Parser.Parse and use additional return values instead, such as a BatchErrorReporter.
    • Remove the BuiltIns argument from Parser.Parse and use additional return values instead.
  • Introduce a cache for parsing
  • [Optional] Parallelize parsing. Now that parsing is a stateless operation, we can more easily do it in parallel. Each parsing task is asynchronous and uses the returned list of includes to create new parsing tasks. All parsing tasks are added to a list and parsing is done when all tasks in the list are completed.
@keyboardDrummer keyboardDrummer added kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: language server Support for LSP in Dafny (server part; client is in ide-vscode repo) status: designed Issues that have a complete story on how to implement this feature and why we want it labels Jun 30, 2022
@keyboardDrummer keyboardDrummer added this to the Project aware IDE milestone Jun 30, 2022
@cpitclaudel
Copy link
Member

Adding to the implementation notes (from an issue I ran into with the REPL):

  • Add a cloning function for builtins: I thought it would be OK to use new Builtins() when cloning a program, but the parsing process adds new entries (function types and tuple types) to program.Builtins, so new Builtins() isn't sufficient: we must clone program.Builtins (or make the accessors for tuple and function types lazy so that they get created on demand).

@keyboardDrummer keyboardDrummer self-assigned this Jul 4, 2022
@cpitclaudel
Copy link
Member

Use file last modified timestamps to determine if they've been changed or not.

Last time modified timestamps are not very reliable. I would consider hashing the contents of the file instead, because:

  • some Windows file systems store file modified time at 2 (or 4? I forget) seconds resolution, so you can miss changes if a file is saved twice in close succession
  • some file systems store local time, other UTC (this is especially an issue with external drives), and DST adjustment causes them to get out of sync
  • I have no idea how WSL interacts with Windows times, and I suspect it's not completely unlikely that we'll have people accessing Dafny files from both Windows and WSL.

Until we see file IO become a significant part of the pipeline, we should consider just re-reading the source files.

Parse cache entries can be removed if they were not used during a compilation

Will that cause us to lose the cache for a module if the corresponding include is removed from a file and then re-added soon after? The MemoryCache that Boogie uses has a time-based expiration policy that handles these cases nicely.

@keyboardDrummer
Copy link
Member Author

Until we see file IO become a significant part of the pipeline, we should consider just re-reading the source files.

What I don't like about that is that it breaks the scalability of the parse caching. Consider a project with a huge amount of files, then you'd still need to read all of them from disk for every edit.

Note that if a user edits a file through the IDE, the LSP server will get a change notification and we'll always remove that entry from the parse cache, so the time resolution might not be an issue in practice.

Will that cause us to lose the cache for a module if the corresponding include is removed from a file and then re-added soon after?

Yes it would.

The MemoryCache that Boogie uses has a time-based expiration policy that handles these cases nicely.

I think it might be nicer to use the number of edits since the cache entry was last accessed as a measure of staleness, rather than time. This should also be fairly easy to add when we have a mechanism for clearing the cache after 1 edit.

@keyboardDrummer
Copy link
Member Author

For BuiltIns we need a way to create a correct BuiltIns given a set of file parse cache entries. It could be a module traversal that fills BuiltIns, or having the parser return a list of calls that configure built-ins, that we cache.

We also need to stop using Include tokens since they influence how a file is parsed in a way that doesn't depend on the contents of the file.

@keyboardDrummer
Copy link
Member Author

Implemented by #4085

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Enhancements, feature requests, etc. These are NOT bugs, but ways we can improve Dafny part: language server Support for LSP in Dafny (server part; client is in ide-vscode repo) status: designed Issues that have a complete story on how to implement this feature and why we want it
Projects
None yet
Development

No branches or pull requests

2 participants