While investigating #2692, @DavisVaughan discovered this thread discussing synchronisation issues within tower-lsp, the LSP server implementation used in Ark: ebkalderon/tower-lsp#284.
The summary is that while message ordering is preserved by tower-lsp at all times, because handlers are async, that ordering is easily compromised whenever a handler yields with `.await`. And we call `.await` a lot, for instance at handler entry to log a message to our LSP output. This causes two main issues:
1. **Stale state.** A state-changing notification like `did_change()` might be invoked and grab the document lock before a refactoring/analysis request has had a chance to read state. When control is yielded back to that request, the world has suddenly changed, and the cursor positions or ranges the request is working with have become stale. We think this explains crashes like #2692 (Ark: LSP crash while looking for completions or creating new document context).
2. **Unordered responses.** The LSP server should try its best to preserve the ordering of messages, and it's even required to do so when the messages might change the document. Because handlers run concurrently as async tasks, this ordering is not guaranteed, causing the server and client states to drift apart.
To fix (1), we should synchronise our handlers on entry with a read-write lock. To fix (2), we should synchronise them on exit with some sort of queue.
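To make the exit-side queue idea concrete, here is a minimal sketch in Python's asyncio for brevity (tokio's cooperative scheduling behaves analogously); the `ReorderBuffer` name and the sequence-number scheme are illustrative assumptions, not Ark's actual design:

```python
import asyncio

class ReorderBuffer:
    """Exit-side queue: emit responses strictly in request order,
    even when async handlers finish out of order."""
    def __init__(self):
        self.next_seq = 0
        self.pending = {}   # seq -> response held until its turn comes
        self.emitted = []

    def push(self, seq, response):
        self.pending[seq] = response
        # Flush every response whose turn has now come.
        while self.next_seq in self.pending:
            self.emitted.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

async def handler(seq, delay, buf):
    await asyncio.sleep(delay)   # handler yields here; others may overtake it
    buf.push(seq, f"response-{seq}")

async def main():
    buf = ReorderBuffer()
    # Request 0 is slow and request 1 is fast, so 1 *completes* first...
    await asyncio.gather(handler(0, 0.02, buf), handler(1, 0.0, buf))
    # ...but the buffer still emits responses in request order.
    return buf.emitted

print(asyncio.run(main()))  # ['response-0', 'response-1']
```

Handler 1 finishes first, but its response is held back until handler 0's has been emitted, so the client always sees responses in request order.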
While most requests can be handled asynchronously, a minority are "real time", in that they block the user's input (e.g. `selectionRanges`). For such requests, you want to handle them "on the main loop" as soon as possible, just as you would for state-update notifications.
Based on our new understanding of all this, I did another analysis of how we could possibly be getting supposedly "out of order" did_change() requests - which is what forced me to put that pending changes queue in place. I think it can be described as the following series of events:
1. Client sends did-change request A
2. Client sends did-change request B
3. tower-lsp forwards did-change request A
4. tower-lsp forwards did-change request B

Everything is fine up to this point.

5. `did_change(A)`:
   - Immediately calls `backend_log()` before locking the doc
   - `await`s the `backend_log()`
   - The `await` allows switching to `did_change(B)` immediately, and `did_change(A)` is paused
6. `did_change(B)`:
   - Immediately calls `backend_log()` before locking the doc
   - `await`s the `backend_log()`
   - Assume we stay in `did_change(B)` this time instead of switching back to `did_change(A)`
   - Takes the lock on the doc
   - Tries to process the change, but it looks out of order! Saves B as a pending change.
7. `did_change(A)` finally gets control back
8. We process change A, then the saved pending change from B
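The interleaving above can be sketched with Python's asyncio (the cooperative scheduling is analogous to tokio's). The `Doc` class, version numbers, and pending-queue details here are illustrative assumptions, not Ark's actual code; the `asyncio.sleep` stands in for the `backend_log().await` that yields control before the lock is taken:

```python
import asyncio

class Doc:
    def __init__(self):
        self.version = 0
        self.pending = {}   # version -> change that arrived out of order
        self.applied = []
        self.lock = asyncio.Lock()

    def apply(self, change_id, version):
        self.applied.append(change_id)
        self.version = version

async def did_change(doc, change_id, version, log_cost):
    # The handler awaits its log call *before* locking the doc;
    # this sleep stands in for that await and is where control switches.
    await asyncio.sleep(log_cost)
    async with doc.lock:
        if version != doc.version + 1:
            doc.pending[version] = change_id   # looks out of order: save it
        else:
            doc.apply(change_id, version)
            # Drain any pending changes that are now in order.
            while doc.version + 1 in doc.pending:
                v = doc.version + 1
                doc.apply(doc.pending.pop(v), v)

async def main():
    doc = Doc()
    # A (version 1) has a slow log await, B (version 2) a fast one,
    # so B grabs the doc lock first even though it was sent second.
    await asyncio.gather(did_change(doc, "A", 1, 0.01),
                         did_change(doc, "B", 2, 0))
    return doc.applied

print(asyncio.run(main()))  # ['A', 'B']: B was parked as pending until A landed
```

B reaches the lock first, sees a version it cannot apply yet, and is parked in the pending queue; when A finally lands, the queue is drained and both changes apply in the correct order.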
It's worth noting that if we had locked the doc before calling `backend_log()`, which seems like a totally reasonable thing to do, this scenario would deadlock (something we have seen many times in other scenarios): when control switches to B and B tries to take the lock on the doc, it can't acquire it until A releases it, but A can't release it until it gets control back. We are super careful about this in the diagnostics code, for example.
This comment from the author of rust-analyzer is also relevant: https://reddit.com/r/rust/comments/tbsiv5/towerlsp_0160_lightweight_framework_for_building/i0exnbi/
Worth noting that the ruff project recently decided (astral-sh/ruff#10158) to switch from tower-lsp to lsp-server (https://github.com/rust-lang/rust-analyzer/tree/master/lib/lsp-server), the framework created for rust-analyzer. This seems like a lot of work but allows greater control over the scheduling of handlers and avoids the tokio runtime.