This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

perf(rome_service): increase the throughput of the remote daemon transport #3151

Merged
merged 6 commits into from
Sep 8, 2022

Conversation

leops
Contributor

@leops leops commented Sep 2, 2022

Summary

This PR improves the performance of the remote daemon transport, primarily by allowing both the client and the server to process more requests at the same time. The changes include:

  • Refactored the WorkspaceTransport trait to make it aware of request IDs. This lets the transport layer send multiple requests to the remote server at the same time and process the responses out of order (it uses the dashmap crate to store the handles of pending requests in a thread-safe hashmap)
  • Used the OwnedReadHalf and OwnedWriteHalf types provided by tokio on Unix, and manually implemented <Client|Server>ReadHalf and <Client|Server>WriteHalf types on Windows, to support "true" parallel I/O operations on the transport socket rather than relying on the split function provided by tokio (which relies on a form of spinlocking internally)
  • Ran the implementation of all the rome/* remote calls on the blocking thread pool of the server. This avoids starving the scheduler when a file takes longer to process

Additionally, I modified the signature of the supports_feature method of the Workspace to return a Result, since it may fail like any other method if there's an issue with the remote connection. Finally, I added a timeout error to all remote calls as a last-resort measure; the timeout is currently fixed at 15 seconds.

Test Plan

These are internal-only changes that should not be visible to end users. Some of them are covered by the tests for the Node.js JSON-RPC bindings, but the OS-specific I/O code is still mostly untested.

@leops leops requested a review from a team September 2, 2022 16:18
Contributor

@MichaReiser MichaReiser left a comment


increase the throughput of the remote daemon

How did you test that this change increases the throughput, or is the main goal to allow concurrent use?

@leops
Contributor Author

leops commented Sep 5, 2022

increase the throughput of the remote daemon

How did you test that this change increases the throughput, or is the main goal to allow concurrent use?

Yes, the main goal of the PR is to allow multiple requests to be in-flight at the same time. This leads to the following improvements to performance when running the formatter CLI in release mode on the rome/tools repo:

Wall time on feature/daemon-cli:

Formatted 52 files in 128ms

Wall time on perf/daemon-transport:

Formatted 52 files in 48ms

@MichaReiser
Contributor

Yes, the main goal of the PR is to allow multiple requests to be in-flight at the same time. This leads to the following improvements to performance when running the formatter CLI in release mode on the rome/tools repo:

Nice!

Sorry for all these questions, but I'm not familiar with how workspaces work. How is it that allowing multiple requests improves the CLI performance? I would have expected the CLI to make a single request: "format files: ". Or is it that we do the traversal inside the CLI and then call into the daemon?

@ematipico ematipico added this to the 0.10.0 milestone Sep 5, 2022
@ematipico ematipico added the A-Core Area: core label Sep 5, 2022
@leops leops force-pushed the perf/daemon-transport branch from 245d297 to 917d057 Compare September 5, 2022 13:26
@leops
Contributor Author

leops commented Sep 5, 2022

Yes, the main goal of the PR is to allow multiple requests to be in-flight at the same time. This leads to the following improvements to performance when running the formatter CLI in release mode on the rome/tools repo:

Nice!

Sorry for all these questions, but I'm not familiar with how workspaces work. How is it that allowing multiple requests improves the CLI performance? I would have expected the CLI to make a single request: "format files: ". Or is it that we do the traversal inside the CLI and then call into the daemon?

Exactly: the Workspace API currently has little to no awareness of the file system, so the traversal is handled by the CLI, which issues a sequence of open -> format -> close calls for each file it processes. The traversal runs in parallel on the rayon thread pool of the CLI, but since the initial implementation of the WorkspaceTransport did not allow concurrent requests (for simplicity), the traversal threads were spending most of their time blocked waiting for unrelated requests to complete.

Base automatically changed from feature/daemon-cli to main September 5, 2022 14:21
@leops leops force-pushed the perf/daemon-transport branch from 917d057 to efae848 Compare September 5, 2022 14:36
@netlify

netlify bot commented Sep 5, 2022

Deploy Preview for rometools ready!

Name Link
🔨 Latest commit 8a9c7db
🔍 Latest deploy log https://app.netlify.com/sites/rometools/deploys/6319f0497c40970009e814d1
😎 Deploy Preview https://deploy-preview-3151--rometools.netlify.app

@leops leops temporarily deployed to netlify-playground September 5, 2022 14:36 Inactive
@github-actions

github-actions bot commented Sep 5, 2022

@leops leops temporarily deployed to netlify-playground September 6, 2022 12:33 Inactive
crates/rome_lsp/src/server.rs (resolved review thread)
Comment on lines 72 to 93
/// - the "write task" pulls outgoing messages from the "write channel" and
/// writes them to the "write half" of the socket
/// - the "read task" reads incoming messages from the "read half" of the
/// socket, then looks up a request with an ID corresponding to the received
/// message in the "pending requests" map. If a pending request is found, it's
/// fulfilled with the content of the message that was just received
Contributor

@ematipico ematipico Sep 6, 2022


I find it really difficult to understand the documentation with these "write half", "write channel", etc. terms, probably because I lack some visual content and there's no context. Maybe it's best to be more precise, or to explain these terms and give context.

Unfortunately I am not familiar with the architecture, and maybe it's best to be thorough, especially in case a person who is not familiar with these concepts has to apply changes/fixes.

/// fulfilled with the content of the message that was just received
///
/// In addition to these, a new "foreground task" is created for each request.
/// These tasks create a "oneshot channel" and store it in the pending requests
Contributor


Suggested change
/// These tasks create a "oneshot channel" and store it in the pending requests
/// Each foreground task creates a "oneshot channel" and stores it in the pending requests

{
let (socket_read, mut socket_write) = split(socket);
let (write_send, mut write_recv) = channel(512);
Contributor


What's 512? Would it be possible to have it in a const? The name of the variable might help.

/// Implementation of [WorkspaceTransport] for types implementing [AsyncRead]
/// and [AsyncWrite]
///
/// This implementation makes use of two "background tasks":
Contributor


Can we give some arbitrary names to these background tasks? Naming them might help us across the codebase, so we can refer to them by these internal, arbitrary names.


let (read_send, read_recv) = mpsc::unbounded_channel();
let (write_send, mut write_recv) = mpsc::unbounded_channel::<Vec<_>>();
runtime.spawn(async move {
Contributor


Do you think it's possible to move the two tasks into two separate functions? If not, that's fine. I just thought that having the two tasks separated might have helped us document them.

let (message, response): (_, JsonRpcResponse) = match message {
Ok(message) => message,
Err(err) => {
eprintln!(
Contributor


Is it OK to have an eprintln here?

Contributor Author


Since this is a serialization or I/O error, we don't have enough information to report it to a specific pending request (we failed to deserialize the request ID in the first place). And since this is a background task, we can't easily access the Console object to report the error there (specifically, we'd have to wrap the console in a Mutex to share it between this task and the rest of the CLI code). So the only option here seemed to be to print the error to stderr directly and exit the task.


let length = length.context("incoming response from the remote workspace is missing the Content-Length header")?;
if let Some((_, channel)) = pending_requests.remove(&response.id) {
Contributor

@ematipico ematipico Sep 6, 2022


What does .remove do here?

Contributor Author


The remove method is used to take ownership of an entry in the hashmap: if the map contains an entry with the key response.id, the entry is removed from the map and the method returns Some((key, value)). Using remove rather than get is important for two reasons. First, the values of the hashmap are "oneshot channels", and as the name indicates these can only be used once (the send method takes self by value), so we need to take ownership of the channel to consume it and fulfill the request. More generally, this also ensures we don't keep old requests around in the map, since the corresponding entry is removed once it gets fulfilled.

@leops leops force-pushed the perf/daemon-transport branch from d62b8c1 to a1f491d Compare September 7, 2022 08:59
@leops leops temporarily deployed to netlify-playground September 7, 2022 08:59 Inactive
Contributor

@ematipico ematipico left a comment


I tested the branch locally. I tried to run the check command using the --use-server argument on the Rome classic code base, internals folder.

The main branch took 52 seconds to lint 5366 files.
This branch took 14 seconds to lint 5366 files.

crates/rome_cli/src/service/mod.rs (outdated, resolved review thread)
leops and others added 2 commits September 8, 2022 15:36
@leops leops temporarily deployed to netlify-playground September 8, 2022 13:38 Inactive
@leops leops merged commit a9e7ebe into main Sep 8, 2022
@leops leops deleted the perf/daemon-transport branch September 8, 2022 14:29