-
Notifications
You must be signed in to change notification settings - Fork 656
perf(rome_service): increase the throughput of the remote daemon transport #3151
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
increase the throughput of the remote daemon
how did you test that this change is increasing the throughput or is the main goal to allow concurrent use?
Yes the main goal of the PR is to allow multiple requests to be in-flight at the same time. This leads to the following improvements to performance when running the formatter CLI in release mode on the Wall time on
Wall time on
|
Nice! Sorry for all these questions but I'm not familiar with how workspaces work. How is it that allowing multiple requests improves the CLI performance? I would have expected that the CLI makes a single request: "format files: " or is it that we do the traversal inside of the CLI and then call into the deamon? |
245d297
to
917d057
Compare
Exactly, the Workspace API currently has little to no awareness of the file system, so the traversal is handled by the CLI that issues a sequence of open -> format -> close calls for each file it process. The traversal is run in parallel on the rayon thread pool of the CLI, but since the initial implementation of the |
917d057
to
efae848
Compare
✅ Deploy Preview for rometools ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
crates/rome_cli/src/service/mod.rs
Outdated
/// - the "write task" pulls outgoing messages from the "write channel" and | ||
/// writes them to the "write half" of the socket | ||
/// - the "read task" reads incoming messages from the "read half" of the | ||
/// socket, then looks up a request with an ID corresponding to the received | ||
/// message in the "pending requests" map. If a pending request is found, it's | ||
/// fullfilled with the content of the message that was just received |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find really difficult to understand the documentation with these "write half", "write channel", etc. Probably because I lack some visual content and there's no context. Maybe it's best to be more precise or explain these terms and give context.
Unfortunately I am not familiar with the architecture and maybe it's best to be thorough, especially in case a person that is not familiar with these concepts will have to apply changes/fixes.
crates/rome_cli/src/service/mod.rs
Outdated
/// fullfilled with the content of the message that was just received | ||
/// | ||
/// In addition to these, a new "foreground task" is created for each request. | ||
/// These tasks create a "oneshot channel" and store it in the pending requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// These tasks create a "oneshot channel" and store it in the pending requests | |
/// Each foreground task creates a "oneshot channel" and stores it in the pending requests |
crates/rome_cli/src/service/mod.rs
Outdated
{ | ||
let (socket_read, mut socket_write) = split(socket); | ||
let (write_send, mut write_recv) = channel(512); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's 512? Would it be possible to have int a const
? The name of the variable might help.
crates/rome_cli/src/service/mod.rs
Outdated
/// Implementation of [WorkspaceTransport] for types implementing [AsyncRead] | ||
/// and [AsyncWrite] | ||
/// | ||
/// This implementation makes use of two "background tasks": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we give some arbitrary names to these background tasks? Naming them might help us across the codebase, so we can mention them using this internal, arbitrary names.
crates/rome_cli/src/service/mod.rs
Outdated
|
||
let (read_send, read_recv) = mpsc::unbounded_channel(); | ||
let (write_send, mut write_recv) = mpsc::unbounded_channel::<Vec<_>>(); | ||
runtime.spawn(async move { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it's possible to move the two tasks into two separate functions? If not, that's fine. I just thought that having the two foreground tasks separated, might have helped us to document them.
crates/rome_cli/src/service/mod.rs
Outdated
let (message, response): (_, JsonRpcResponse) = match message { | ||
Ok(message) => message, | ||
Err(err) => { | ||
eprintln!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it OK to have a eprintln
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a serialization or I/O error we don't have enough information to report this error to a specific pending request (since we failed to deserialize the request ID in the first place), and since this a background task we can't easily access the Console
object to report the error there (specifically we'd have to wrap the console in a Mutex
to share it between this and the rest of the CLI code), so it seemed like the only option here is to print the error to stderr directly and exit the task
crates/rome_cli/src/service/mod.rs
Outdated
|
||
let length = length.context("incoming response from the remote workspace is missing the Content-Length header")?; | ||
if let Some((_, channel)) = pending_requests.remove(&response.id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does .remove
do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remove
method is used to take ownership of an entry in the hashmap, if the map contains an entry with the key response.id
then the entry is removed from the map and the method returns Some((key, value))
. Using remove
rather than get
is important for two reasons, first the values of the hashmap are "oneshot channels" and as the name indicates these can only be used once (the send
method takes self
by value) so we need to take ownership of the channel to consume it and fulfill the request. And more generally this also ensures that we don't keep old requests around in the map by removing the corresponding entry once it gets fulfilled.
d62b8c1
to
a1f491d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the branch locally. I tried to run the check
command using the --use-server
argument on the Rome classic code base, internals
folder.
The main
branch took 52 seconds to lint 5366 files.
This branch took 14 seconds to lint 5366 files.
Co-authored-by: Emanuele Stoppa <my.burning@gmail.com>
Summary
This PR improves the performance of the remote daemon transport, primarily by allowing both the client and server to process more requests at the same time. The changes include:
WorkspaceTransport
trait to make it aware of request IDs. This lets the transport layer send out multiple requests to the remote server at the same time and process the responses out of order (this makes use of thedashmap
crate to store the handles of pending requests in a thread-safe hashmap)OwnedReadHalf
andOwnedWriteHalf
types provided bytokio
on Unix, and manually implemented<Client|Server>ReadHalf
and<Client|Server>WriteHalf
types on Windows to support "true" parallel I/O operations on the transport socket, rather than relying on thesplit
function provided bytokio
(that relies on a form of spinlocking internally)rome/*
remote calls in the blocking thread pool of the server. This avoids starving the scheduler if a file takes longer to processAdditionally, I also modified the signature of the
supports_feature
method of the Workspace to return aResult
, since it may error as any other method if there's an issue with the remote connection. Finally, I added a timeout error to all remote calls as a last-resort measure, currently the timeout is fixed to 15 seconds.Test Plan
These are internal-only changes that should not be visible to end users, some of these are covered by the tests for the node.js JSON-RPC bindings but the OS-specific I/O code is still mostly untested.