RUST-1712 Latency Sees Unbounded Growth When Doing Any Amount Of Concurrent Writes #913
Some more interesting information: whenever I update the logic in the thread so that each thread establishes its own client, the individual write performance is much faster and we don't experience the unbounded growth in latency. Though of course the overall performance is slower since it creates a whole client for each thread. But if it were a simple issue of an internal database/collection lock, then whether it's one client or 100 shouldn't matter. Do y'all have any idea of what this could mean?

Here's the new thread snippet where the client is created inside the thread:

```rust
let handle: tokio::task::JoinHandle<()> = tokio::spawn(async move {
    let db = {
        let client_options = mongodb::options::ClientOptions::parse(uri).await.unwrap();
        let client = mongodb::Client::with_options(client_options).unwrap();
        client.database(db_name)
    };
    let collection: mongodb::Collection<EventDocument> =
        db.collection::<EventDocument>("events");
    collection.list_index_names().await.unwrap();
    let instant = tokio::time::Instant::now();
    let event = EventDocument {
        id: uuid::Uuid::new_v4(),
        owner_id: format!("event-{}", i),
    };
    collection.insert_one(event, None).await.unwrap();
    println!("inserted {} in {:?}", i, instant.elapsed());
});
```

It's important to note that adding a random sleep in the thread doesn't have the same effect of saving the individual write performance:

```rust
let random_sleep = rand::random::<u64>() % 1000;
tokio::time::sleep(tokio::time::Duration::from_millis(random_sleep)).await;
collection.insert_one(event, None).await.unwrap();
```

The above code results in the same unbounded growth of latency.
@abr-egn Any insights?
I haven't had time to look into this in any detail, sorry - hoping to get to it next week :) At first glance, I wonder if the use of
@abr-egn I can try that out! Why would
I tested it with:

```rust
use futures::StreamExt;
use serde;

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq)]
struct EventDocument {
    #[serde(
        rename = "_id",
        with = "mongodb::bson::serde_helpers::uuid_1_as_binary"
    )] // rename for it to be picked up by the _id index
    pub id: uuid::Uuid, // UUID
    pub owner_id: String,
}

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt::init();
    let db_name = "slow_test";
    let uri = "<uri>";
    let db = {
        let client_options = mongodb::options::ClientOptions::parse(uri).await.unwrap();
        let client = mongodb::Client::with_options(client_options).unwrap();
        client.database(db_name)
    };
    let collection = db.collection::<EventDocument>("events");
    collection.list_index_names().await.unwrap();
    let total_count = 100;
    let total_instant = tokio::time::Instant::now();
    // let mut handles = vec![];
    let futures = futures::stream::FuturesUnordered::new();
    for i in 0..total_count {
        let collection = collection.clone();
        let handle: tokio::task::JoinHandle<()> = tokio::spawn(async move {
            let instant = tokio::time::Instant::now();
            let event = EventDocument {
                id: uuid::Uuid::new_v4(),
                owner_id: format!("event-{}", i),
            };
            collection.insert_one(event, None).await.unwrap();
            println!("inserted {} in {:?}", i, instant.elapsed());
        });
        // handles.push(handle);
        futures.push(handle);
    }
    let results: Vec<_> = futures.collect().await;
    println!("total time {:?}", total_instant.elapsed());
}
```

With the following results:
Thank you for testing! I didn't have a concrete cause to suspect, but there have been timing surprises with that before. Also, I'd like to say that I really appreciate the thorough bug report with the minimized repro case!
I'm happy to have tested it out - there are always small things that could trip you up. And I'm always happy to provide as much info as possible! I know that the driver and the database are both quite complex, so y'all need as much info as possible to figure out any issues.
JIRA ticket: https://jira.mongodb.org/browse/RUST-1712
FYI, I've been able to reproduce this on a local test instance, so it's nothing about your particular server config. I'm actively investigating the root cause; I suspect accidental serialization in the driver's connection pool handling.
@abr-egn Thanks! Let me know if there's anything you need from me or anything I can do to help.
Okay, this is a complicated situation :)
The behavior this adds up to is that all of the operations get added to the queue at once; the first two get new connections spun up, while the rest have to wait for the connection process to finish. Then the next two get spun up, and so on. This causes the linearly increasing latency that you've observed. Mitigations for this:
With those two in place, you should be able to set
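To make the pool tuning concrete, here is a minimal sketch of adjusting pool sizing via ClientOptions. min_pool_size and max_pool_size are existing 2.x options; the values are placeholders of my own choosing, not the exact settings referenced above.

```rust
use mongodb::{options::ClientOptions, Client};

async fn build_client(uri: &str) -> mongodb::error::Result<Client> {
    let mut options = ClientOptions::parse(uri).await?;
    // Keep a floor of ready connections so a burst of writes doesn't have to
    // wait on connection establishment (placeholder value).
    options.min_pool_size = Some(100);
    // Allow the pool to grow at least as large as the expected write concurrency.
    options.max_pool_size = Some(200);
    Client::with_options(options)
}
```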
@abr-egn Thanks for getting back to me! I had also experienced this issue in non-cold-start environments, which is why I was concerned. I'm glad there's a solution so we can deal with write spikes more smoothly - it's fine if it takes more network and memory resources. So before doing the writes I had the command run first. I can make a PR so that that limit is user-configurable.
I tried something very similar - the problem here is that it's basically racing the network latency against the command time, and it's very possible for the command to finish quickly enough that it doesn't actually fully warm up the pool. This is why I'd like to add an explicit method for that :)
I'm not surprised you weren't able to see clear results from tuning that parameter, though. If concurrency exceeds the actual number of connections available, it causes a bubble in the connection pipeline where every command in the queue has to wait for the connection spin-up, and you're back to linear latency increases. Seeing real improvements for your specific case needs all of a fully-warmed pool, a high
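Not something suggested above, but a related application-side workaround for that "bubble" is to cap client-side concurrency so it never exceeds the pool size. A sketch using futures::StreamExt::buffer_unordered, reusing the EventDocument type from the repro; the cap of 100 is an assumed placeholder matching max_pool_size:

```rust
use futures::{stream, StreamExt};

async fn insert_all(collection: mongodb::Collection<EventDocument>, events: Vec<EventDocument>) {
    // Run at most 100 inserts at a time so every in-flight operation can hold
    // a pooled connection instead of queueing behind connection establishment.
    stream::iter(events)
        .map(|event| {
            let collection = collection.clone();
            async move { collection.insert_one(event, None).await }
        })
        .buffer_unordered(100)
        .for_each(|result| async {
            if let Err(e) = result {
                eprintln!("insert failed: {e}");
            }
        })
        .await;
}
```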
It'd be excellent if you could put together that PR! And your understanding is confirmed - the limiting machinery is already in place; this would be replacing the constant limit it uses with a user-configured one.
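For reference, a hypothetical sketch of what that user-configured limit might look like once the PR lands. I'm assuming it surfaces as a max_connecting field on ClientOptions, so treat the name and value as placeholders rather than the released API.

```rust
use mongodb::{options::ClientOptions, Client};

async fn client_with_faster_pool_growth(uri: &str) -> mongodb::error::Result<Client> {
    let mut options = ClientOptions::parse(uri).await?;
    // Allow more than the default two connections to be established
    // concurrently, so a cold pool catches up with a write spike sooner.
    options.max_connecting = Some(10); // assumed option name from the proposed PR
    Client::with_options(options)
}
```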
@abr-egn I've put up #923; it's a rough PR, but hopefully it achieves what we need. Also, this may be a bit in the weeds, but I am curious about the pool warming (to make sure that we actually have the connections established up front).
Also @abr-egn, if your bandwidth is tight right now I can also try to implement the second part to actively spin up connections. I'm unsure of the complexity involved, though - let me know!
I hope to have that put together today or Monday :) I suspect that for someone less familiar with the driver internals it would be a much slower project. Thank you, though!
@abr-egn Any updates on the pool warming work you mentioned?
I have a PR in progress; it's blocked behind #920 (that does some foundational work needed for this). I was OOO for the last couple of days, but I'm hoping to get that one merged this week.
Thanks! And I hope you enjoyed your time off!
Sorry this has taken substantially longer than I first estimated! I've got good news and bad news. The good news is that I have an imminent PR that adds the prewarming method, and it almost entirely eliminates the linear latency effect from the driver. An instrumented run (very similar to yours, with some added internal tracking) without the pre-warm:
The first set of inserts (up through 13) are essentially re-using the same connection; these are interleaved with tasks being spawned for new connection spin-up, but some of the inserts finish fast enough that more can be executed on the same connection before any new connections finish establishing. After that, it's a mixture of new and re-used connections, with the overall result being the linear execution time. Now, with the pre-warming added:
Connections being retrieved from the connection pool take effectively no time, and overall latency is determined entirely by time to execute (I've validated that the time there is spent waiting for a reply from the server, not doing anything in the driver). However:
On the latter point, I strongly suspect this is running into pessimal behavior in MongoDB's locking. It's likely you'd see improvements for the specific case of writing several documents into the same collection concurrently by buffering them locally and calling a single batched insert.
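For illustration, a sketch of that buffering approach, assuming the batch call being referred to is Collection::insert_many and reusing the EventDocument type from the repro above:

```rust
// Collect the documents first, then write them in one round trip instead of
// issuing one insert_one per task.
async fn insert_batch(
    collection: &mongodb::Collection<EventDocument>,
    events: Vec<EventDocument>,
) -> mongodb::error::Result<()> {
    collection.insert_many(events, None).await?;
    Ok(())
}
```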
@abr-egn Thank you for your work on this! To address the points made in your comment:
Yeah, just the time cost of running the warm-up loop. There's no ongoing overhead to having run the prewarm; it's implemented as a task that requests fresh connections in a loop until the pool hits the configured minimum size.

If you want to make sure the connection pool always stays warmed up for a long-running service, you'll need to run this both at service startup and when topology changes cause pools to be cleared. The warmup task doesn't block normal operations, so after a pool clear you'll see a temporary bump in latency as connections get re-established or operations wait for in-progress connections to finish establishing. Turning up
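As an illustration of that startup pattern, here is a sketch assuming the prewarming method shipped as Client::warm_connection_pool() and that min_pool_size is set; the method name is taken from my reading of the 2.7 changes, so double-check it against the released API.

```rust
use mongodb::{options::ClientOptions, Client};

async fn connect_warm(uri: &str) -> mongodb::error::Result<Client> {
    let mut options = ClientOptions::parse(uri).await?;
    // The warm-up loop fills the pool up to this floor (placeholder value).
    options.min_pool_size = Some(100);
    let client = Client::with_options(options)?;
    // Run once at service startup; per the discussion above, re-run it if a
    // topology change clears the pools.
    client.warm_connection_pool().await;
    Ok(client)
}
```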
@abr-egn Do you have any updates on when I should expect the beta release (and the full version) to be cut? Thanks!
We'll be doing the beta release next week; timing of the full release after that depends on how it goes, of course, but we generally give at least a few weeks for feedback to happen.
Thanks!
We just put out release 2.7.0-beta, which includes these changes.
@abr-egn Thanks!
Versions/Environment
- Driver and dependency versions (cargo pkgid mongodb & cargo pkgid bson): https://github.com/rust-lang/crates.io-index#mongodb@2.6.0 and https://github.com/rust-lang/crates.io-index#bson@2.6.1
- Server version (db.version()): 6.0.8

Describe the bug
When writing concurrently to a MongoDB collection, the full time until execution continues after the insert has completed grows linearly without bound. So if you do 10 concurrent writes, whichever one is the first to go will be done in ~60ms, but the slowest will take about 480ms. This slowness grows linearly: with 100 concurrent writes the slowest one takes more than 1 second.
This is despite the tracing logs seemingly saying that the command continues to succeed in the appropriate amount of time:
What I would expect is that running 1 write operation and running 2 concurrently would not be meaningfully different. While I'd expect latency to get worse with a significant number of concurrent writes, I'd expect that to occur more gradually and under more significant load, and I'd expect the time it takes to do the write to be about the same as what the "command succeeded" tracing logs report.
We've had this issue for a while but only recently took the time to isolate it. I've posted a Rust program that replicates the issue, with the relevant details below.
Things I've tested out:
Based on the command log, the driver appears to be getting results back in the time that I'd expect, but that's possibly a red herring. Apart from some lock-related issue, I'm not sure what could be causing this. Do y'all have any idea of what the issue could be?
To Reproduce
Use the following program to reproduce the issue and run it with the following command:
main.rs
Cargo.toml: