Reads don't complete with parallelism #2215
Comments
I also tried using readOnly: true on a transaction; that didn't seem to help either. I also tried pagination. No faster.
You may want to try stream(). You should receive documents as they arrive, thereby avoiding the delay. Please let us know if you experience a performance improvement by doing this.
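For reference, a minimal sketch of consuming a query with stream() instead of a single get(); the collection name 'myCollection' is a placeholder:

```ts
import { Firestore, QueryDocumentSnapshot } from '@google-cloud/firestore';

const firestore = new Firestore();

async function readWithStream(): Promise<number> {
  let count = 0;
  return new Promise((resolve, reject) => {
    firestore
      .collection('myCollection') // placeholder collection name
      .stream()
      .on('data', (doc: QueryDocumentSnapshot) => {
        // Each document is delivered as soon as it arrives, instead of
        // buffering the entire result set in memory as get() does.
        count++;
      })
      .on('end', () => resolve(count))
      .on('error', reject);
  });
}
```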
@michaelAtCoalesce Are the 4 requests for the same query as the single request? If so, you are requesting 4 times as much data and your network might be the bottleneck. In fact, the linear scaling likely indicates that your bandwidth is saturated. Firestore might also be the bottleneck, in which case you should also understand how it scales traffic. Firestore will dynamically add more capacity as required, but this takes time. https://firebase.google.com/docs/firestore/best-practices#ramping_up_traffic
Yes, it's the same query. It's not that much data (20 megabytes), so I don't think it's a matter of bandwidth. I'm on very fast gigabit internet on a beefy machine; I think it's something Firestore backend related. Is there something potentially going on with how reads occur in the backend? Doesn't Firestore do some kind of optimistic locking on reads that might cause this kind of behavior when multiple readers of a collection are executing? In this case, I'd be okay with an older snapshot of the data or one from a cache, as long as it was consistent. Is there a way to do that? I tried a read-only transaction and it didn't appear to help performance either.
@michaelAtCoalesce I just noticed that you might be running against the Firestore emulator. Is that the case?
No, live Firestore |
Your queries won't lock anything, whether in read-only transactions or outside of transactions. Optimistic concurrency doesn't use locks at all; this is what some of the other Firestore SDKs use. This SDK only uses locks within a transaction, and even much of that has been optimized away. From what I understand, you are not using any transactions? Since your test doesn't use transactions, locks should not be a concern.
There is an optimization where you specify a read time. By doing so, Firestore can serve data from the closest replica. See: https://firebase.google.com/docs/firestore/understand-reads-writes-scale#stale_reads
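A minimal sketch of such a stale read via a read-only transaction pinned to a past read time; the 15-second staleness window and the collection name are illustrative choices, not recommendations:

```ts
import { Firestore, Timestamp } from '@google-cloud/firestore';

const firestore = new Firestore();

async function staleRead(): Promise<number> {
  // Pin the read to ~15 seconds in the past so Firestore may serve the
  // query from the closest replica rather than the leader.
  const readTime = Timestamp.fromMillis(Date.now() - 15_000);

  return firestore.runTransaction(
    async (txn) => {
      const snapshot = await txn.get(firestore.collection('myCollection'));
      return snapshot.size;
    },
    { readOnly: true, readTime }
  );
}
```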
Update: I did another test with two separate processes, submitting one parallel request through each. They appear to complete in parallel just fine, so something specific to executing parallel reads in a single Node process is causing this. I'm also noticing that for a ~20 MB payload, memory usage goes up by about 800 MB (this is with the preferRest option). The problem may be related to how quickly memory usage grows in this test case. It might be worth investigating why a single get() call for 20 MB worth of data causes an ~800 MB spike in memory.
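For reference, a minimal sketch of how the preferRest setting mentioned above is enabled; the project ID is a placeholder:

```ts
import { Firestore } from '@google-cloud/firestore';

// preferRest tells the SDK to use the REST transport instead of gRPC
// where possible (streaming operations still fall back to gRPC).
const firestore = new Firestore({
  projectId: 'my-project', // placeholder
  preferRest: true,
});
```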
I have a collection with ~2000 documents, each ~10–20 KB, so roughly 20–40 MB in total.
When I submit a single get() of all documents in this collection via the Node.js SDK, I get a great response time: 11.54 seconds.
However, when I submit 4 at nearly the same time so that they run in parallel, the reads finish barely faster than if I had submitted them sequentially: ~44 seconds.
I would expect the concurrent get() case to be slightly slower, but not this linearly increasing behavior.
@google-cloud/firestore version: 7.10.0

Steps to reproduce
Create a Firestore collection with a sufficient number of sufficiently large documents. Execute a get() on that collection and see that it completes fine as a single request. Then submit multiple concurrent get() read operations and notice that the total time scales almost linearly.
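A minimal sketch of that reproduction with the Node.js SDK; the concurrency level of 4 matches the report, while the collection name is a placeholder:

```ts
import { Firestore } from '@google-cloud/firestore';

const firestore = new Firestore();
const collection = firestore.collection('myCollection'); // placeholder

// Time a single get() of the whole collection and log the result.
async function timedGet(label: string): Promise<void> {
  const start = Date.now();
  const snapshot = await collection.get();
  console.log(`${label}: ${snapshot.size} docs in ${Date.now() - start} ms`);
}

async function main() {
  // Baseline: one get() on its own.
  await timedGet('single');

  // Repro: 4 concurrent get() calls of the same query. Per the report,
  // total time scales almost linearly with the number of requests.
  await Promise.all([1, 2, 3, 4].map((i) => timedGet(`parallel-${i}`)));
}

main().catch(console.error);
```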