Batch Request Documentation #2375
Comments
Please, I'm also struggling with this. I was looking at the code in the Calendar API, and there is no batch method or anything remotely related.
This appears to be out of date: https://developers.google.com/drive/api/v3/batch#node.js It's quite a run-around to figure out what's going on.
Judging by the number of stale issues regarding batching, and the demand for it, one can only assume they don't care and won't get to batching any time soon. I'll make an updated version of https://github.com/pradeep-mishra/google-batch as I need it for my project.
Ok y'all, thanks for your patience (well, some of you 😛). I will update the docs at some point after this, but to get folks unblocked...

**Why batch requests exist**

Batch requests are useful when you need to make a bunch of requests in a short period of time. Specifically, they let you open a single TCP/IP connection and make many requests over that single connection. This is especially helpful in environments like Node.js, where it's easy to accidentally open too many network connections at once. Batch requests still enforce quota rules, so you're not saving anything in the way of quota.

**The deprecation of global batch**

In a prior life, there was a global endpoint for all batch requests, letting you compose multiple requests to different services at the same time. It turns out this has real scale and data protection issues, so these global endpoints that supported multiple services at once were shut down.

**What types of batch exist today**

With global batch gone, the responsibility now falls to individual services. For example, the Gmail API supports its own version of batch. The key differences here are:
**Why it probably doesn't matter**

So, coming all the way back around to the original point: batch requests are good because they let you use a single TCP/IP connection and reduce the overhead of establishing multiple connections. Good news, everyone: this is exactly what HTTP/2 does. HTTP/2 works by multiplexing (n) requests over a single TCP/IP connection, reducing overhead when you need to make many HTTP requests in a short burst. The other good news is that, as far as I can tell, all Google APIs seem to support HTTP/2 natively :) And the bonus good news is that this library now supports it too!

As of now, I don't think there's a significant benefit to supporting the somewhat bespoke format for batch requests natively in the library, since HTTP/2 gives us the same advantages, as far as I can tell, and already exists. As long as you're making all of your requests within the same 500ms window, to the same host, it shouldn't be an issue.

I've been looking for some feedback on the HTTP/2 implementation. I would love it if anyone here would be willing to use the flag, do a little lightweight profiling, and let me know how this impacts your performance.
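For anyone who wants to try it, here is a minimal, hedged sketch of turning the flag on through the library's global options. The `http2: true` option name is taken from the HTTP/2 discussion referenced elsewhere in this thread (#1130); treat it and the Drive call below as illustrative, and check your installed googleapis version before relying on it.

```js
// Sketch only: enable the library's experimental HTTP/2 support so that
// many requests made in a short burst share one multiplexed connection.
// The `http2` flag is assumed from the HTTP/2 discussion in this thread;
// verify it against the googleapis version you have installed.
const {google} = require('googleapis');

google.options({http2: true});

async function listSomeFiles(auth) {
  const drive = google.drive({version: 'v3', auth});
  // Under HTTP/2, calls like this made in a tight loop should reuse a
  // single TCP/IP connection instead of opening one per request.
  const res = await drive.files.list({pageSize: 10, fields: 'files(id, name)'});
  return res.data.files;
}
```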
Maybe I'm mistaken, but the Drive API at least seems limited to about 10 req/sec even if I'm using HTTP/2 🤔 Would HTTP/2 even help avoid the req/sec spam protection they have? If not, batch support seems appropriate imho 😅
@asbjornenge can you share more about the requests-per-second limit you're hitting? Or the issues on spam protection? From what I can read here, there are specific response codes and descriptions that come back. Everything I've read says rate limits and quotas are the same regardless of batch vs. individual API calls, but if you're finding something different please let me know!
@JustinBeckwith Sure 😊 So I'm fetching a full folder tree that is about 222 folders in total (so a lot, but nowhere near my 1000 req/100sec/user limit), with a depth of 5 at the deepest. I have a recursive loop trying to fetch these folders with 222 "simultaneous" (not really, but) requests. As soon as I exceed 10 req/sec I hit the rate limit error. I read somewhere (that I cannot find now) that all Google APIs had a rate limit of 10 req/sec to avoid spamming. If we had batch support I could perform those queries in 3 requests (100+100+2) and should not have this issue. Current code with issues:
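As an illustration (not the poster's original code), a recursive listing of that kind, with a small concurrency cap and pacing between bursts to stay near 10 req/sec, might look roughly like this; the query and field names are assumptions:

```js
// Illustrative only: recursively list a Drive folder tree while capping
// concurrency, so bursts stay under roughly 10 requests per second.
async function listTree(drive, rootFolderId, maxConcurrent = 5) {
  const results = [];
  const queue = [rootFolderId];
  while (queue.length > 0) {
    // Take up to `maxConcurrent` folders off the queue and fetch them together.
    const batch = queue.splice(0, maxConcurrent);
    const responses = await Promise.all(
      batch.map(id =>
        drive.files.list({
          q: `'${id}' in parents and mimeType = 'application/vnd.google-apps.folder'`,
          fields: 'files(id, name)',
        })
      )
    );
    for (const res of responses) {
      for (const folder of res.data.files) {
        results.push(folder);
        queue.push(folder.id); // descend into subfolders on the next pass
      }
    }
    // Crude pacing between bursts; a token-bucket limiter would be nicer.
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
  return results;
}
```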
Hmm, this is unfortunate to hear. I may be doing something wrong with the way I'm approaching the issue, but I'm creating an app that may copy anywhere from 10 to 1000's of files depending on a template scheme, and I'm not seeing any good way of approaching this outside an individual request per item. Not only is this messy, but it seems to me to be a bad programming pattern. I'm also not sure if this is intentionally anti-consumer (can I even pay more to get a better rate limit for my company?), but the lack of any mass object labeling/uploading/copying inherently neuters any attempt at creating projects that want to manage directory structures in Google Drive. I shouldn't need to repeat the same API call with a different file number 1500 times to move files to a new folder, and this problem gets worse when you consider the rate limits and how conceptually slow they are. Considering not only the silly repetition of submitting practically the same request 1500 times, add in rate limiting and you're looking at minutes of runtime to copy folder structures that a local system would copy/move in seconds. At my scale, that's an amount of time I can work around, but I can't imagine trying to copy some file structure that's tens of thousands of files in size.

Further, I know at this point I should have found a rate-limiting library or created something myself, but it amazes me that a Google-supported module doesn't inherently have a way of bulk-modifying data that's compatible with its own rate limit. Don't get me wrong, I appreciate the support and the fact that this library wouldn't exist without support from the community, but I can't be the only one who sees fault with the way we're being forced to do things. I just can't help but feel like I'm getting a very thin wrapper of the basic HTTP API to begin with. It saddens me that there's no level of abstraction in the library (a query searching for folders returns a JSON object, instead of converting the response files into classes with "move" or "set___" options) that would inherently make this a significantly more powerful tool. At the moment, the Apps Script API is significantly more powerful in that way. I apologize if I come across as crass, but I suppose I was hoping for a drop-in replacement for Google Apps Script, but in Node, so I could organize/schedule it in my own fashion.
@asbjornenge Justin explains at the top of his post that batching unfortunately doesn't save any API requests: a batch of 10 is equivalent to ten individual requests.
Not to beat a dead horse, but it does confuse me a small amount why rate limiting shouldn't conceptually be reduced by batching: the number of requests to the server drops, and the payload size of multiple requests in one should still be significantly smaller than the overhead of processing the same set of headers and metadata 100 times, so I can't see the fact that batch requests don't reduce rate limiting being related to bandwidth. I suppose the actual I/O of the API calls could be a reason in its own right, but somehow I don't see that being the case: 10 copy operations on a 10 GB file are going to have a storage impact significantly larger than changing the parent flag on 100 files in 10 seconds. I'm probably way off on the reasoning here, just hoping to understand the logic.
So what's the consensus on this? Should we completely ditch request batching and make use of HTTP/2?
No, please implement batch. There are a handful of APIs that have optimizations around batch requests (Drive, mostly) that significantly impact write throughput, such as when adding multiple permissions to a file. Yes, it's still the same number of API requests, but they're handled differently when batched, in a way that allows for higher throughput than if they're made individually.
I also have a question:
Having hit the issue of batching Google APIs requests using Node.js myself, I decided to develop a solution that follows the guidelines provided in Google's batching documentation, implementing the multipart/mixed batch protocol described there. Please have a look and do not hesitate to raise an issue if you find a bug or would like to discuss improvements.
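For readers who just want to see the shape of such a request: Google's batching documentation describes POSTing a multipart/mixed body to a per-service batch endpoint, where each part is a serialized HTTP request. A rough, untested sketch against the Drive batch endpoint might look like the following (the file IDs and access token are placeholders, and error handling is omitted):

```js
// Untested sketch of a raw multipart/mixed batch request to the Drive batch
// endpoint, per Google's batching documentation. FILE_ID_* and ACCESS_TOKEN
// are placeholders; requires Node 18+ for the global fetch.
async function batchGetFiles() {
  const boundary = 'batch_example_boundary';
  const fileIds = ['FILE_ID_1', 'FILE_ID_2'];

  // Each part is a complete HTTP request serialized inside the multipart body.
  const parts = fileIds.map(
    id =>
      `--${boundary}\r\n` +
      'Content-Type: application/http\r\n\r\n' +
      `GET /drive/v3/files/${id}?fields=id,name HTTP/1.1\r\n\r\n`
  );
  const body = parts.join('') + `--${boundary}--`;

  const res = await fetch('https://www.googleapis.com/batch/drive/v3', {
    method: 'POST',
    headers: {
      'Content-Type': `multipart/mixed; boundary=${boundary}`,
      Authorization: 'Bearer ACCESS_TOKEN',
    },
    body,
  });
  // The response is itself multipart/mixed: one HTTP response per part.
  return res.text();
}
```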
## Description

This uses batch requests to delete Google permissions. As the official `googleapis` node library [doesn't support batch](googleapis/google-api-nodejs-client#2375), the package `@jrmdayn/googleapis-batcher` has been added, allowing batch requests through the `fetchImplementation` option provided by `googleapis`.

## Checklist:

Before you create this PR, confirm all the requirements listed below by checking the checkboxes like this (`[x]`).

- [x] I have performed a self-review of my own code.
- [ ] I have commented my code, particularly in hard-to-understand areas.
- [x] I have added tests to cover the new feature or fixes.
Is your feature request related to a problem? Please describe.
It has been an extremely frustrating process to find out how to perform batch requests (in fact, I still haven't found how) because documentation around performing batch requests is all out of date (including the page saying that the old form of batch requests will stop working and to refer to client documentation).
Due to rate limiting and the like, it would be hugely helpful to be able to batch requests like file updates when moving many files from one folder to another. I'm likely going to have to piece this together as a raw HTTP request instead of a Google API method, if that's even possible.
Describe the solution you'd like
It would be nice to have clear, concise instructions on batching requests using this library, or it should be clearly stated that it's not in the scope of this project. At the moment, basically every thread about batching includes information that no longer works or just dies off without resolution.
Describe alternatives you've considered
To be able to batch requests, I'm likely going to have to perform a direct HTTP request to the API for file updates, or institute a timeout and process requests one by one. Neither solution is optimal.
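As a sketch of the second alternative (a timeout between sequential requests), something like the following would work with the Drive client for the file-move case described above; the folder IDs, file list, and 200 ms delay are illustrative assumptions:

```js
// Untested sketch: move files between folders one by one with a small delay,
// i.e. the "timeout and process requests one by one" alternative above.
// fromFolderId / toFolderId and the delay value are placeholders.
async function moveFiles(drive, fileIds, fromFolderId, toFolderId) {
  for (const fileId of fileIds) {
    await drive.files.update({
      fileId,
      addParents: toFolderId,
      removeParents: fromFolderId,
      fields: 'id, parents',
    });
    // Simple pacing to stay under per-second rate limits.
    await new Promise(resolve => setTimeout(resolve, 200));
  }
}
```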
Additional context
#513
#1130 includes information regarding HTTP/2, but no examples of "tight loops" or how this helps batch requests.
#121 includes a reference to the deprecation of batch endpoints, but once more no concise documentation on how to accomplish this after the global endpoint's removal.
#740 once more suggests HTTP/2 is magically going to fix issues with batching, but suggests "per-api batch methods still exist".
#290, #338 include ideas/outdated information about batching.