Random timeouts in api calls to bigquery.googleapis.com #40
Labels
api: bigquery
priority: p2
type: bug
After updating google-cloud-bigquery from version 1.19.0 to 1.24.0,
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='bigquery.googleapis.com', port=443): Read timed out. (read timeout=11.0)
started to pop up randomly. This is likely related to #34.
Environment details
OS: Ubuntu 19.10 x64
Python version: 3.7.5
pip version: 20.0.2
google-cloud-bigquery version: 1.24.0
Steps to reproduce
The error is not deterministic; however, we observed it both in production environments (compute instances in GCP) and in local development environments.
For about 100 queries, at least 1-2 fail with this error (which makes it reproducible). That is roughly 1-2% of ALL requests failing!
If I understand correctly, the BQ API endpoint behind the result() method blocks for at most 10 seconds. There is also a 1-second margin to absorb network lag etc. No retry mechanism covers the timeout, so a delay of more than 1 second makes the whole request fail.
In my opinion, having a non-configurable 1.0 s margin is not safe. Also, non-mutating endpoints (like "is job finished") should automatically retry on timeout. This is not implemented at the moment.
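To illustrate the kind of retry I mean for non-mutating calls, here is a minimal generic sketch. It is not the library's mechanism: `call_with_retry` and its parameters are hypothetical names made up for this example, and it is only appropriate for idempotent, read-only calls.

```python
import time


def call_with_retry(call, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call() and retry it when it raises one of the retry_on exceptions.

    Hypothetical helper, only meant for idempotent calls
    (for example polling whether a job has finished).
    """
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

As a client-side workaround, something like `call_with_retry(lambda: job.result(), requests.exceptions.ReadTimeout)` could wrap the failing call, assuming that calling result() again after a read timeout is safe for the job in question.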
We had to roll back to 1.19.0 to make everything stable again.
Code example
Nothing really helpful could be placed here.
The easiest way to reproduce this error is to run a query that takes MORE than 10 s in a loop of 100 iterations.
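A rough sketch of such a loop, in case it helps. The helper `count_failures` and the placeholder `SLOW_SQL` are hypothetical names for this example; the BigQuery part is left in comments because it needs credentials and a query that really runs longer than 10 s.

```python
def count_failures(run_once, attempts=100, failure_type=Exception):
    """Call run_once() `attempts` times and return (failures, failure_rate)."""
    failures = 0
    for _ in range(attempts):
        try:
            run_once()
        except failure_type:
            failures += 1  # count only the exception type we are looking for
    return failures, failures / attempts


# Hypothetical usage against BigQuery (SLOW_SQL is a placeholder for any
# query that takes more than 10 s):
#
#   import requests
#   from google.cloud import bigquery
#
#   client = bigquery.Client()
#   failures, rate = count_failures(
#       lambda: client.query(SLOW_SQL).result(),
#       attempts=100,
#       failure_type=requests.exceptions.ReadTimeout,
#   )
#   print(failures, rate)
```

Counting only `ReadTimeout` keeps unrelated errors (bad SQL, auth problems) from being mistaken for the timeout described above; those still propagate and stop the loop.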
Stack trace
The stack trace is always the same: