Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RST_STREAM error from grpc via Bidi in Firestore Client Library #9890

Closed
crwilcox opened this issue Nov 25, 2019 · 9 comments · Fixed by #9995
Closed

RST_STREAM error from grpc via Bidi in Firestore Client Library #9890

crwilcox opened this issue Nov 25, 2019 · 9 comments · Fixed by #9995
Assignees
Labels
api: firestore Issues related to the Firestore API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@crwilcox
Copy link
Contributor

Implementing a watch/listen in Firestore, waiting a while (some reports of 30, 60, and 120 minutes) results in an unhandled RST_STREAM error. We should investigate if this can be handled.

Related to firebase/firebase-admin-python#282

google-api-core==1.14.3
google-auth==1.7.1
google-cloud-core==1.0.3
google-cloud-firestore==1.6.0
googleapis-common-protos==1.6.0
grpcio==1.25.0

Output and repro code at: https://gist.github.com/crwilcox/0ed944c7afa70a6a78dd8e8e76ebf1fd

@crwilcox crwilcox added the api: firestore Issues related to the Firestore API. label Nov 25, 2019
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Nov 26, 2019
@crwilcox crwilcox self-assigned this Nov 27, 2019
@crwilcox crwilcox added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed triage me I really want to be triaged. labels Nov 27, 2019
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Dec 2, 2019
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Dec 12, 2019
@crwilcox
Copy link
Contributor Author

In Go we retry on far more error codes,
https://github.com/googleapis/google-cloud-go/blob/be07a624a2e4501df135ab7f731b7cfb408d9df2/firestore/watch.go#L507

codes.Unknown, codes.DeadlineExceeded, codes.ResourceExhausted, codes.Internal, codes.Unavailable, codes.Unauthenticated

@linevych
Copy link

Does retry mean that documents will be re-downloaded/do I get charged for it?

@crwilcox
Copy link
Contributor Author

In this case retry/recover has more to do with the channel stopping due to exceeding allowed lifetime. This time seems to be at around an hour.

While I am not certain of this, I think, due to not knowing how long it took to connect and what might have happened between the disconnect/reconnect, we do need to resynchronize the local model so I think we will have to redownload the references to the document references under watch.

@linevych
Copy link

I get it. But this doesn't resolve an actual problem with Firestore. Running a thread that will check Watch status and restart it when necessary is not a big deal. Paying the bills for tons of requests that are made because of a bug in the Firestore and additional workload on the own server is.

@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Dec 25, 2019
@crwilcox
Copy link
Contributor Author

@linevych the plan isn't to restart the stream so in theory this should be minimum amount of effort needed. I ran it for a couple of days and it didn't need to restream all the data, just notify of no change. So this seems to be what you would desire. I was being cautious to state such a thing without being sure.

Currently the fix in #9995 should resolve this.

@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Jan 1, 2020
@crwilcox
Copy link
Contributor Author

crwilcox commented Jan 2, 2020

https://pypi.org/project/google-cloud-firestore/1.6.1/ is released and contains the change to widen retries.

@linevych
Copy link

@crwilcox hi, looks like a restart of the stream does happen, here are the logs and example source code for it: https://gist.github.com/linevych/3480c133d6a839f2b04ee2dd4caa8a93

As you can see from the graph we are still getting billed for this.

https://i.imgur.com/9i4LNrW.png

@crwilcox
Copy link
Contributor Author

cc:
@schmidt-sebastian

@schmidt-sebastian
Copy link

The Python library should use the resume token from the previous session when it re-establishes the stream. Any further document updates should be merged with the state of the Snapshot at the time of the resume token. If properly implemented, documents that don't change will not be re-read.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: firestore Issues related to the Firestore API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants