[object_store] Should object_store retry on broken pipe errors when calling put?
#5545
Comments
Could you provide a bit of context on how you're running into this issue? I wonder if you are multiplexing CPU-bound tasks on the same thread pool and thereby starving the IO tasks. Possibly something similar to #5366.
Perhaps you could file an issue in the hyper repo to get feedback on exposing this check? I'd be interested to know the circumstances in which this error might occur, and whether there is a reason it isn't currently exposed.
We're doing a lot of frequent writes, so that could be possible. It seems to happen pretty sporadically. (EDIT: I'll try to reproduce it with a more minimal example next week.) I can file an issue upstream to ask, thanks for the pointer.
I encountered this performing a
Yes, we do perform quite a few tasks in parallel, but we do quite a bit of profiling, and CPU-bound tasks shouldn't generally be starving the thread pool.
I don't have any evidence to believe that
Closed by #5609
Which part is this question about
object_store
Describe your question
We've been using object_store, and occasionally we see logs like this (some details omitted for brevity) when calling put():
This is despite having retries configured, so it was confusing to see it report 0 retries. Should this error be retried?
Additional context
This seems to be a similar issue to #5106.
I've also done a bit of digging: this looks like a hyper error raised when there's a BodyWrite error.
It looks like object_store checks for hyper errors here, but the BodyWrite error isn't checked. Unfortunately, I don't think hyper exposes a public interface for this check at the moment.
For now, we manually wrap put with our own retry and check the error's Display implementation, but that's pretty jury-rigged.
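For anyone hitting the same problem, the workaround described above (wrapping put in a retry and inspecting the error's Display output) can be sketched roughly as follows. This is a simplified, synchronous illustration using only the standard library, not our actual code and not anything object_store provides: `retry_with` and `looks_like_broken_pipe` are hypothetical helpers, and the "broken pipe" substring is an assumption about how the error happens to render. A real version would be async and would wrap the put future instead of a plain closure.

```rust
use std::fmt::Display;
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical check: treat an error as retryable when its Display text
/// mentions a broken pipe. The substring is an assumption about the message,
/// not a stable interface, which is why this approach is fragile.
fn looks_like_broken_pipe<E: Display>(err: &E) -> bool {
    err.to_string().to_lowercase().contains("broken pipe")
}

/// Hypothetical helper: runs `op` at least once and at most `max_attempts`
/// times. It retries only when `is_retryable` returns true for the error,
/// doubling the delay after each failed attempt.
fn retry_with<T, E, F, P>(
    max_attempts: u32,
    initial_backoff: Duration,
    mut op: F,
    is_retryable: P,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
    P: Fn(&E) -> bool,
{
    let mut backoff = initial_backoff;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only while attempts remain and the error looks transient.
            Err(err) if attempt < max_attempts && is_retryable(&err) => {
                sleep(backoff);
                backoff *= 2;
                attempt += 1;
            }
            // Out of attempts, or an error we do not recognise: give up.
            Err(err) => return Err(err),
        }
    }
}
```

Passing the predicate as a separate argument keeps the string matching in one place, so it can be swapped for a typed check if hyper ever exposes one.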