Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BulkTransferException while using bazel-remote cache #15596

Closed
shubhindia opened this issue May 30, 2022 · 3 comments
Closed

BulkTransferException while using bazel-remote cache #15596

shubhindia opened this issue May 30, 2022 · 3 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@shubhindia
Copy link

Description of the bug:

We are using bazel as a build system for our iOS apps and bazel-remote as our remote cache. We are seeing some BulkTransferExceptions. I have attached the build log below.
Bazel-version: v5.0.0
Bazel-remote version: v2.3.7

WARNING: Remote Cache: 500 Internal Server Error read tcp 172.XX.XX.XX:9092->172.XX.XX.XX:59844: i/o timeout com.google.devtools.build.lib.remote.common.BulkTransferException: 500 Internal Server Error read tcp 172.XX.XX.XX:9092->172.XX.XX.XX:59844: i/o timeout at com.google.devtools.build.lib.remote.util.RxUtils$BulkTransferExceptionCollector.onResult(RxUtils.java:91) at io.reactivex.rxjava3.internal.operators.flowable.FlowableCollectSingle$CollectSubscriber.onNext(FlowableCollectSingle.java:94) at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.innerSuccess(FlowableFlatMapSingle.java:173) at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber$InnerObserver.onSuccess(FlowableFlatMapSingle.java:342) at io.reactivex.rxjava3.internal.operators.single.SingleDoFinally$DoFinallyObserver.onSuccess(SingleDoFinally.java:73) at io.reactivex.rxjava3.internal.observers.ResumeSingleObserver.onSuccess(ResumeSingleObserver.java:46) at io.reactivex.rxjava3.internal.operators.single.SingleJust.subscribeActual(SingleJust.java:30) at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855) at io.reactivex.rxjava3.internal.operators.single.SingleResumeNext$ResumeMainSingleObserver.onError(SingleResumeNext.java:80) at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.tryOnError(CompletableCreate.java:91) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.onError(CompletableCreate.java:77) at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102) at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1074) at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213) at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983) at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771) at com.google.devtools.build.lib.remote.util.RxFutures$CompletableFuture.setException(RxFutures.java:276) at com.google.devtools.build.lib.remote.util.RxFutures$1.onError(RxFutures.java:210) at io.reactivex.rxjava3.internal.operators.completable.CompletableFromSingle$CompletableFromSingleObserver.onError(CompletableFromSingle.java:41) at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.tryOnError(SingleCreate.java:95) at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onError(SingleCreate.java:81) at com.google.devtools.build.lib.remote.util.AsyncTaskCache$1.onError(AsyncTaskCache.java:306) at com.google.devtools.build.lib.remote.util.AsyncTaskCache$Execution.onError(AsyncTaskCache.java:197) at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.tryOnError(CompletableCreate.java:91) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.onError(CompletableCreate.java:77) at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102) at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1066) at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213) at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983) at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771) at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:53) at com.google.devtools.build.lib.remote.http.HttpCacheClient.lambda$uploadAsync$10(HttpCacheClient.java:631) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:109) at io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89) at com.google.devtools.build.lib.remote.http.HttpUploadHandler.channelRead0(HttpUploadHandler.java:81) at com.google.devtools.build.lib.remote.http.HttpUploadHandler.channelRead0(HttpUploadHandler.java:40) at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Unknown Source) Suppressed: com.google.devtools.build.lib.remote.http.HttpException: 500 Internal Server Error read tcp 172.XX.XX.XX:9092->172.XX.XX.XX:59844: i/o timeout ... 32 more [5,171 / 5,177] Linking GojekHost_bin; 16s remote-cache, local WARNING: Remote Cache: 500 Internal Server Error read tcp 172.XX.XX.XX:9092->172.XX.XX.XX:59930: i/o timeout com.google.devtools.build.lib.remote.common.BulkTransferException: 500 Internal Server Error read tcp 172.XX.XX.XX:9092->172.XX.XX.XX:59930: i/o timeout at com.google.devtools.build.lib.remote.util.RxUtils$BulkTransferExceptionCollector.onResult(RxUtils.java:91) at io.reactivex.rxjava3.internal.operators.flowable.FlowableCollectSingle$CollectSubscriber.onNext(FlowableCollectSingle.java:94) at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber.innerSuccess(FlowableFlatMapSingle.java:173) at io.reactivex.rxjava3.internal.operators.flowable.FlowableFlatMapSingle$FlatMapSingleSubscriber$InnerObserver.onSuccess(FlowableFlatMapSingle.java:342) at io.reactivex.rxjava3.internal.operators.single.SingleDoFinally$DoFinallyObserver.onSuccess(SingleDoFinally.java:73) at io.reactivex.rxjava3.internal.observers.ResumeSingleObserver.onSuccess(ResumeSingleObserver.java:46) at io.reactivex.rxjava3.internal.operators.single.SingleJust.subscribeActual(SingleJust.java:30) at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855) at io.reactivex.rxjava3.internal.operators.single.SingleResumeNext$ResumeMainSingleObserver.onError(SingleResumeNext.java:80) at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.tryOnError(CompletableCreate.java:91) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.onError(CompletableCreate.java:77) at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102) at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1074) at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213) at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983) at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771) at com.google.devtools.build.lib.remote.util.RxFutures$CompletableFuture.setException(RxFutures.java:276) at com.google.devtools.build.lib.remote.util.RxFutures$1.onError(RxFutures.java:210) at io.reactivex.rxjava3.internal.operators.completable.CompletableFromSingle$CompletableFromSingleObserver.onError(CompletableFromSingle.java:41) at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.tryOnError(SingleCreate.java:95) at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onError(SingleCreate.java:81) at com.google.devtools.build.lib.remote.util.AsyncTaskCache$1.onError(AsyncTaskCache.java:306) at com.google.devtools.build.lib.remote.util.AsyncTaskCache$Execution.onError(AsyncTaskCache.java:197) at io.reactivex.rxjava3.internal.operators.completable.CompletableToSingle$ToSingle.onError(CompletableToSingle.java:73) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.tryOnError(CompletableCreate.java:91) at io.reactivex.rxjava3.internal.operators.completable.CompletableCreate$Emitter.onError(CompletableCreate.java:77) at com.google.devtools.build.lib.remote.util.RxFutures$OnceCompletableOnSubscribe$1.onFailure(RxFutures.java:102) at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1066) at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213) at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983) at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771) at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:53) at com.google.devtools.build.lib.remote.http.HttpCacheClient.lambda$uploadAsync$10(HttpCacheClient.java:631) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:109) at io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89) at com.google.devtools.build.lib.remote.http.HttpUploadHandler.channelRead0(HttpUploadHandler.java:81) at com.google.devtools.build.lib.remote.http.HttpUploadHandler.channelRead0(HttpUploadHandler.java:40) at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Unknown Source) Suppressed: com.google.devtools.build.lib.remote.http.HttpException: 500 Internal Server Error read tcp 172.XX.XX.XX:9092->172.XX.XX.XX:59930: i/o timeout ... 32 more

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Mac OS 12.2.1

What is the output of bazel info release?

INFO: Invocation ID: d4cb32a8-86e0-467b-b007-c06c932378ec release 5.0.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@sgowroji sgowroji added type: bug more data needed untriaged team-Remote-Exec Issues and PRs for the Execution (Remote) team labels May 31, 2022
@sgowroji
Copy link
Member

Hello @shubhindia, Could you please provide complete reproduction steps of the above Exception. Thanks!

@sanju-naik
Copy link

Hi @sgowroji,

We started seeing the above error when around 40 of our CI workers started using this single remote cache instance, it seems like maybe it's happening due to the load or when multiple workers are trying to access it at the same time? As it worked fine when we tested it with only one CI worker for a week or so. Apart from this, we don't have any other steps as such.

Another thing to note is along with this 500 Internal Server Error, we are seeing BulkTransferException error quite a lot.
Here are some links related BulkTransferException if that helps - #11473 , https://stackoverflow.com/questions/62456527/how-to-debug-remote-cache-write-failures , but we aren't using any of the flags mentioned in these links.

This is our configuration :

Bazel version - 5.0.0

Remote cache configuration:

# These two are the only required options:
dir: /Users/runner/builds/bazel_cache
max_size: 400

http_address: 0.0.0.0:9092
grpc_address: none

http_read_timeout: 60s
http_write_timeout: 60s

access_log_level: all

@meisterT meisterT added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Jul 14, 2022
@coeuvre
Copy link
Member

coeuvre commented Sep 7, 2023

The error was because Bazel cannot reach your server (due to 500 Internal Server Error) maybe because it was under high load.

@coeuvre coeuvre closed this as not planned Won't fix, can't repro, duplicate, stale Sep 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug
Projects
None yet
Development

No branches or pull requests

5 participants