gRPC response sender is not thread safe #349
Labels
kind/bug
Categorizes issue or PR as related to a bug.
kind/flake
Categorizes issue or PR as related to a flaky test
Description
There's a bug in the interaction between the `GrpcToLogStreamGateway` and the `GrpcResponseWriter`. The gateway runs every call in a single-threaded executor. This means that any external call, such as `GrpcToLogStreamGateway#responseCallback`, is made from the engine thread context, but the actual logic is then executed in the gateway's own thread. This frees the engine to do other things, such as read and process records, and potentially try to respond to more commands.

However, the actual mapping of the response is done in the gateway thread context. When mapping the response, the gateway accesses the `GrpcResponseWriter`'s internal buffer view to get the response record. This is not a thread-safe object, and thus it may have changed between the time the gateway received the call to respond and the time it actually maps the response.

This can cause responses to become corrupt and not be sent out; instead, an exception is thrown by the response mapper in the `GrpcToLogStreamGateway`. Unfortunately, the executor swallows this exception, so from our point of view "nothing" happens and no response is sent out.
Expected behaviour
Responses are properly mapped and sent out, and errors are displayed when they occur.
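
As a rough illustration of that behaviour, the sketch below copies the response bytes on the calling (engine) thread before handing the work to the single-threaded executor, and inspects the returned `Future` so that a mapping failure is reported rather than silently dropped. It is not the project's actual code: `ResponseHandoffSketch`, `SHARED_BUFFER`, `write`, `mapResponse`, `respond`, and `awaitOrReport` are all hypothetical stand-ins, and a plain `byte[]` takes the place of the writer's internal buffer view.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ResponseHandoffSketch {
  // Hypothetical stand-in for the writer's reusable, non-thread-safe buffer.
  private static final byte[] SHARED_BUFFER = new byte[16];

  // Simulates the engine thread writing a new response into the shared buffer.
  static void write(String value) {
    byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
    System.arraycopy(bytes, 0, SHARED_BUFFER, 0, bytes.length);
  }

  // Hypothetical mapper: throws on an unusable record, like the real one may.
  static String mapResponse(byte[] data) {
    if (data.length == 0) {
      throw new IllegalStateException("empty response record");
    }
    return new String(data, StandardCharsets.UTF_8);
  }

  static Future<String> respond(ExecutorService gatewayExecutor, int length) {
    // Copy on the caller's thread, BEFORE the hand-off, so later writes to
    // the shared buffer cannot change what the gateway thread maps.
    final byte[] snapshot = Arrays.copyOf(SHARED_BUFFER, length);
    Callable<String> task = () -> mapResponse(snapshot);
    return gatewayExecutor.submit(task);
  }

  // An exception thrown inside a submitted task is stored in its Future;
  // it only becomes visible if someone calls get() and handles the failure.
  static String awaitOrReport(Future<String> pending) throws InterruptedException {
    try {
      return pending.get();
    } catch (ExecutionException e) {
      return "mapping failed: " + e.getCause().getMessage();
    }
  }

  static String[] demo() throws InterruptedException {
    ExecutorService gatewayExecutor = Executors.newSingleThreadExecutor();
    try {
      write("first");
      Future<String> ok = respond(gatewayExecutor, 5);
      write("other"); // the engine reuses the buffer for its next response
      Future<String> failed = respond(gatewayExecutor, 0);
      return new String[] {awaitOrReport(ok), awaitOrReport(failed)};
    } finally {
      gatewayExecutor.shutdown();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    for (String line : demo()) {
      System.out.println(line);
    }
  }
}
```

Copying costs an allocation per response, but it removes the dependency on when the gateway thread gets around to mapping. Whether a copy, a hand-off of ownership, or mapping on the engine thread is the right fix here is a design choice this sketch does not settle.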
Reproduction steps
You can run the `WorkerTest` for the TestContainers extension in a loop until failure. It typically takes ~50 runs to occur, but can take up to ~500, YMMV.
Environment