
InProcess: client streaming memory leak when retry enabled #8712

Closed
systemfreund opened this issue Nov 18, 2021 · 9 comments · Fixed by #11406

systemfreund commented Nov 18, 2021

What version of gRPC-Java are you using?

1.42.0

What is your environment?

Linux 5.15.2
openjdk version "11.0.13" 2021-10-19

What did you expect to see?

No Java heap space exception.

What did you see instead?

java.lang.OutOfMemoryError: Java heap space

Steps to reproduce the bug

We've recently updated our grpc dependency from 1.39.0 to 1.42.0, which now causes an out-of-memory error in one of our tests.

I've attached a test project that can reproduce the problem.
grpcheap.zip

How to run:

mvn clean test -Dgrpc.version=1.39.0   # test runs successfully
mvn clean test -Dgrpc.version=1.42.0   # test fails with OOM

Please be aware that the test is run with limited heap size (see pom.xml): -Xmx64M

The general idea of the test is this:
The client side has an InputStream that produces ~200 MB of data. The stream is (lazily) split into ~1 MB chunks, which are then streamed to the server, roughly as in the sketch below.
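The Chunk and UploadReply messages, the upload stub method, and UploadServiceGrpc are placeholders for the generated classes; the real names are in the attached grpcheap.zip.

```java
import com.google.protobuf.ByteString;
import io.grpc.stub.StreamObserver;
import java.io.IOException;
import java.io.InputStream;

class ChunkUploader {
  // Chunk, UploadReply, and UploadServiceGrpc stand in for the generated
  // classes in the attached project.
  static void streamChunks(UploadServiceGrpc.UploadServiceStub stub, InputStream in)
      throws IOException {
    StreamObserver<Chunk> requests = stub.upload(new StreamObserver<UploadReply>() {
      @Override public void onNext(UploadReply reply) {}
      @Override public void onError(Throwable t) { t.printStackTrace(); }
      @Override public void onCompleted() {}
    });
    byte[] buf = new byte[1024 * 1024];        // ~1 MB per chunk
    for (int n; (n = in.read(buf)) != -1; ) {  // lazily drain the ~200 MB source
      requests.onNext(Chunk.newBuilder()
          .setData(ByteString.copyFrom(buf, 0, n))
          .build());
    }
    requests.onCompleted();
  }
}
```

With the in-process transport and a direct executor, each onNext() is handled synchronously, so each chunk should be collectable as soon as it has been sent.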

When I run the test with grpc > 1.40, I get an OOM after it has sent ~60 MB worth of chunks, which is obviously near the -Xmx limit. However, I would expect heap usage never to rise to that level, since each ~1 MB chunk can be GC'ed once it has been sent. I've also attached a VisualVM screenshot showing the heap graph of the test on grpc 1.39.0 vs. 1.42.0:

[screenshot: VisualVM heap graphs, grpc 1.39.0 vs. 1.42.0]

ejona86 (Member) commented Nov 22, 2021

Given that retries were enabled in 1.40, it seems likely that there's an interaction with retries retaining data. With that much data, the RetriableStream should commit itself. Maybe the real stream isn't catching up, so passThrough != true?

As a workaround, ManagedChannelBuilder.disableRetry() probably will "fix" this behavior.
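
A minimal sketch of the workaround (the target is a placeholder; only the disableRetry() call matters):

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Hypothetical target; disableRetry() turns off the retry buffering that
// has been on by default since 1.40.
ManagedChannel channel = ManagedChannelBuilder.forTarget("localhost:8080")
    .disableRetry()
    .build();
```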

dapengzhang0 (Member) commented:

Yes, since 1.40 retry is enabled by default, so for client-streaming RPCs the outbound messages are cached in memory (the retry buffer) until the call is committed, unless the buffer limit is exceeded. So you need to either call ManagedChannelBuilder.disableRetry() or set a buffer limit small enough for -Xmx64M with ManagedChannelBuilder.perRpcBufferLimit() or ManagedChannelBuilder.retryBufferSize().
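
For example (values are illustrative, chosen to fit under -Xmx64M; "grpcheap" is a placeholder channel name):

```java
import io.grpc.ManagedChannel;
import io.grpc.inprocess.InProcessChannelBuilder;

ManagedChannel channel = InProcessChannelBuilder.forName("grpcheap")
    .retryBufferSize(4L * 1024 * 1024)    // total retry buffer across all RPCs
    .perRpcBufferLimit(1L * 1024 * 1024)  // per-RPC cap; exceeding it commits the call
    .build();
```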

For your test you cannot set these via GrpcServerRule, because it gives you no access to the ManagedChannelBuilder. You could use GrpcCleanupRule instead.
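
A minimal sketch of that swap, assuming JUnit 4 (UploadServiceImpl is a placeholder for the service under test):

```java
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.inprocess.InProcessChannelBuilder;
import io.grpc.inprocess.InProcessServerBuilder;
import io.grpc.testing.GrpcCleanupRule;
import org.junit.Rule;

public class UploadTest {
  @Rule public final GrpcCleanupRule grpcCleanup = new GrpcCleanupRule();

  void setUp() throws Exception {
    String serverName = InProcessServerBuilder.generateName();
    Server server = grpcCleanup.register(
        InProcessServerBuilder.forName(serverName)
            .directExecutor()
            .addService(new UploadServiceImpl())  // placeholder service impl
            .build()
            .start());
    ManagedChannel channel = grpcCleanup.register(
        InProcessChannelBuilder.forName(serverName)
            .disableRetry()  // the builder is now in the test's hands
            .directExecutor()
            .build());
    // build the client stub from `channel`...
  }
}
```

GrpcCleanupRule only handles shutdown; the test constructs the server and channel itself, which is exactly what makes the builder options reachable.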

ejona86 (Member) commented Nov 23, 2021

@dapengzhang0, isn't the per_rpc_buffer_limit default in Java 1 MB though? The cross-RPC default limit is 16 MB. It seems they have at least 30 MB available, so I don't see how retry could be properly functioning for this single RPC.

dapengzhang0 (Member) commented:

> isn't the per_rpc_buffer_limit default in Java 1 MB though? The cross-RPC default limit is 16 MB. It seems they have at least 30 MB available

Oh, right. That means setting ManagedChannelBuilder.perRpcBufferLimit() and ManagedChannelBuilder.retryBufferSize() won't work.

I think the problem comes from InProcessTransport.InProcessStream.InProcessClientStream.writeMessage(), which never triggers StreamTracer.outboundWireSize() (because it doesn't use a Framer), unlike AbstractStream.writeMessage(). So the buffer limit is never seen as exceeded for InProcessClientStream, and the call is never committed.
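
One way to observe this (a diagnostic sketch, assuming the experimental CallOptions.withStreamTracerFactory API): attach a ClientStreamTracer that logs outboundWireSize() callbacks. On a socket-based transport it fires once per message; on the in-process transport it never fires.

```java
import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.ClientStreamTracer;
import io.grpc.Metadata;
import io.grpc.MethodDescriptor;

// Logs every outboundWireSize() callback so the missing in-process
// reporting becomes visible.
final class WireSizeLoggingInterceptor implements ClientInterceptor {
  private static final ClientStreamTracer.Factory TRACER_FACTORY =
      new ClientStreamTracer.Factory() {
        @Override
        public ClientStreamTracer newClientStreamTracer(
            ClientStreamTracer.StreamInfo info, Metadata headers) {
          return new ClientStreamTracer() {
            @Override
            public void outboundWireSize(long bytes) {
              System.out.println("outboundWireSize: " + bytes);
            }
          };
        }
      };

  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    // withStreamTracerFactory is marked experimental in grpc-java.
    return next.newCall(method, callOptions.withStreamTracerFactory(TRACER_FACTORY));
  }
}
```

Attach it with stub.withInterceptors(new WireSizeLoggingInterceptor()); on the in-process transport nothing is printed, matching the missing Framer path described above.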

systemfreund (Author) commented:

Thanks all for your responses. I'll close the issue then, as it doesn't appear to be a bug.

dapengzhang0 (Member) commented:

@ejona86 Do we still need to improve the behavior of InProcessTransport? Does it make sense to call outboundWireSize() for the in-process transport even though there is no actual "wire"?

ejona86 (Member) commented Nov 23, 2021

I think it'd make sense to add a mode to the inprocess transport to have it serialize payloads so that the sizes are reported. I also wonder if we should disable retries on inprocess by default.

dapengzhang0 changed the title from "Possible memory leak when using ByteString in client->server streaming" to "InProcess: client streaming memory leak when retry enabled" on Nov 23, 2021
dapengzhang0 added this to the Next milestone on Nov 23, 2021
dapengzhang0 (Member) commented:

Reopening the issue to track the InProcess payload-size reporting / retry-enablement issue.

ThoSap commented May 26, 2022

I am experiencing the same memory leak with gRPC-Java v1.45.1.

Environment:

Linux 4.18.0-305.19.1.el8_4.x86_64
OpenJDK 64-Bit Server VM 21.9 (build 17.0.3+7-LTS, mixed mode, sharing)
