Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClientConn.Close does not wait for connections to be closed before returning #2869

Closed
micfan opened this issue Jun 18, 2019 · 13 comments · Fixed by #7666 · May be fixed by eshitachandwani/grpc-go-emc#1 or eshitachandwani/grpc-go-emc#2
Assignees
Labels
Area: Client Includes Channel/Subchannel/Streams, Connectivity States, RPC Retries, Dial/Call Options and more. P3 Status: Help Wanted Type: Bug

Comments

@micfan
Copy link

micfan commented Jun 18, 2019

Hi, I'm testing the compatibility of helloworld example between https://github.com/tower-rs/tower-grpc and https://github.com/grpc/grpc-go.

I found that the Golang client sent a RST TCP flag before it closing connection, however, the Rust client sent a FIN TCP flag. And, the Rust server think a RST flag is not proper.

This issue is born from this tower-rs/tower-grpc#187.


What version of gRPC are you using?

const Version = "1.22.0-dev"

What version of Go are you using (go version)?

$ go version
go version go1.12.6 windows/amd64

What operating system (Linux, Windows, …) and version?

Windows 10

What did you do?

If possible, provide a recipe for reproducing the error.

Here is the Wireshark dump of Golang client: https://pastebin.com/XxLkUXQw

And the Screenshot: Screenshot

  1. got ConnectionReset error: https://github.com/tower-rs/tower-grpc/blob/master/tower-grpc-examples/src/helloworld/server.rs + https://github.com/grpc/grpc-go/tree/master/examples/helloworld/greeter_client

  2. no this kind error: https://github.com/tower-rs/tower-grpc/blob/master/tower-grpc-examples/src/helloworld/server.rs + https://github.com/tower-rs/tower-grpc/blob/master/tower-grpc-examples/src/helloworld/client.rs

  3. got the same kind TCP flag [RST, ACK]: https://github.com/grpc/grpc-go/tree/master/examples/helloworld/greeter_server + https://github.com/grpc/grpc-go/tree/master/examples/helloworld/greeter_client

What did you expect to see?

It should be a `FIN` flag instead of `RST` flag when TCP is closing, in my opinion

What did you see instead?

A `RST` flag was sent from grpc-go client
@menghanl
Copy link
Contributor

menghanl commented Jun 18, 2019

I believe the root cause is: ClientConn.Close() doesn't wait for the underlying goroutines to exit. Which means, when ClientConn.Close() returns, the TCP connections are not guaranteed to be closed.

In the helloworld client, cc.Close() is the last thing, and the program exits after that. So it's possible that when the program exits, the TCP connection is still open, and will be force closed, resulting in a RST.

You can try to add a defer func() { time.Sleep(time.Second) }() at the beginning of main, to give some for TCP connection to close. You should see FIN as expected.


The race between ClientConn.Close() and other goroutines doesn't seem to be a big problem to me in practice, because it doesn't affect RPCs, and also server should be able to tolerate errors like this.
If it still concerns you, please file an issue.
EDIT: renaming this issue instead.

@dfawley dfawley changed the title client ConnectionReset before TCP close ClientConn.Close does not wait for connections to be closed before returning Jun 20, 2019
@dfawley dfawley added the P3 label Jun 20, 2019
@stale stale bot added the stale label Sep 6, 2019
@dfawley dfawley removed the stale label Sep 6, 2019
@micfan micfan closed this as completed Sep 9, 2019
@dfawley
Copy link
Member

dfawley commented Sep 9, 2019

We still should fix this. The stale bot went rogue (probot/stale#207) and accidentally tagged this - sorry for the spam.

@dfawley dfawley reopened this Sep 9, 2019
@johnwmstevens
Copy link

This is still a problem. Any plans to fix this?

@n0izn0iz
Copy link

bump

@dfawley
Copy link
Member

dfawley commented Oct 28, 2020

If anyone would like to help with this, we don't have time to prioritize it, but are willing to review code.

@infovivek2020
Copy link
Contributor

infovivek2020 commented Jul 10, 2024

@purnesh42H Please assign this issue to me
to reproduce this we need to have RUSTC then only we can check right?

@purnesh42H
Copy link
Contributor

purnesh42H commented Jul 10, 2024

@infovivek2020 I have assigned this to you. You don't need RUSTC. The underlying issue is that client.Close() doesn't wait for all connections to be close. See #2869 (comment)

@infovivek2020
Copy link
Contributor

@purnesh42H on filter the packet in wireshark for TCP connection after and before adding delay(#2869) in client main.go I can't see RST Flag
image
getting FIN Flag

@purnesh42H
Copy link
Contributor

purnesh42H commented Jul 22, 2024

#2869

@infovivek2020 could you mention how much delay you tried? Are you not able to repro this with any amount of delay or with no delay at all?

@infovivek2020
Copy link
Contributor

@purnesh42H tried with hour minute and second delay but still i am getting FIN signal in wireshark as i have shared previously

@infovivek2020
Copy link
Contributor

@purnesh42H panic: cannot update state as grpc.Dial with resolver has not been called

goroutine 1 [running]:
google.golang.org/grpc/resolver/manual.(*Resolver).UpdateState(0xa8d25d0, {{0xa8cede0, 0x3, 0x3}, {0x0, 0x0, 0x0}, 0x0, 0x0})
/home/vivek/go/src/grpc-go/resolver/manual/manual.go:111 +0x14a
main.main()
/home/vivek/go/src/grpc-go/examples/features/debugging/client/main.go:69 +0x6e8
exit status 2

@purnesh42H purnesh42H removed their assignment Jul 25, 2024
@purnesh42H
Copy link
Contributor

@infovivek2020 was not able to repro this on their machine and asked to work on a different task. Moving it back to unassigned

@dfawley
Copy link
Member

dfawley commented Jul 25, 2024

Since this bug is 4 years old, and we've done similar work on similar things in the past, I looked, and I think this is still a problem. The issue is here:

grpc-go/clientconn.go

Lines 1140 to 1149 in 47be8a6

for ac := range conns {
ac.tearDown(ErrClientConnClosing)
}
cc.addTraceEvent("deleted")
// TraceEvent needs to be called before RemoveEntry, as TraceEvent may add
// trace reference to the entity being deleted, and thus prevent it from being
// deleted right away.
channelz.RemoveEntry(cc.channelz.ID)
return nil

We shutdown the transports by canceling their contexts (inside tearDown) but we never wait for them to be closed.

This is made more difficult because we orphan transports at times, e.g. if we get a GOAWAY on a connection that has a stream open, we create a new connection to the server, but we stop tracking the old one.

Let's ignore the TCP stuff in the issue and focus on the fact that calling cc.Close doesn't block until all the goroutines, connections, etc. created by the client have been cleaned up.

@purnesh42H purnesh42H added the P3 label Jul 29, 2024
eshitachandwani added a commit to eshitachandwani/grpc-go-emc that referenced this issue Aug 23, 2024
It was not guaranteed that when `transport.Close()` returned, the
reader go-routines terminated. All the other channels like,  `ctxDone`, `writerDone`, `goAway` are being waited upon for a signal

To ensure that `transport.Close()` only return after the reader go-routine terminates:

- Wait for the `readerDone` channel to signal the completion 

This fixes: grpc#2869

RELEASE NOTES:

grpc: Wait until reader go routine exits in transport.Close()
@arjan-bal arjan-bal added the Area: Client Includes Channel/Subchannel/Streams, Connectivity States, RPC Retries, Dial/Call Options and more. label Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment