client: prevent leak of file descriptors with multiple clients #434
Conversation
Related to #404.
@orivej I don't think that we should apply this change. I'm probably leaning towards option 4, understanding that people will probably not read the documentation, will catch the leak with profiling/debugging, and will conclude that they should not create too many instances of the go-dockerclient. Keeping fewer instances of the Client is preferred. Would you be interested in adding this to the documentation of the library?
I thought of another option: registering
@orivej seems like a better idea, but there's an important point: we should do that only when we allocate the client; we should not call CloseIdleConnections if the user code has replaced the HTTPClient. One approach is keeping the cleanhttp.DefaultClient() result as an unexported field in the struct, along with the exported field, and calling CloseIdleConnections using the unexported reference, if it's non-nil, leaving it up to the user code in case the HTTPClient has been customized. What do you think?
I moved the original proposal to https://github.com/orivej/go-dockerclient/tree/fix-fd-leak-v1 and added CloseIdleConnections as the HTTP transport finalizer. It finalizes just the transport and not the

In one of my applications, which used to exhaust the default of 1024 file descriptors after the merge of #399, finalizers manage to keep the number of outstanding connections under 60, so the new proposal will likely keep applications that relied on the old API working [1]. However, one might still prefer to close connections explicitly. What do you think of orivej@643597a on top of the current proposal?

[1] In my opinion #399 breaks the API, since it considerably changes the behaviour of the old New methods for applications that create and discard clients as necessary.
@fsouza If this is as big a problem as @orivej suggests, then the right place to put in finalizers is in cleanhttp. The patch he's suggesting here still uses cleanhttp and just adds finalizers to the methods, but it might as well be fixed centrally (which was kind of the point of cleanhttp in the first place). What I'm not clear on at this point is what is causing the problem in the first place. Shouldn't the normal behavior of http.Client, when being GCed, be to close idle connections? IOW, this feels like working around a Golang bug -- which is not at all a problem, but should potentially be raised upstream.
I've done some more digging -- the connection pool in Go is implemented by http.Transport. Unfortunately this is where a lot of the issues that @orivej

In the Consul PR you said that finalizers weren't a great option, but then you used them in this PR. Is there a reason you think they aren't? Otherwise, I'm happy to put finalizers on the transports returned from
Here's a playground example with the functions pulled from
In this example I do not see finalizers ever being called until the program runs out of FDs: http://play.golang.org/p/EFyiff-O9B (it will not run on the Playground due to the time restriction, but you can run it locally, presumably after executing
This is why I think cleanhttp solves this problem: by helping instantiate one (or a small set of) private instances of

A reimplementation of
Hi @orivej,

It certainly doesn't need to use double indirection; the various places in Go itself that use SetFinalizer (for example, https://golang.org/src/net/fd_unix.go and https://golang.org/src/os/exec/exec.go) all do so on objects using single indirection. I think you are right about double indirection simply finalizing the pointer. I'm not sure why that fixes the problem, but it does -- I could reproduce it here. At http://play.golang.org/p/ufC7P8pDog you can see a modified version of your code where I simply moved the finalizer into the section of code that is in

I'd be very interested in knowing why the double indirection makes this difference, but if you agree that this should fix the problem, I'll make this change in
I think the problem here is that if the transports have different configurations (TLS, proxy, keepalive, etc.), reuse of the idle connections can be subject to different behavior than what a client might expect. I think the overall better solution would be to have Transport kill its own idle connections when it's GCed. I wouldn't be surprised if DefaultTransport never even does that, since it's around for the lifetime of the program anyways.
I see that you added the second indirection to the finalizer, without which it still does not work.
@orivej Yep. I've been looking through some of the documentation on the GC in 1.5 to see if I can spot (without digging into code) why this behaves this way.
I do think that either the documentation is correct and the usage in the Go stdlib is faulty, or the documentation is incorrect and the usage in the Go stdlib is correct, because I agree with you -- that does seem to imply that it must be a pointer to a pointer. Thing is, even if it isn't, eventually the GC will in fact run the finalizer, but it behaves more like the documentation suggests:
Whereas when using double indirection it runs now. I think this is definitely worth a post on

I was starting to read https://blog.golang.org/go15gc and have a suspicion that it has something to do with tweaks to the algorithm. When using double indirection, you're creating a second object on the heap (the pointer to the pointer) which, outside of the SetFinalizer call, is never referenced again. I (again, not looking at the GC code) can imagine some speed optimization in the GC or even the runtime where it avoids GCing objects created in a particular function and/or for loop while in the middle of that loop; IOW, it doesn't drop all knowledge of the pointer until some later time, with the objects being marked grey as a result. Since the pointer-to-a-pointer is created in a separate function, it doesn't benefit from this optimization, is marked white, and is reaped instantly.

This all said, do we have agreement about the way forward here? SetFinalizer in
When I moved the server out of the process in the last paste, removed the call to

Would
@orivej Sounds good. I'll make the changes in

Regarding
Tested -- works fine. I'll put in a PR to
I thought of another potential problem to keep in mind. In hashicorp/consul#688 (comment), @samprakos reported that a server consumed all FDs because clients instantiated in a loop kept them open. It looks like finalizers will completely protect clients, but servers may still need more FDs to keep more connections open to the same number of clients, due to client transports that are not yet finalized. Custom solutions like hashicorp/consul#1517 may still be useful (but hopefully not necessary).
@orivej I think we are ok - @samprakos was running a process on the servers using the Consul API client, so the finalizers would help here (the servers aren't a special case). You are right that if people point a bunch of Consul API clients at a small set of servers then they can cause problems, but the vast majority of Consul users will be pointing their Consul API clients at localhost to hit their local agent. The agent -> server RPC communication in Consul uses connection pools and multiplexed TCP streams (by way of yamux) to be as efficient as possible with server fds, though it is common to have to bump up the fd limits on Consul servers in large clusters.
I moved my second proposal for go-dockerclient to https://github.com/orivej/go-dockerclient/tree/fix-fd-leak-v2, and updated this one with the current cleanhttp. I confirmed that this edition fixes my original issue.
Great. I'll merge the PR.
Confirmed, and updated the PR.
@fsouza, JFYI, upstream
@orivej thanks for pinging this one. The change we made in Consul isn't for better performance vs. the finalizer; it's better than the new default behavior of cleanhttp without the finalizer and with keep-alives disabled, which would have made the Consul API client open a new connection for each request.
@slackpad, do you know why
@orivej The finalizers did not work in all cases. We saw actual issues where GC wasn't triggering, presumably because, with memory being freed up, it just didn't need to run. So it often worked, but it (in not-easily-reproducible fashion) did not always work, and it didn't seem like a good idea to periodically force GC. The way it is now (forcing keep-alives off) is demonstrably safe, but it does have a speed cost. We may put in options for the pooled version to have it call the idle-connection cleanup function on a user-defined timer (and potentially also add the finalizer back in for connections that weren't cleaned up during the timed event). However, generally speaking the dev is aware of the kind of workload they'll be using -- lots of connections stood up and torn down, or existing connections reused over and over -- so they can make a reasoned choice about which one to use.
Each HTTP client keeps up to 2 idle connections alive, so the number of clients created over the lifetime of an application must stay bounded to prevent exhaustion of file descriptors. The options to fix this are:
(*http.Transport).CloseIdleConnections()