-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New HttpClient API design #22505
Comments
cc: @davidsh, @CIPop, @geoffkizer |
My opinion for what it's worth is. Leave httpclient alone aside from the fully managed awesomeness that is already in progress (mostly solves 1.) However there should be a second class that shares the guts (sockets, http parsing etc) with http client but is designed for out of the box service to service or high throughput. Http client behaves like a client-side lob out of the box (connections per server at low numbers, holding sockets etc) and by trying to make a single class to rule them all I feel the API gets restricted. FYI I think this is the same issue that SSLStream struggles with (clients have to have long term backwards compatibility, lower resource use etc) |
Good idea... Something like However, there still the DNS issue which is something out of managed world... How to deal with that? |
That is exactly what I am thinking, because in a server environment it would be much better to be able to get lower and control the connections (this could lead to a "server focused" pool). The DNS problem, is tough it's also a security consideration, if you look at the motions Java has gone through there are issues like DNS poisoning etc that can cause reliability and DOS attacks. |
The proposals to create a pool of HttpClient instances are non-nonsensical, HttpClient already implements pooling internally, you'd get the safe effect with a single shared instance. It does sound like there are DNS issues to address, but frequently closing connections (i.e. always using new instances of HttpClient) is over aggressive. We need a way to periodically refresh DNS and limit the lifetime of connections after we detect DNS changes for that host. |
Why? If you are in a server-to-server connection, like @Drawaes alluded, it would be interesting to give the service developer the ability to manage the connection lifecycle. |
Almost like some sort of TTL... 😁 Seriously though, if we can cache internally the DNS records for their actual TTLs we'd get the expected behavior, at least from my point of view. I haven't reviewed the managed plans that deeply, is this even a possibility or are we only getting the result (and no TTL) in those instances? |
@Tratcher I am not really seeing the downside to exposing the individual connections that make up a pool. I hadn't thought of it from that direction but instead providing a second pooling provider that is more "server" focused, but actually now I think about it exposing the individual connections allows for more scenarios to be covered by "power" users. @davidfowl I can't find it right now, but this is not too dissimilar to an idea you were commenting on a while back about a "kestrel client" similar to other frameworks provide |
@galvesribeiro connection management is one of the hardest problems in the client stack, offloading that to the user isn't doing them many favors. server-to-server can use a somewhat simplified model, but even then you're not going to want to restrict it to a single connection bottleneck so you still need to track and assign connections. Doing this poorly can severely compromise the performance of your application. There's also quite a bit of state that gets shared across connections in the client pool. E.g. cookies, cache, dns, auth, tls sessions, etc.. Managing that sharing would also get cumbersome. |
Agreed on the state. But the problem for service 2 service often is the state. In a "microservice" environment you often have services calling others as driven from one incoming request. I agree it is tough to write this by yourself so then I think a "shared primitive connection" with the socket and parsing code that is used in the httpclient and another "service" based http client would be ideal. |
Proposal: Make a central static HttpMessageHandler and change HttpClient's default constructor to use it rather than newing one up. After all, it's the handler that owns the pool, not the HttpClient instance. This handler would only have settings and defaults appropriate to the shared scenario, and would no-op in its Dispose method. This static handler could internally be implemented using any of the existing handler stacks, or the new managed one. Developers could customize all of the settings on their HttpClient instance without impacting other parts of the app (e.g. timeouts, default headers, base address). If you need to insert middleware then you wrap it around the static handler rather than a new instance. If you need to customize handler settings for a specific section of your app then you new up your own handler and client as you would today. |
I agree with the above approach. I think maybe what might be useful here is to provide use cases. Then they can be assessed as to their commonality as well as the ability to modify the current API, or other solutions for them rather than talking in the abstract. So @Tratcher 's comment above is perfect in that respect. I have the use case of a front end service (servicing a JS UI) that calls a number of backend services. This is a common pattern I have seen, but I am not saying this is how everyone does it. We would like to limit the outbound connections calling backend services, say to 500 connections, but there are 5 different services we call. With the current HttpClient we only have the option of making the connection per server limit 100, and not being able to service 500 calls if all the calls are to one of the services, or 500 and risk hitting 2500 connections. |
I think these two sentences are so smack bang on the mark. I don't have any suggested solutions to all of this but I can fully, 100% attest to what @Tratcher summed up. When I read that (now famous) DotNet Monsters article, my 'WTF' could probably be heard across the entire country :( That said, I felt way better when every person I asked, they to didn't know about this and we all thought we were 'smart' cause we were putting this in a I see commentary along the lines of "docs, more docs, better docs, etc" to help educate and facilitate this issue, but I just don't feel like that's ... gonna really help, etc. So yeah -> totally 100% super-strongly agree with @Tratcher but I just don't have a suggestion to help make it better. 🤔 But this issue resonates really strongly with me and some other peers. |
We all know the side effectos od the Also, State sharing is always a pain, I agree. But making it static will only make the false sensation of safety. I would stick with @Drawaes original idea and have a lower level type for server-to-server comms, and a more user friendly client type. Having both in the same type will always limit the possibilities. |
Because It’s just that |
@poke and that is the point... Short-lived or not, the user assume that they should call Anyway more a philosophical comment. :) |
SqlConnection has a limited (clumsy?) way to do that through connection string flags. ServicePoint is just plain sucks. Just let us implement/instantiate DNS and TCP connection managers and then pass them into HTTP client or HTTP request through an overloaded ctor. Default to global singletons otherwise. |
The problem with HttpClient is that it is doing way too many things internally instead of delegating them to other classes. It should not be responsible for managing DNS cache or TCP connection pools. |
@olviko HttpClient is an extremely lightweight class. It does some management of cancellation tokens, some header setting and buffer draining, but other than that it passes everything on to HttpClientHandler or WebRequestHandler. In turn HttpClientHandler allows ServicePointManager to do all the connection management and HttpWebRequest does the actual request sending. The IDisposable thing is unfortunate but it's there to clean up cancellation tokens and request/response bodies that exist during the lifetime of sending a request or returning a response. When not in the process of making a request, the HttpClient doesn't hold onto any unmanaged resources. |
@Tratcher We tried to convince the HttpClient team years ago to make the HttpClientHandler somehow globally changeable but we were unsuccessful at the time. I think it would open up a whole range of useful scenarios. |
@davidfowl HttpClient was designed to long living for numerous reasons unrelated to connection management. HttpClient has a BaseAddress property, a DefaultRequestHeaders collection, and a pipeline of message handlers. Having to set these up on every request would be a waste of time and would push people to create factory classes, defeating the purpose of having HttpClient as a class. Setting up a single HttpClient instance for each target API that you want to call is ideal because it allows setting up all the common behavior once and ensures that a connection group is managed just for that HttpClient. The benefits of this design are more obvious when building desktop and mobile apps, but I do believe there is still some value to the model in a server environment. |
But doesn't service point apply to every http handler? Eg max server connections is 1 property for every http handler, so in the server world, if I wanted to say "max connections to server for api x should be 5" and "max connections to server for api y should be 20" can I do that today? (It's possible I am missing something) |
There is a scenario I ran into with an internal customer of WCF using an HTTP binding which would be impossible to provide a fix for if we had a global singleton HttpClient instance. The scenario was a service was running on multiple hosts behind a load balancer. One of the hosts ended up with a bad deployment of their service code and was returning 500 responses for every request. A mid-tier server had a pool of connections to the load balancer and one of those connections was established to the failed server. The failed server always returned a valid HTTP response so it wouldn't get dropped by the connection pooling as technically the response was a valid response at the protocol level. On top of that, as the response was very fast, that connection would get used a lot so most requests would fail. The solution was to provide a way to abandon the connection pool (this was using HttpWebRequest so I made a mechanism where you could instruct WCF to use a different ConnectionGroupName so effectively abandoning the old pool and starting a new one). The bad server had been detected and removed from the load balancer, but the load balancer doesn't close existing connections to a server removed from the pool. Switching to a new connection pool allowed moving away from the bad server. If there is a single instance of HttpClient / HttpClientHandler, then you severely restrict your ability to control your connections. |
This as well... when you have level 3 load balancers 500's etc can be a
pain.
|
I do wonder if there is some cross over between wanting an "http parser" and control the connections on the client side with the discussion over at Because at the end of the day, a low level exposed http parser and some connection primitives (sockets etc) seem to be the requirement on both sides. |
@Drawaes, using HttpClient directly, the 500's from behind a load balancer are easy to deal with. Dispose the old HttpClient and instantiate a new one. There is an overload of the HttpClient ctor which takes an HttpClientHandler and a bool to say whether the HttpClientHandler should be disposed. With this, you can have a static instance of your HttpClientHandler and create and Dispose of HttpClient without upsetting your connection pool. |
does that kill the single connection in the connection pool? |
If you specify false on the constructor, it won't kill the connection pool in HttpClientHandler. |
Another thing that would be worth addressing in any redesign, is that if one wants to authenticate with a client certificate, it must be done in the Most of the time this is fine, as a client cert is intended to represent the client. However there are some cases where one may need to use different client certs for different requests - even to the same destination. Currently this requires at least one To figure out which instance to use, one may need a It would be easier if one could attach a certificate directly to a |
Id love to see httpclient work on a model similar to the new transport/protocols model on the server side (https://github.com/aspnet/KestrelHttpServer/issues/1980) The way I see it, Httpclient should only deal with things like remembering and passing along cookies, http parsing should be done by a layer below that, and the transport/connection management should be done by a layer below that |
I have hit this problem a few times when developing Web API apps that called other web service. When using Ninject and Castlewindsor it was possible to create multiple HttpClients at startup, configure them and pass the appropriate HttpClient into the right controllers. But with the inbuilt DI in .NET Core it doesn't seem possible to do this. I think this is a common use case. |
Current and for the solution : use an static instance . the problem is there no way to share the instance across various package , ( imagine each package use one HttpClient , 200 packages will use 200 HttpClient and all of them not disposable and not for sure about the memory usage ) So , I think the new HttpClient should be static and provide a non-static disposable class over those static method for allowing us use |
I don't think the lock is an issue these are extremely short lived. Static public Httpclient would be evil ( and you would need many more locks) ( a static internal tcp connection pool used by httpclient is a different matter) . I agree with @Drawaes above leave it and create a new high performance client that is harder to use with a construct which accepts a tcppool etc. Worth noting quite a few people with Rust hate the new high performance tokio as its much harder to use. |
@bryanjhogan Why can't you inject in HttpClient instances in ASP.NET DI ? @John0King I don't see why the OperationStarted flag that is used to prevent you changing BaseAddress and DefaultRequestHeaders after making a request is bad for concurrent access? It doesn't affect the request/response processing. Changing BaseAddress/DefaultRequestHeaders per request defeats the point of the properties. If you don't want to use them, you don't have to. And I don't think anyone was suggesting that using HttpClient as a static instance is a good idea. Having a shared instance per API is possible without using statics. |
how ? How u handle the cookie / Credential / default headers ? and for concurrent performance , I'm not very sure, I only see the souce code and it using |
@John0King I don't see a lock here https://github.com/dotnet/corefx/blob/master/src/System.Net.Http/src/System/Net/Http/HttpClient.cs Can you show me where you are seeing a lock in the source? DefaultHeaders can be stored in the HttpClient instance and Cookies and Certificates can be added to the HttpClientHandler. |
@darrelmiller 😆 I must remember it incorrectly, I can't find it. |
Please consider exposing the backing This can obviously be done during the factory delegate, but it's really hard for me to imagine pulling all of this customized code into the composition root of At least, perhaps just exposing the |
Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process. This process is part of our issue cleanup automation. |
This really is a cleanup candidate, but we need to verify nothing important is left out (and if needed, create a separate issue scoped to the specific ask, or expand one of the existing issues) I'm assigning it to myself temporarily for the investigation/triage purposes |
@CarnaViire hey! Yes, I agree. This is from 2017 when things were different. With most of the things now on the managed world, and with the introduction of The only one that I still haven't seen workarounds is the DNS cache management. That may still bite us in some scenarios where intentionally we don't want to rely on caches. |
Solved issues:
Tracked issues:
Triage: I believe the list covers all the topics discussed in the issue. Closing this "mega-issue" in favor of the issues linked above. |
Following the discussion here https://github.com/dotnet/designs/issues/9 there are several issues with current
HttpClient
implementation. The idea with this issue is to discuss the design of a proposed new implementation and API surface forHttpClient
as part of the original issue.Some points discussed on that issue:
HttpClient
. One on Windows (based on WinHttp) and other on *NIX world (based on libcurl)IDisposable
(the TCP socket still leaking for a while by default for 240s due toTIME_WAIT
state which makes you run out of available socket connections under high load)SqlConnection
. However, it would probably hit worse issues with DNS cache, since it is managed by the OS networking stack and this implementation try to be 100% managed on top of the new Foundation layer proposed on the previous issue.The text was updated successfully, but these errors were encountered: