
Enforcing AsyncClient as context-managed only? #769

Closed · tomchristie opened this issue Jan 16, 2020 · 10 comments

@tomchristie (Member)

This issue isn't necessarily something that I think we ought to do, or not do. However, since we've got our public API pretty well squared away now, and are starting to think about a 1.0 release, it's something that we really ought to discuss, and make sure we're absolutely happy with whatever we land on.

Something that's come up as a possible point of design contention is whether AsyncClient ought to strictly provide only a context-managed interface, or whether we're okay with supporting plain __init__ and optionally calling .aclose().

Currently we're supporting either style, and that's working fine for our users, but...

  • If we ever wanted to support background tasks, then structured concurrency means we'd want the client instances to be context-managed, so that we've got well-defined lifetimes for any background-running tasks.
  • The context-managed style also means strictly managed resource management: connections have well-scoped lifetimes, rather than "cleanup on interpreter exit".
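
For reference, the two styles in question look roughly like this (a minimal sketch; the URL is illustrative only):

import asyncio

import httpx


async def context_managed():
    # Context-managed style: connection lifetimes are scoped to the block.
    async with httpx.AsyncClient() as client:
        await client.get("https://www.example.org/")


async def explicit_close():
    # Plain __init__ style, optionally calling .aclose() when done.
    client = httpx.AsyncClient()
    try:
        await client.get("https://www.example.org/")
    finally:
        await client.aclose()


asyncio.run(context_managed())
asyncio.run(explicit_close())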

Do we need to be able to support background tasks? Well, possibly.

We don't strictly need background tasks to support our HTTP/2 implementation, so long as we're using shortish HTTP/1.1-style timeouts on connections that have no activity. (If we're not running a background task on a connection, and there's no request/response activity progressing it, then we won't respond to HTTP/2 pings, and servers will time us out.)

However, if we did have background tasks then...

  • We could have nicer HTTP/1.1 keep-alive timeouts, where we actually properly expire connections in the background after their timeout period.
  • We could optionally support long-lived HTTP/2 connections, holding onto a connection for longer periods of time in order to provide lower latency when the next request is made.

Also, I've no idea how this ties in with HTTP/3.

Other relevant factors here:

  • It's also possible that some limited kinds of global tasks might at some point be supported by trio. See the discussion in "[potential heresy] global connection pools and global tasks" (python-trio/hip#125).
  • One reason I was initially reticent about strictly enforcing context-managed async clients is that I wasn't clear how that'd fit in with the ASGI style, but having put some more thought into it, it's clear that it can fit in with ASGI's lifespan messaging (eg. "Support generator function to manage startup/shutdown", starlette#799). See the lifespan sketch after this list.
  • We'd be suggesting a deliberate disparity with the sync case, which is a bit odd. I don't think we'd want to, or be able to, enforce context-managed sync clients. It wouldn't work with WSGI, and in any case we'd like folks to be able to switch over from requests to httpx, so we couldn't justify the constraint. It's also less valuable, since we wouldn't be running background tasks in the sync case anyway, so it'd only be relevant wrt. clean connection lifetimes.
  • Simply having an async context-managed class doesn't actually tie in fully with "and we'd like a nursery available within this context". See the conversation around https://gitter.im/python-trio/general?at=5e20aa33ad195a0f671cc071 - you can do it with some clever metaclassing, or you can potentially call directly into the nursery's __aenter__ and __aexit__, but it might not necessarily be the API we'd want to adopt if we wanted a nursery available within AsyncClient. (I think we're probably on OK ground here though - there are options available.)
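
To illustrate the ASGI point above, here's a minimal sketch of tying an AsyncClient's lifetime to lifespan events via Starlette's shutdown event handlers (the exact hook used here is just one option, not something this issue settles):

import httpx
from starlette.applications import Starlette

client = httpx.AsyncClient()


async def close_client():
    # Lifespan shutdown gives a well-defined point to release connections.
    await client.aclose()


app = Starlette(on_shutdown=[close_client])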

I don't have any clear answer on this. What we've got at the moment is working fine for our users, but it's feasible that adopting the more constrained option could prevent issues further down the line?

tomchristie added this to the v1.0 milestone Jan 16, 2020
@tomchristie (Member Author)

I guess valid options for us here may be:

  • Do nothing. Everything's working great as-is, so we don't necessarily need to adopt a change here.
  • Keep things as they are, but make sure we strongly recommend the context-managed usage, and only ever use that in our examples.
  • Adopt the more constrained choice, and strictly enforce the context managed usage.

@iwoloschin

What is the cost of standing up a new client for each request, or set of requests? If I have a long-running process that periodically makes requests, is it okay to make one master client that everything then uses, or should each request use its very own client?

For a concrete example, I've got a FastAPI backend with an Angular frontend. I'm trying to get data from a third party. I didn't feel like dealing with CORS or bundling client certs into the frontend so I'm just using my backend to proxy the request for me. I don't really care about cleaning up after myself because the backend runs "forever", forever meaning until I push an update and restart the container. Should I create a single top level client that just gets called for each request, or async with in the FastAPI coroutine?

import httpx
from fastapi import APIRouter

router = APIRouter()
client = httpx.AsyncClient()

@router.get('/endpoint')
async def getter():
    return await client.get('http://other-host/resource')

Or:

@router.get('/endpoint')
async def getter():
    async with httpx.AsyncClient() as client:
        return await client.get('http://other-host/resource')

Is the first example "better" because the client can just be reused? Does that only make sense if you're connecting to the same host? Or is it better to have many ephemeral clients, each handling a single request?

The timing here is interesting for me: I have a code base that looks a lot like the second example (a natural evolution from threaded/requests-based to async/httpx-based), but I've slowly been refactoring towards something closer to the first example, since that felt like a more correct way to do things. Perhaps I'll hold off on more changes for now :).

@tomchristie (Member Author)

@iwoloschin Creating a single top-level client is slightly preferred, since requests can then share a single common connection pool. You'll sometimes end up with lower latency on outgoing requests, because the client is able to reuse an existing connection.
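
For the FastAPI case above, a minimal sketch of that shared-client pattern, with an explicit cleanup hook so connections are released when the app stops (the endpoint and handler names are illustrative, and the shutdown event is just one way to do it):

import httpx
from fastapi import FastAPI

app = FastAPI()
client = httpx.AsyncClient()


@app.get('/endpoint')
async def getter():
    # Every request reuses the shared client's connection pool.
    response = await client.get('http://other-host/resource')
    return response.json()


@app.on_event('shutdown')
async def shutdown():
    # Release pooled connections once, rather than at interpreter exit.
    await client.aclose()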

@tomchristie (Member Author)

Wrt context-managed vs. init API, it's worth noting that the constraint here isn't implied by the Client class, but rather by the dispatchers. It's the dispatch class that may have startup and shutdown context associated with it.

One reason that's interesting is because actually there may be multiple dispatchers associated with a client. Right now that's only ever in the form of proxies. Our client close method currently only calls self.dispatch.close(), but actually it really ought to call close on any proxy dispatchers too.

At some point we'll likely introduce a more general "mounting" API (eg. to mount a specific dispatcher against a particular host). The proxies case will then just be one specific type of mounting, and we'll want to close all mounted dispatchers on client close.

Of note:

  • We may want the Dispatcher API to mandate __enter__/__exit__ and __aenter__/__aexit__ as part of the API, rather than having a close method.
  • We ought to tweak our existing close() implementation to close any proxy dispatch implementations too (sketched below).
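
As a purely illustrative sketch (not httpx internals - the Dispatcher and mount-table names here are made up), the shape of "close everything on client close" would be something like:

import asyncio
from typing import Dict


class Dispatcher:
    # Stand-in for a dispatch class with shutdown behaviour attached.
    async def close(self):
        print(f"closing {self!r}")


class Client:
    def __init__(self):
        self.dispatch = Dispatcher()
        # Hypothetical mount table, eg. proxy dispatchers keyed by origin.
        self.mounted: Dict[str, Dispatcher] = {"http://proxy": Dispatcher()}

    async def close(self):
        # Close the primary dispatcher...
        await self.dispatch.close()
        # ...and every mounted/proxy dispatcher too, so that nothing is
        # left to "cleanup on interpreter exit".
        for dispatcher in self.mounted.values():
            await dispatcher.close()


asyncio.run(Client().close())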

@StephenBrown2 (Contributor) commented Jan 20, 2020

A slight modification of @iwoloschin's use case: I'm also creating an API client library, in which I currently have:

class MyClientClass:
    def __init__(self, ...):
        # ...httpx.AsyncClient setup (timeout, headers, cookies)...
        self.client = httpx.AsyncClient(base_url=..., ...)

    async def get(...):
        return await self.client.get(...)

With the context-manager I presume that would only require my sub-functions to be modified like:

    ...
    async def get(...):
        async with self.client as request:
            return await request.get(...)

Sort of a combination of the above two approaches, a single client instance that gets context-managed for each request... if I interpret it correctly.

@florimondmanca (Member) commented Jan 20, 2020

@StephenBrown2

With the context-manager I presume that would only require my sub-functions to be modified like:

Actually you wouldn’t need to modify sub-functions at all. The connection pool exists and works regardless of context-managed usage — we just need to make sure resources are released properly once we don’t need the client anymore.

In fact, the snippet about sub-functions would not have the effect you think it would: it would close the connection pool on each request, preventing the wrapping client from benefiting from connection pooling. (I.e. connections will be recreated every time!)

@StephenBrown2 (Contributor)

— we just need to make sure resources are released properly once we don’t need the client anymore.

Ah, alright. How would I go about doing that? I presume users of my client will be making multiple requests, and not necessarily releasing resources on their own... Would I simply always ignore the context-manager usage?

@tomchristie (Member Author)

@StephenBrown2 Either don't worry about using it in a context-managed style, or, if you do want to, enforce that your MyClientClass also needs to be used in a context-managed style: only instantiate the client in __aenter__, and remove it in __aexit__.
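
A minimal sketch of that second option, reusing the MyClientClass shape from above (the base_url value is just a placeholder):

import httpx


class MyClientClass:
    async def __aenter__(self):
        # The underlying client only exists while the wrapper is entered.
        self.client = httpx.AsyncClient(base_url="https://api.example.invalid")
        return self

    async def __aexit__(self, exc_type, exc_value, traceback):
        # Release pooled connections, then drop the reference.
        await self.client.aclose()
        del self.client

    async def get(self, path):
        return await self.client.get(path)

Usage then mirrors the client itself: async with MyClientClass() as api: ... await api.get("/resource").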

@HuiiBuh commented Mar 20, 2020

I think a lot of people (including me) are using a non-context-managed instance of the client in their backend, simply because it's nicer to reuse the same already-configured object every time you make a request to the same origin. Taking away this option doesn't seem very practical.

@tomchristie (Member Author)

I'm going to close this off for now. We can recommend it, but we don't really need to hard-enforce it.
