Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support pluggable http clients/coroutine frameworks (anyio/httpx) #749

Open
graingert opened this issue Jan 11, 2020 · 25 comments
Open

support pluggable http clients/coroutine frameworks (anyio/httpx) #749

graingert opened this issue Jan 11, 2020 · 25 comments
Labels
enhancement New feature or request

Comments

@graingert
Copy link
Contributor

I'd like to be able to use aiobotocore in a project that uses trio and httpx

anyio provides a generic interface to curio/trio/asyncio that libraries can use to provide generic support for all async/await eventloops

aiobotocore would require users to pass a "request" coroutine function that provides all the http functionality that aiobotocore needs for http requests

@graingert
Copy link
Contributor Author

As far as I'm aware aiobotocore would need to replace all asyncio calls with the relevant anyio equivalent, and any gather/ensure_future with a nursary/spawn pattern instead

@thehesiod
Copy link
Collaborator

should be done with #619 which will cause a TON of refactoring and perhaps re-thinking this whole project

@graingert
Copy link
Contributor Author

I don't think #619 and this issue are really that coupled and could be completed in any order

@thehesiod thehesiod added the enhancement New feature or request label Feb 15, 2020
@dazza-codes
Copy link
Contributor

dazza-codes commented Feb 19, 2020

botocore/boto3 already supports an event callback for extensions and aiobotocore inherits that behavior without any modifications. See

To focus only on the request component, if that's possible, consider details in #755 that describe how moto uses the event system to mock requests. It hooks into the before-send event. So long as the http_response returned is compatible with the rest of the response parsing, it should work. If it needs a custom response parser, register a parser-generator as a component in the client.

All of this is easily understood (mostly) from reading

@thehesiod
Copy link
Collaborator

thehesiod commented Feb 23, 2020

the reason I said #619 (#659) is because it's going to greatly expand this codebase, and thus parts which would need to be refactored to support whats mentioned in this issue. IIRC there are sleeps/locks for example in the new credentials file we need to replace components in. @dazza-codes that's great insight for that one particular event, I just logged #774 so we don't forget about it and apply it where possible. Given #659 moves us to async events a lot more will be possible.

@ediskandarov
Copy link

What is the roadmap here. I would contribute if there's a plan.

aiohttp dependency is discouraged in projects that use a different server web framework.

@graingert
Copy link
Contributor Author

@toidi step 1 would be to find any imports/calls to asyncio and replace it with the anyio equivalent

@ediskandarov
Copy link

@thehesiod are you ok with the proposal above?

@thehesiod
Copy link
Collaborator

Let me take a look at anyio and report back

@thehesiod
Copy link
Collaborator

ok looking at anyio that's not going to solve the http client portion, that tries to solve asyncio event loop plugging, which actually is already designed to be globally replaceable (see for example how uvloop works). From what I can see botocore is already working on a pluggable http client session, see: https://github.com/boto/botocore/blob/develop/botocore/endpoint.py#L280 however it's not piped to either Config or create_client yet from what I can see, seems like a WIP. So I think we need to wait to see what botocore is going to do. Another factor is that it's not as simple as just allowing a aiohttp replacement, you need to port all the exceptions as well (https://github.com/aio-libs/aiobotocore/blob/master/aiobotocore/endpoint.py#L156), I just went through that in fact on the most recent botocore update. looping in @terrycain .

@graingert
Copy link
Contributor Author

that tries to solve asyncio event loop plugging, which actually is already designed to be globally replaceable

I'm looking to replace the coroutine framework not the event loop

@graingert graingert changed the title support pluggable http clients/event loops (anyio/httpx) support pluggable http clients/coroutine frameworks (anyio/httpx) Jan 18, 2021
@thehesiod
Copy link
Collaborator

ah I c, sorry this is a two prong request, probably should be split as they're not necessarily related

@graingert
Copy link
Contributor Author

Well it's mainly about the pluggable coroutine framework, as I'd like to be able to use aiobotocore on curio and trio applications. For this to happen asyncio and aiohttp both need replacing as they both only support asyncio event loops

@terricain
Copy link
Collaborator

So, theres 2 pieces of work here. Making the event loop more customisable than just the asyncio backend implementation so we can use things like trio and curio. As well as making the http client swappable as aiohttp does not work with curio/trio. Both bits of work are required before any of this work is really usable.

In terms of swapping asyncio invocations to anyio, from a glance I don't see that causing any issues, it would be nice to see if there is any performance degradation with the switch, though it should just be a wrapper to the asyncio function call.

@graingert
Copy link
Contributor Author

the changes to aiobotocore for anyio will be quite minimal:

aiobotocore/_endpoint_helpers.py:    asyncio.TimeoutError,
aiobotocore/credentials.py:        self._refresh_lock = asyncio.Lock()
aiobotocore/credentials.py:        self._refresh_lock = asyncio.Lock()
aiobotocore/credentials.py:    def __init__(self, *args, popen=asyncio.create_subprocess_exec, **kwargs):
aiobotocore/endpoint.py:    # NOTE: The only line changed here changing time.sleep to asyncio.sleep
aiobotocore/endpoint.py:            await asyncio.sleep(handler_response)
aiobotocore/hooks.py:            if asyncio.iscoroutinefunction(handler):
aiobotocore/response.py:class AioReadTimeoutError(ReadTimeoutError, asyncio.TimeoutError):
aiobotocore/response.py:        except asyncio.TimeoutError as e:
aiobotocore/utils.py:RETRYABLE_HTTP_ERRORS = (aiohttp.client_exceptions.ClientError, asyncio.TimeoutError)
aiobotocore/utils.py:                except asyncio.TimeoutError:
aiobotocore/utils.py:    def __init__(self, session=None, sleep=asyncio.sleep):
aiobotocore/waiter.py:            await asyncio.sleep(sleep_amount)
asyncio.Lock() -> anyio.create_lock()
asyncio.create_subprocess_exec -> anyio.open_process
asyncio.iscoroutinefunction -> asyncio.iscoroutinefunction
asyncio.sleep() -> anyio.sleep()

and I think we'd need a TIMEOUT_ERRORS = (asyncio.TimeoutError, TimeoutError)

@graingert
Copy link
Contributor Author

the TimeoutErrors would have to be handled a bit more carefully as modern coroutine frameworks discourage "libraries from using internal timeouts. Instead, it encourages the caller to enforce timeouts, which makes timeout code easier to compose and reason about."

so this code:

timeout = aiohttp.ClientTimeout(total=self._timeout)
async with self._session(timeout=timeout,
trust_env=self._trust_env) as session:
for i in range(self._num_attempts):
try:
async with session.put(url, headers=headers) as resp:
text = await resp.text()
if resp.status == 200:
return text
elif resp.status in (404, 403, 405):
return None
elif resp.status in (400,):
raise BadIMDSRequestError(request)
except asyncio.TimeoutError:
return None
except RETRYABLE_HTTP_ERRORS as e:
logger.debug(
"Caught retryable HTTP exception while making metadata "
"service request to %s: %s", url, e, exc_info=True)
except aiohttp.client_exceptions.ClientConnectorError as e:
if getattr(e, 'errno', None) == 8 or \
str(getattr(e, 'os_error', None)) == \
'Domain name not found': # threaded vs async resolver
raise InvalidIMDSEndpointError(endpoint=url, error=e)
else:
raise
return None

would use anyio.move_on_after to handle timeouts per retry

    async def _fetch_metadata_token(self):
        self._assert_enabled()
        url = self._base_url + self._TOKEN_PATH
        headers = {
            'x-aws-ec2-metadata-token-ttl-seconds': self._TOKEN_TTL,
        }
        self._add_user_agent(headers)

        request = botocore.awsrequest.AWSRequest(
            method='PUT', url=url, headers=headers)

        async with self._session(timeout=None,
                                 trust_env=self._trust_env) as session:
            for i in range(self._num_attempts):
                with anyio.move_on_after(timeout):
                    try:
                        async with session.put(url, headers=headers) as resp:
                            text = await resp.text()
                            if resp.status == 200:
                                return text
                            if resp.status in (404, 403, 405):
                                return None
                            if resp.status in (400,):
                                raise BadIMDSRequestError(request)
                    except RETRYABLE_HTTP_ERRORS as e:
                        logger.debug(
                            "Caught retryable HTTP exception while making metadata "
                            "service request to %s: %s", url, e, exc_info=True)
                    except aiohttp.client_exceptions.ClientConnectorError as e:
                        if getattr(e, 'errno', None) == 8 or \
                                str(getattr(e, 'os_error', None)) == \
                                'Domain name not found':  # threaded vs async resolver
                            raise InvalidIMDSEndpointError(endpoint=url, error=e)
                        raise
        return None

@ediskandarov
Copy link

the similar discussion in HTTPX
it didn't seem to invest into anyio yet

if aiobotocore would go for pluggable HTTP client - coroutine framework is not a requirement

encode/httpx#296

@graingert
Copy link
Contributor Author

@toidi httpx uses its own abstraction https://github.com/encode/httpcore/blob/master/httpcore/_backends/anyio.py which is for opening sockets, creating locks and sleeping. aiobotocore also needs subprocess support

@graingert
Copy link
Contributor Author

@emcpow2 httpx and encode in general have fully bought into anyio now: https://github.com/encode/httpcore/blob/f0d16e9462910c13d3b72039223346a3fe8a3299/CHANGELOG.md#fixed-2

@Zac-HD
Copy link

Zac-HD commented Aug 29, 2023

I'm also really keen to use aiobotocore with Trio - and it looks like there's a reasonable two-step process:

  1. Switch from aiohttp to httpx as our http client; these days it uses anyio internally and thus runs natively on top of asyncio ortrio. The only changes this would require to user code are in exception handling, if someone is currently catching aiohttp-specific exceptions (likely!). A backwards-compatibility layer would be annoying but possible.

  2. Switch from asyncio to anyio. This may require a few API changes to support structured concurrency, and would generally involve somewhat deeper changes to somewhat less code.

The best way forward is probably for someone to draft a PR for (1), and then we can decide whether it's easier to do the two parts together or separately.

@thehesiod
Copy link
Collaborator

yea, we've been internally switching to httpx for some of jobs because aiohttp keeps giving random https payload errors. I'm all for this now

@thehesiod
Copy link
Collaborator

it's easier now with the refactoring botocore has done and we've incorporated

@jakkdl
Copy link
Contributor

jakkdl commented Feb 6, 2024

After working on implementing httpx this is likely going to cause quite a few changes in user code, and some functionality would be straight up lost due to httpx not having the exact same feature set as aiohttp - e.g. encode/httpx#2211

In general the code feels quite tightly coupled to aiohttp, such that it would look very different if it'd been written for httpx from the start, and even where it would be possible in theory to create a seamless transition for end users it'd lead to very clunky code. E.g. I'm starting to want to completely drop response.StreamingBody and directly hand the httpx.Response object to the user: #1085 (comment)

So I'm starting to lean towards it being easiest initially to implement httpx alongside aiohttp, and you can specify which one to use upon initialization of the session. This way I don't have to initially bother with full support for niche stuff like proxies, ssl certificates, etc etc (which in several cases doesn't even seem to have tests, which makes it very hard for me to guarantee functionality isn't changed) and can much more easily break it up into multiple PRs. And then when the httpx implementation is done deprecation warnings can be added to aiohttp-specific stuff.

Copy link

This issue has been marked as stale because it has been inactive for more than 60 days. Please update this pull request or it will be automatically closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 20, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 27, 2024
@jakkdl
Copy link
Contributor

jakkdl commented Aug 31, 2024

This likely shouldn't be closed as not planned unless we're dropping #1085

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

8 participants