Keepalive connections aren't released when closing the ConnectionPool #720

Closed
edocod1 opened this issue Jan 4, 2020 · 3 comments · Fixed by #721
Labels: bug (Something isn't working), pooling (Issues and PRs relative to connection pooling)

Comments

@edocod1

edocod1 commented Jan 4, 2020

Hello. I am having an issue where it looks like connections aren't being closed correctly: after I reach a number of requests equal to the "hard_limit" of pool_limits, I get a PoolTimeout exception.

I tried upgrading to httpx==0.10.1, with no success.

Minimal example:

import httpx, asyncio
from httpx import PoolLimits
from random import randint

queue = asyncio.Queue()

clients = [
	httpx.AsyncClient(
		http2=True,
		pool_limits=PoolLimits(soft_limit=2, hard_limit=10),
		cookies={'a': '123456789', 'b': '987654321'},
	)
]

async def worker_loop(cid, client, queue):
	while 1:
		sub_id = await queue.get()

		async with client as c:
			r = await c.get(f'https://mywebsite.dummy/submission.php?id={sub_id}')

		if r.status_code != 200:
			print(cid, f'Got status code {r.status_code} while parsing {sub_id}')
			return

async def main():
	for i in range(2500):
		await queue.put(randint(1, 80000000))

	for k, v in enumerate(clients):
		asyncio.create_task(worker_loop(k, v, queue))

	while 1:
		if queue.qsize() == 0:
			await queue.put(randint(1, 80000000))
		await asyncio.sleep(2)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.stop()

I checked with netstat, and only one actual connection is opened to the IP address, so pooling seems to work fine.
I really cannot understand why. I even tried calling "aclose()" directly, without the "async with" block, but it made no difference.

@florimondmanca
Member

> only one actual connection is opened to the IP address, so pooling seems to work fine

I don't think that's actually the behavior you'd want (though, given your code, it is the behavior one should expect): since requests are concurrent here, HTTPX will use as many connections as it can to perform the requests, i.e. up to 10 (the hard limit). Once one of the first 10 requests terminates, its connection can be reused by a subsequent request.

So what you'd expect is to have 10 connections almost constantly being used, but you only get one, plus a pool timeout…

Your code is similar in spirit to this:

while True:
    url = await get_next_url()
    async with client:
        response = await client.get(url)

The async with client block is what causes the connection pool to be set up and then torn down. So in practice, you are tearing down the connection pool after each request, meaning you're only ever able to use one connection at a time (any previous connection has been thrown away by closing the pool once the previous request terminated).

What you need to do for the connection pool to be used effectively is to move the async with block to the top level:

async with client:
    while True:
        url = await get_next_url()
        response = await client.get(url)

With this I am able to run your program without issues against a local dummy server.
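
(Equivalently, if you'd rather not use async with, the same lifecycle can be written with a single aclose() call at the end; a minimal sketch:)

client = httpx.AsyncClient()
try:
    while True:
        url = await get_next_url()
        response = await client.get(url)
finally:
    # Open the pool once, close it once after all requests are done.
    await client.aclose()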

But that doesn't explain the pool timeout…


I am able to reproduce the timeout issue with this example:

import asyncio

import httpx


async def main() -> None:
    url = "http://localhost:8000"
    max_connections = 2
    client = httpx.AsyncClient(
        pool_limits=httpx.PoolLimits(soft_limit=1, hard_limit=max_connections)
    )

    for _ in range(max_connections + 1):
        async with client:
            await client.get(url)


if __name__ == "__main__":
    asyncio.run(main())

(Note that if we swap the nesting of the for _ in range() loop and the async with client block, as suggested above, there's no error.)

I pinpointed the origin of the issue: we don't release the internal max_connections semaphore for the keepalive connections that get closed when the connection pool itself is closed. PR incoming!
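
(To make the failure mode concrete, here is a hypothetical sketch of a pool with this bug; it is not httpx's actual internals. Each connection holds one semaphore permit, and closing the pool drops keepalive connections without releasing their permits, so repeated open/close cycles exhaust the limit:)

import asyncio

class LeakyPool:
    # Hypothetical illustration of the bug, not httpx's real ConnectionPool.
    def __init__(self, hard_limit: int) -> None:
        self._semaphore = asyncio.Semaphore(hard_limit)
        self._keepalive = []

    async def acquire(self):
        await self._semaphore.acquire()  # one permit held per connection
        conn = object()
        self._keepalive.append(conn)
        return conn

    async def aclose(self):
        # Bug: keepalive connections are dropped, but their permits are never
        # released, so each open/close cycle consumes permits for good and a
        # later acquire() blocks until the pool timeout fires.
        # Fix: release one permit per keepalive connection before dropping it:
        #     for _ in self._keepalive:
        #         self._semaphore.release()
        self._keepalive.clear()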

@florimondmanca florimondmanca added the bug Something isn't working label Jan 4, 2020
@florimondmanca florimondmanca changed the title Connections don't get closed Keepalive connections aren't released when closing the ConnectionPool Jan 4, 2020
@florimondmanca florimondmanca added the pooling Issues and PRs relative to connection pooling label Jan 4, 2020
@edocod1
Author

edocod1 commented Jan 4, 2020

Oh, I thought instantiating the client opened the connection pool, and you'd need to use "async with" to let the pool know you wanted a connection. Please close this issue if you feel like it. Thank you!

@zeldrinn

@florimondmanca is this truly fixed? I'm running into something that seems similar, yet I'm using version 0.13.3, which looks to contain the fix. Specifically, if the number of tasks executed via asyncio.gather(...) is greater than max_connections, I get a PoolTimeout. It seems like maybe this is happening because completed tasks aren't releasing their connections?

That said, I'm fairly new to asyncio, so it's possible I'm doing something wrong. Here's an example that throws a PoolTimeout:

import asyncio
import httpx


async def main() -> None:
    url = "https://www.example.com"
    max_connections = 2
    timeout = httpx.Timeout(3.0, connect=5.0, pool=2.0)
    limits = httpx.Limits(max_connections=max_connections)
    client = httpx.AsyncClient(timeout=timeout, pool_limits=limits)

    async with client:
        tasks = []
        for _ in range(max_connections + 1):
            tasks.append(client.get(url))
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    finally:
        loop.close()
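
fwiw, capping the number of in-flight requests with an asyncio.Semaphore keeps the number of tasks waiting on the pool at or below its connection count, which should sidestep the timeout; a sketch reusing the url and limits from above:

import asyncio
import httpx


async def main() -> None:
    url = "https://www.example.com"
    max_connections = 2
    limits = httpx.Limits(max_connections=max_connections)
    semaphore = asyncio.Semaphore(max_connections)

    async def fetch(client: httpx.AsyncClient) -> httpx.Response:
        # At most max_connections requests run at once, so no task sits in
        # the pool's wait queue long enough to trigger PoolTimeout.
        async with semaphore:
            return await client.get(url)

    async with httpx.AsyncClient(pool_limits=limits) as client:
        await asyncio.gather(*(fetch(client) for _ in range(max_connections + 1)))


if __name__ == "__main__":
    asyncio.run(main())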
