Memory leak in head get requests over SSL #1767

Closed
arthurdarcet opened this issue Mar 29, 2017 · 27 comments

@arthurdarcet
Contributor

I'm trying to track down a memory leak but I'm not having much luck… Could you have a look?

Here is the leaking code:

import resource, gc

last = 0
def mem(*_):
    gc.collect()
    global last
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux;
    # dividing by 1024 gives kB on macOS
    current = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
    print('Variation:', current - last, 'kB')
    last = current


import asyncio, aiohttp

async def main():
    conn = aiohttp.TCPConnector(verify_ssl=False)
    session = aiohttp.ClientSession(connector=conn)
    url = 'https://127.0.0.1:8081'

    mem()
    for _ in range(3000):
        async with session.head(url):
            pass
        mem()

asyncio.get_event_loop().run_until_complete(main())

I'm using this HTTPS server to test:

import ssl, asyncio
from aiohttp import web

async def handle(request):
    await request.read()
    return web.Response(text='ok')

app = web.Application()
app.router.add_get('/', handle)
handler = app.make_handler()

sslcontext = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
# openssl req  -nodes -new -x509  -keyout server.key -out server.cert
sslcontext.load_cert_chain('server.cert', 'server.key')

loop = asyncio.get_event_loop()
loop.run_until_complete(loop.create_server(handler, '0.0.0.0', 8081, ssl=sslcontext))
loop.run_forever()

The first script leaks around 30 kB of RAM for every request (between 8 kB and 60 kB).

From what I see with objgraph / pympler, the leaking objects are ResponseHandlers. I'm not sure why they aren't garbage collected, though.

@fafhrd91
Member

maybe related to #1756
I will look into this problem today

@fafhrd91
Member

could you check the 2.0 branch? I found a bug

@arthurdarcet
Contributor Author

nope, still leaking. You can check for yourself with the first script; it only has aiohttp as a dependency and shows Variation: 20 kB for each request (meaning another 20 kB of RAM was allocated). It should show Variation: 0.

@fafhrd91
Member

could you test again with aiohttp.TCPConnector(verify_ssl=False, disable_cleanup_closed=True)

@arthurdarcet
Contributor Author

that's not a valid kwarg for TCPConnector; did you mean force_close?

@fafhrd91
Member

hmm, try this in your script:

conn = aiohttp.TCPConnector(verify_ssl=False)
conn._cleanup_closed_disabled = True

@arthurdarcet
Contributor Author

yes, that solves it

I'm not entirely sure this is a memory leak: the memory usage increases way too much IMO, but it seems to be slowly stabilising.
I've graphed the memory increase after each request:
[graph: memory variation after each request]
Until 3k-4k requests, the memory usage increases by roughly 30 kB per request. After that, it slows down and eventually stops using more memory.

I'm guessing this could be normal if the responses are kept somewhere for a few seconds before being cleaned up?

@fafhrd91
Member

you can disable the connection pool in the connector; in that case connections should be released immediately.
But remember the SSL transport also has a shutdown procedure, and currently, if something goes wrong during
shutdown, asyncio leaks the connection (that is the purpose of _cleanup_closed_disabled).
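
For example, something along these lines on the client side (a sketch, not from the original report; force_close=True tells the TCPConnector not to pool connections):

conn = aiohttp.TCPConnector(verify_ssl=False, force_close=True)  # close each connection after its request
session = aiohttp.ClientSession(connector=conn)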

@arthurdarcet
Contributor Author

The underlying issue here is a real memory leak:
We have a very simple service proxying requests: we receive GET/PUT/POST requests on an aiohttp server (a basic HTTP server), and we make GET/POST/HEAD requests to another backend using the client part of aiohttp (these requests are over SSL).

This basic proxy process is leaking around 300 MB of RAM per hour (the process receives around 1 request per second and makes 1-2 requests per second to the backend). Currently we're restarting the process regularly to avoid killing our servers…
It looks like the problem appeared with aiohttp 2, but I'm not entirely sure when. I'm having trouble replicating the issue locally, so bisecting will be painful…

I'll continue investigating the issue, but I don't think the connection pooling or shutdown procedures are part of the problem here: the requests are made relatively slowly, with low concurrency (usually 1, at most 3-5).

--

Also, the problem is really specific to HEAD requests, which is very weird, I guess. The exact same script with GET requests does not leak anything:
If I leave _cleanup_closed_disabled at False, with HEAD requests, the memory usage increases by around 30 kB for each request during the first 3,000 requests (after that it becomes erratic).
If I do the exact same thing (_cleanup_closed_disabled = False) but with GET requests instead, the memory usage is perfectly stable and does not increase at all with each request.

@fafhrd91
Member

interesting, I'll look too

@fafhrd91
Member

I can not reproduce the problem with your original scripts.
But it could be an actual leak; something may be storing refs to the objects. Could you check which objects are in memory and how many of each, with something like:

    import gc
    from pprint import pprint

    tps = {}
    for ob in gc.get_objects():
        tps[type(ob)] = tps.get(type(ob), 0) + 1

    pprint(tps)

@arthurdarcet
Contributor Author

I'm tracking the created objects with objgraph: the aiohttp objects leaking are ResponseHandlers.
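
For reference, this is roughly the kind of objgraph usage involved (a sketch, not the exact commands from this investigation; show_backrefs needs graphviz installed to render the graph):

import gc
import objgraph

gc.collect()
objgraph.show_growth(limit=10)   # per-type object-count growth since the previous call

# check who is still holding on to the leaked handlers
handlers = objgraph.by_type('ResponseHandler')
if handlers:
    objgraph.show_backrefs(handlers[:3], max_depth=5,
                           filename='responsehandler-refs.png')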

@fafhrd91
Member

do you see other aiohttp related objects? what about asyncio?

@fafhrd91
Member

also, could you check who holds a reference to the ResponseHandler objects?

@arthurdarcet
Contributor Author

These are the new objects after HEAD request number 301 (I do 300 requests, then start the diff, do one more request, and print the diff):

list                        5375       +45
dict                        4439        +5
method                       671        +2
deque                        605        +2
weakref                     2057        +1
_SSLProtocolTransport        301        +1
SSLProtocol                  301        +1
ResponseHandler              301        +1
_SSLPipe                     301        +1
ServerDisconnectedError      300        +1
SSLObject                    301        +1
TimerHandle                    5        +1

As for references: some ResponseHandlers do not have any reference pointing to them; those that do are referenced from the event loop:
[objgraph backref chain image]

@fafhrd91
Member

I do not see any aiohttp objects except ResponseHandler, and the handler is an asyncio protocol.
It seems SSLProtocol holds a reference to ResponseHandler, which makes sense. Maybe the SSL transports are stuck in the shutdown state?

try to replace this line https://github.com/aio-libs/aiohttp/blob/2.0/aiohttp/connector.py#L463
with "abort()", and also check that transport.close() gets called.

@fafhrd91
Member

wait, don't replace it; right after "protocol.close()", add "transport.abort()"
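
In other words, roughly this inside the connector's cleanup path (a sketch; the surrounding connector code is paraphrased, not copied from the 2.0 branch):

protocol.close()
transport.abort()  # drop the SSL transport immediately instead of waiting for the shutdown handshake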

@arthurdarcet
Contributor Author

actually, I think all this is a false lead:
It makes perfect sense that the ResponseHandlers stay in memory for a while, until the SSL connection shuts down.
The memory usage increases for a few seconds (roughly the first 3k requests, but that number probably depends on the number of requests per second I make).
Then the first requests that were made start shutting down properly, memory is released correctly, and the memory usage stops increasing.

So everything is perfectly normal here. The only weird thing is that this does not happen with GET requests, but that might have something to do with connection pooling being done only for GET requests and not for HEAD requests?

@fafhrd91
Member

does the HEAD request return a payload? If it returns any data, the connector closes the connection; for GET, on the other hand, the connector can properly detect the end of the message and can reuse the connection.

@arthurdarcet
Contributor Author

The server serving the HEAD requests is aiohttp too, so that's probably it. This is completely unrelated to memory usage, then:

Here is what I see when I do a HEAD request against a simple aiohttp server:

> HEAD / HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.49.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: text/plain; charset=utf-8
< Content-Length: 2
< Date: Wed, 29 Mar 2017 18:03:09 GMT
< Server: Python/3.6 aiohttp/2.0.5
< 
^C

Is the Content-Length: 2 expected?

@fafhrd91
Member

do you return any data in the response?

@fafhrd91
Member

aiohttp.web does not prevent you from writing a body in the response to a HEAD request

@arthurdarcet
Contributor Author

Ok, I understand what's happening here:

  • My HTTP server is a basic aiohttp server with only a GET handler.
  • A HEAD handler is added automatically by aiohttp.
  • This HEAD handler correctly strips the body from the response returned by the GET handler.
  • But it does not strip the Content-Length header (if my GET handler returns web.Response(text='okuie'), I get a Content-Length: 5 header in the response to a HEAD request, with no body).
  • And all this keeps the connector from re-using the connection, which creates a new SSL connection for every request and looks a lot like a memory leak (even though it's definitely not a leak); see the sketch below.
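
For illustration, a minimal sketch of a server-side workaround in the meantime: register an explicit HEAD handler that returns an empty response, so neither a body nor a stale Content-Length is sent (the handler names are made up; add_route is used explicitly because whether add_get also auto-registers HEAD depends on the aiohttp version):

from aiohttp import web

async def handle_get(request):
    return web.Response(text='okuie')

async def handle_head(request):
    # explicit HEAD handler: empty response, no leftover Content-Length from the GET body
    return web.Response()

app = web.Application()
app.router.add_route('HEAD', '/', handle_head)
app.router.add_route('GET', '/', handle_get)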

@fafhrd91
Member

yes, this needs some cleanup in 2.1

@arthurdarcet
Contributor Author

ok. I'll open another issue to track this and close this one; nothing is wrong with the memory handling here (my memory leak must be somewhere else, I'll keep looking…)

@fafhrd91
Member

good
