Memory leak for simple get call #4618

Closed

dima-kov opened this issue Mar 10, 2020 · 44 comments
@dima-kov

dima-kov commented Mar 10, 2020

🐞 Describe the bug

I have faced a memory leak in my program: after loading 100k URLs the process eats an additional 300 MB of RAM. I extracted the piece of code below, which reproduces the memory issue, and ran it with mprof for 5 min while loading 4k URLs:

[mprof graph: aiohttp-memory-leak]

💡 To Reproduce

Use env:

aiohttp==4.0.0a1
async-timeout==3.0.1
attrs==19.3.0
chardet==3.0.4
cycler==0.10.0
idna==2.9
kiwisolver==1.1.0
matplotlib==3.2.0
memory-profiler==0.57.0
multidict==4.7.5
numpy==1.18.1
psutil==5.7.0
pyparsing==2.4.6
python-dateutil==2.8.1
six==1.14.0
typing-extensions==3.7.4.1
uvloop==0.14.0
yarl==1.4.2

Tested with different versions of aiohttp (>3.5)

Code:

import asyncio
import uvloop
import aiohttp

url = 'https://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson'

load_times = 4000


def get_to_load():
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    response = await session.get(url)
    return await response.text()


async def load(session, worker_id):
    to_load = get_to_load()
    while to_load is not None:
        r = await fetch(session, to_load)
        print('Done', worker_id)
        await asyncio.sleep(0.2)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

asyncio.run(asyncio.sleep(3))
asyncio.run(main())
asyncio.run(asyncio.sleep(5))

  1. Run for 5 minutes with the profiler.

💡 Expected behavior
Memory usage should stay roughly flat during the whole program run.

📋 Your version of the Python
Python 3.7.5 (Also tested on 3.8)

📋 Your version of the aiohttp/yarl/multidict distributions

  • aiohttp:
Name: aiohttp
Version: 4.0.0a1
  • yarl:
Name: yarl
Version: 1.4.2
  • multidict
Name: multidict
Version: 4.7.5

📋 Additional context
I have read all the issues related to memory leaks, but have not found any appropriate solution. Here is what I tried:

  • update all packages
  • aiohttp 4.0
  • python 3.8
  • uvloop implementation instead of asyncio
  • a single aiohttp.ClientSession instead of a new one on every request (actually, this gave a slight improvement)

📋 Environment

  • OSX
ProductVersion: 10.15.1
BuildVersion:   19B88
  • also ran in Docker (Linux), with the same result
@dima-kov dima-kov added the bug label Mar 10, 2020
@KKomarov

Such a session helps:
aiohttp.ClientSession(cookie_jar=aiohttp.DummyCookieJar())
Also, your request objects are not closed explicitly.

@socketpair
Contributor

async def fetch(session, url):
    async with session.get(...) as x:
        return await x.text()

@dima-kov
Author

Such a session helps:
aiohttp.ClientSession(cookie_jar=aiohttp.DummyCookieJar())
Also, your request objects are not closed explicitly.

@KKomarov but why are they not closed explicitly? According to the docs, as I understood it, calling .text() should be enough?

@dima-kov
Author

@KKomarov, the dummy cookie jar has not helped:
[mprof graph: async_fix_dummy_cookies]

@dima-kov
Author

@socketpair, here is the result; no improvement either:
[mprof graph: async_fix_fail]

@KKomarov

@dima-kov closing responses helped for me; memory is about 100 MB. Maybe it's closed after calling text(), but not immediately.

@dima-kov
Author

dima-kov commented Mar 20, 2020

@KKomarov, could you share your code, please? I tried with this and failed:

import asyncio
import uvloop
import aiohttp

url = 'https://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson'
load_times = 4000


def get_to_load():
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    response = await session.get(url)
    t = await response.text()
    response.close()
    return t


async def load(session, worker_id):
    to_load = get_to_load()
    print(f'start {worker_id}')
    while to_load is not None:
        r = await fetch(session, to_load)
        print('Done', worker_id)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession(cookie_jar=aiohttp.DummyCookieJar()) as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
asyncio.run(asyncio.sleep(3))
asyncio.run(main())
asyncio.run(asyncio.sleep(5))

[mprof graph: async_fix_fail_second]

@KKomarov

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

@hh-h
Contributor

hh-h commented Apr 10, 2020

this is a graph for this code

import asyncio
import aiohttp

url = 'https://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson'
load_times = 4000


def get_to_load():
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()


async def load(session, worker_id):
    to_load = get_to_load()
    print(f'start {worker_id}')
    while to_load is not None:
        await fetch(session, to_load)
        print('Done', worker_id)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


asyncio.run(asyncio.sleep(3))
asyncio.run(main())
asyncio.run(asyncio.sleep(5))

but I suggest rewriting it to something like this:

import asyncio
import aiohttp
from typing import Coroutine

URL = 'https://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson'
MAX_LOAD_TIMES = 500


async def fetch(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as r:
        return await r.text()


async def limited(semaphore: asyncio.Semaphore, coro: Coroutine) -> None:
    async with semaphore:
        await coro


async def main(workers_num=90):
    await asyncio.sleep(3)
    semaphore = asyncio.Semaphore(workers_num)
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[limited(semaphore, fetch(session, URL)) for _ in range(MAX_LOAD_TIMES)])
    await asyncio.sleep(5)

asyncio.run(main())

graph (decreased load_times to 500)

@socketpair
Contributor

socketpair commented Apr 10, 2020

A week ago I faced almost the same problem. It seems Python causes huge memory fragmentation. So in order to account for memory you should consider two measures:

  1. the number of anonymous pages of the process
  2. the sum of the arguments of all malloc() calls that were not free()d

Probably, due to enormous memory fragmentation, measure 1 will grow and measure 2 will not.

I'm not sure, but possibly you need to use the tracemalloc module.
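
A minimal sketch of how those two measures could be compared side by side, assuming psutil (already in the pinned environment above) for the process RSS and the stdlib tracemalloc for Python-level allocations; the report() helper is only illustrative:

import os
import tracemalloc

import psutil


def report(label: str) -> None:
    # RSS includes allocator fragmentation; tracemalloc only counts live
    # Python-level allocations made after tracemalloc.start().
    rss = psutil.Process(os.getpid()).memory_info().rss / 1024**2
    current, peak = tracemalloc.get_traced_memory()
    print(f"{label}: rss={rss:.1f} MiB, "
          f"traced={current / 1024**2:.1f} MiB (peak {peak / 1024**2:.1f} MiB)")


tracemalloc.start()
report("baseline")
# ... run the reproducer here, then call report() again; if RSS keeps growing
# while the traced figure stays flat, fragmentation (measure 1) rather than a
# Python-level leak (measure 2) is the likely culprit.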

@Luis310C

Has this bug not been fixed yet?

@socketpair
Contributor

No, not fixed. Do you have a sequence of steps to reproduce it?

@webknjaz webknjaz added the reproducer: missing This PR or issue lacks code, which reproduce the problem described or clearly understandable STR label Aug 31, 2021
@hf-kklein

relates to #4833

@Dreamsorcerer
Member

This feels like a CPython bug.

Even when I add to the end of the script:

gc.collect()
time.sleep(30)

The memory usage still doesn't go down. This is after asyncio.run() has completed, so there should be no references to anything still around.

@Dreamsorcerer Dreamsorcerer removed the reproducer: missing This PR or issue lacks code, which reproduce the problem described or clearly understandable STR label Aug 23, 2024
@bdraco
Member

bdraco commented Sep 21, 2024

Can you reproduce it if you fetch a plaintext non-SSL URL?

@Dreamsorcerer
Member

Dreamsorcerer commented Sep 26, 2024

Hmm, it might be the Python parser...

I'm running the first example at: #4618 (comment)

If I just run it with aiohttp 3.10.6 installed, then memory usage sits at 1.6% the entire time. It doesn't increase over time or anything.
But when I run it from the repo with PYTHONPATH=/path/to/aiohttp, I see the memory constantly increasing the whole time the script is running (reaching >2% within about 10 seconds and continuing to climb). When running from the repo source, the C parsers won't be used, so that's the most likely suspect to me.

@Dreamsorcerer
Member

Yes, if I run it the same way with AIOHTTP_NO_EXTENSIONS='1' on the installed version, then I get the memory leak. Definitely the Python code, which is probably why we haven't seen so many complaints, as not many users will be running this.
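
For anyone reproducing this, a minimal sketch of forcing the pure-Python parsers from inside the script rather than from the shell. As far as I know the flag is read when aiohttp is imported, so it has to be set first:

import os

# Must be set before aiohttp is imported; the C-extension fallback is
# decided at import time.
os.environ["AIOHTTP_NO_EXTENSIONS"] = "1"

import aiohttp  # noqa: E402

print(aiohttp.__version__)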

@Dreamsorcerer
Member

Commenting out the other imports, it's definitely caused by HttpResponseParser. There's no leak if I just use that one class from the C module.

@Dreamsorcerer
Member

tracemalloc seems useless. It seems to suggest there is more memory allocated when using the C extension than when there's an obvious memory leak...

@Dreamsorcerer
Member

Same results when testing with aiohttp 3.6.2 and 3.10.6, and on Python 3.8 and 3.10.

Not sure what's happening, but it's trivial to reproduce by looking at top and seeing the memory keep going up.

@Dreamsorcerer
Member

Through trial and error, I think I've narrowed it down to StreamReader or HttpPayloadParser at:

payload = StreamReader(
    self.protocol,
    timer=self.timer,
    loop=loop,
    limit=self._limit,
)
payload_parser = HttpPayloadParser(
    payload,
    length=length,
    chunked=msg.chunked,
    method=method,
    compression=msg.compression,
    code=self.code,
    response_with_body=self.response_with_body,
    auto_decompress=self._auto_decompress,
    lax=self.lax,
)

But, struggling to figure out any more than that.

@bdraco
Member

bdraco commented Sep 28, 2024

ClientConnectionError objects seem to build up in memory. I wonder if they hold a reference that is preventing GC.
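
A minimal sketch of one way to check that, using only the stdlib gc module (the helper name is illustrative): find any ClientConnectionError instances that survive a collection and print what still refers to them.

import gc

from aiohttp import ClientConnectionError


def dump_connection_error_referrers(limit: int = 5) -> None:
    # Collect first so only genuinely reachable exceptions remain.
    gc.collect()
    errors = [o for o in gc.get_objects() if isinstance(o, ClientConnectionError)]
    print(f"{len(errors)} ClientConnectionError instances still alive")
    for err in errors[:limit]:
        # Whatever shows up here (a frame, a future, a protocol object)
        # is what keeps the exception from being freed.
        for ref in gc.get_referrers(err):
            print(type(ref).__name__, repr(ref)[:120])

Calling this at the end of the reproducer, after asyncio.run() returns, would show whether the exceptions are really retained or just not yet collected.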

@bdraco
Member

bdraco commented Sep 28, 2024

At the end, after a gc pass:

function                   6163
tuple                      3771
dict                       2746
wrapper_descriptor         1515
ReferenceType              1487
method_descriptor          1099
builtin_function_or_method 1063
getset_descriptor          1014
type                       917
list                       665

@bdraco
Member

bdraco commented Sep 28, 2024

import gc
import asyncio
import aiohttp
import objgraph
import pprint
url = 'http://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson'
load_times = 4000


def get_to_load():
    print(objgraph.show_growth())
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()


async def load(session, worker_id):
    to_load = get_to_load()
    print(f'start {worker_id}')
    while to_load is not None:
        await fetch(session, to_load)
        print('Done', worker_id)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


asyncio.run(asyncio.sleep(3))
asyncio.run(main())
print("common")
print(objgraph.show_most_common_types())
asyncio.run(asyncio.sleep(5))
gc.collect()
print("common after")
print(objgraph.show_most_common_types())


def _safe_repr(obj) -> str:
    """Get the repr of an object but keep going if there is an exception.

    We wrap repr to ensure if one object cannot be serialized, we can
    still get the rest.
    """
    try:
        return repr(obj)
    except Exception:  # noqa: BLE001
        return f"Failed to serialize {type(obj)}"

for obj in objgraph.by_type('list'):
    print(_safe_repr(obj))

@Dreamsorcerer
Member

I'm not so sure of my previous analysis anymore. I tested it for longer, and with the Python parser the memory usage never exceeds 4%. It might just be that it's less memory efficient than the C parser, and so much slower that it takes about a minute to reach peak memory usage instead of only a few seconds with the C parser.

@bdraco
Member

bdraco commented Oct 3, 2024

I did notice we create a lot of small objects, and GC overhead starts to affect performance when request overhead is high. We can probably save a bit of RAM for these with slots.

@bdraco
Member

bdraco commented Oct 3, 2024

Some of those objects are now dataclasses in 4.x, which don't support slots on all Python versions we support.

@bdraco
Member

bdraco commented Oct 3, 2024

Python 3.10 is needed for dataclass slots. I think we solved a similar problem with a different solution for ESPHome.
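
For context, a minimal sketch of that constraint (the class names are made up, not aiohttp code): dataclasses only grew a slots option in Python 3.10, while a hand-written __slots__ class gives the same memory layout on older versions.

from dataclasses import dataclass


# Python 3.10+ only: dataclasses gained the slots flag in 3.10.
@dataclass(slots=True)
class TimerState:
    started: bool = False


# Works on older Pythons: declare __slots__ by hand so instances skip
# the per-instance __dict__.
class TimerStateCompat:
    __slots__ = ("started",)

    def __init__(self, started: bool = False) -> None:
        self.started = started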

@bdraco
Member

bdraco commented Oct 3, 2024

objgraph can better tell us which ones churn, and then we can add __slots__ to any that can use it. That should help a bit with the GC overhead.

I haven’t been able to get it to leak though

@Dreamsorcerer
Member

I haven’t been able to get it to leak though

Your memory usage is the same at the start of the script, and when it's sleeping at the end?
On mine, the memory usage seems to only drop by a small portion of what it went up by.
e.g. starts at 1%, reaches a peak of 3.6%, then while sleeping at the end it's still at 3%. Or something along those lines. Measured just by looking at top.

@bdraco
Member

bdraco commented Oct 3, 2024

Here are the objects that we churn

CIMultiDict 363 +3 -- nothing we can do here
ClientConnectionError 95 +3 -- we can avoid creating this one #9405
list 1675 +2 -- nothing we can do here
deque 634 +2 -- nothing we can do here
SplitResult 186 +2 -- nothing we can do here, part of yarl implementation
URL 183 +2 -- nothing we can do here, part of yarl implementation
SimpleCookie 181 +2 -- inherits from dict, nothing we can do here
StreamReader 182 +2 -- #9407
TimerHandle 137 +1 -- asyncio; cpython already implements __slots__ https://github.com/python/cpython/blob/1f9025a4e7819bb4f7799784710f0f3750a9ca31/Lib/asyncio/events.py#L107
TimerContext 92 +1 -- #9406

@bdraco
Member

bdraco commented Oct 3, 2024

I haven’t been able to get it to leak though

Your memory usage is the same at the start of the script, and when it's sleeping at the end? On mine, the memory usage seems to only drop by a small portion of what it went up by. e.g. starts at 1%, reaches a peak of 3.6%, then while sleeping at the end it's still at 3%. Or something along those lines. Measured just by looking at top.

Memory usage goes up, but I didn't find any Python object leaks.

bdraco added a commit that referenced this issue Oct 3, 2024
Every connection will end with the connection closed
exception. Only create it once since it's always the
same.

related issue #4618
bdraco added a commit that referenced this issue Oct 3, 2024
We use __slots__ almost everywhere else in the codebase, however
__slots__ was missing for these helpers

related issue #4618
bdraco added a commit that referenced this issue Oct 3, 2024
We use `__slots__` almost everywhere else in the codebase, however `__slots__`
were not implemented in stream

related issue #4618
@bdraco
Member

bdraco commented Oct 3, 2024

import gc
import asyncio
import aiohttp
import psutil
import os

url = "http://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson"
load_times = 4000


def get_to_load():
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()


async def load(session, worker_id):
    to_load = get_to_load()
    while to_load is not None:
        await fetch(session, to_load)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


async def single_request():
    async with aiohttp.ClientSession() as session:
        await fetch(session, url)


mem_before = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory before--")
print(mem_before)
asyncio.run(asyncio.sleep(3))

asyncio.run(single_request())
asyncio.run(asyncio.sleep(2))
print("--memory after single request--")
mem_after_single = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print(mem_after_single)

asyncio.run(main())
asyncio.run(asyncio.sleep(5))
mem_after = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory after, before gc--")
print(mem_after)
gc.collect()
mem_after_gc = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory after, after gc--")
print(mem_after_gc)

Measuring RSS.

@bdraco
Member

bdraco commented Oct 3, 2024

with linked PRs

--memory before--
33.34375
--memory after single request--
41.265625
--memory after, before gc--
538.578125
--memory after, after gc--
538.578125

@bdraco
Member

bdraco commented Oct 3, 2024

before linked PRs

--memory before--
33.765625
--memory after single request--
41.21875
--memory after, before gc--
543.09375
--memory after, after gc--
543.09375

So it helps a little but not that much

@bdraco
Member

bdraco commented Oct 3, 2024

more fixes in the linked PR

(venv) bdraco@MacBook-Pro-5 aiohttp % python3 rss.py 
--memory before--
35.046875
--memory after single request--
42.65625
--memory after, before gc--
491.125
--memory after, after gc--
492.828125

So now a nice reduction, but still the memory isn't freed

@bdraco
Member

bdraco commented Oct 3, 2024

oh wait, we need to trim

@bdraco
Member

bdraco commented Oct 3, 2024

with trim

import gc
import asyncio
import aiohttp
import psutil
import os
import ctypes

url = "http://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson"
load_times = 4000


def trim_memory() -> int:
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)


def get_to_load():
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()


async def load(session, worker_id):
    to_load = get_to_load()
    while to_load is not None:
        await fetch(session, to_load)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


async def single_request():
    async with aiohttp.ClientSession() as session:
        await fetch(session, url)


mem_before = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory before--")
print(mem_before)
asyncio.run(asyncio.sleep(3))

asyncio.run(single_request())
asyncio.run(asyncio.sleep(2))
print("--memory after single request--")
mem_after_single = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print(mem_after_single)

asyncio.run(main())
asyncio.run(asyncio.sleep(5))
mem_after = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory after, before gc--")
print(mem_after)
gc.collect()
mem_after_gc = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory after, after gc--")
print(mem_after_gc)
trim_memory()
mem_after_trim = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print("--memory after, after trim--")
print(mem_after_trim)

@bdraco
Copy link
Member

bdraco commented Oct 3, 2024

The trim test will only work on glibc.
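
A minimal sketch of a more defensive variant of trim_memory that degrades to a no-op where malloc_trim isn't available (macOS, musl); just an illustration, not part of the fix:

import ctypes
import ctypes.util


def trim_memory() -> int:
    """Ask glibc to return freed arena memory to the OS; no-op elsewhere."""
    libc_name = ctypes.util.find_library("c")
    if not libc_name:
        return 0
    libc = ctypes.CDLL(libc_name)
    # CDLL attribute lookup raises AttributeError if the symbol is missing,
    # e.g. on macOS or musl-based systems.
    if not hasattr(libc, "malloc_trim"):
        return 0
    return libc.malloc_trim(0)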

@bdraco
Copy link
Member

bdraco commented Oct 3, 2024

root@ubuntu:~# python3 rss.py
--memory before--
31.41015625
--memory after single request--
33.14453125
--memory after, before gc--
153.48828125
--memory after, after gc--
137.015625
--memory after, after trim--
36.546875

doesn't seem to leak after trim.

@bdraco
Member

bdraco commented Oct 3, 2024

Another version with a loop to run it twice, to make sure the RAM doesn't go up after the second iteration.

(sorry for the screen shot, running on a remote term)

[Screenshot 2024-10-03 at 12.03.08 PM]
import gc
import asyncio
import aiohttp
import psutil
import os
import ctypes

url = "http://docs.mapbox.com/mapbox-gl-js/assets/earthquakes.geojson"
load_times = 4000


def trim_memory() -> int:
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)


def get_to_load():
    global load_times
    if load_times > 0:
        load_times -= 1
        return url
    return None


async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()


async def load(session, worker_id):
    to_load = get_to_load()
    while to_load is not None:
        await fetch(session, to_load)
        to_load = get_to_load()


async def main(workers_num=90):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[load(session, i) for i in range(workers_num)])


async def single_request():
    async with aiohttp.ClientSession() as session:
        await fetch(session, url)


for _ in range(2):
    mem_before = psutil.Process(os.getpid()).memory_info().rss / 1024**2
    print("--memory before--")
    print(mem_before)
    asyncio.run(asyncio.sleep(3))

    asyncio.run(single_request())
    asyncio.run(asyncio.sleep(2))
    print("--memory after single request--")
    mem_after_single = psutil.Process(os.getpid()).memory_info().rss / 1024**2
    print(mem_after_single)

    asyncio.run(main())
    asyncio.run(asyncio.sleep(5))
    mem_after = psutil.Process(os.getpid()).memory_info().rss / 1024**2
    print("--memory after, before gc--")
    print(mem_after)
    gc.collect()
    mem_after_gc = psutil.Process(os.getpid()).memory_info().rss / 1024**2
    print("--memory after, after gc--")
    print(mem_after_gc)
    trim_memory()
    mem_after_trim = psutil.Process(os.getpid()).memory_info().rss / 1024**2
    print("--memory after, after trim--")
    print(mem_after_trim)

@bdraco
Member

bdraco commented Oct 3, 2024

So I'd say there is no leak, and the test case was missing a call to malloc_trim at the end.

At least we discovered a few places where __slots__ were missing, which helps reduce the memory churn.

@Dreamsorcerer
Member

Ah, OK. I'd have assumed that CPython would do any relevant trimming when the memory usage drops by a significant amount (though maybe that amount is just higher than my expectations).

Anyway, I think we can call this resolved.
