Memory leak when asyncio.open_connection raise #88863

Closed
seer mannequin opened this issue Jul 21, 2021 · 9 comments · Fixed by #95739
Labels
3.11 only security fixes 3.12 bugs and security fixes performance Performance or resource usage topic-asyncio

Comments

@seer
Mannequin

seer mannequin commented Jul 21, 2021

BPO 44697
Nosy @brianquinlan, @asvetlov, @1st1, @aeros, @jdevries3133

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-07-21.13:46:19.757>
labels = ['expert-asyncio', '3.9', 'performance']
title = 'Memory leak when asyncio.open_connection raise'
updated_at = <Date 2021-08-05.20:05:54.604>
user = 'https://bugs.python.org/seer'

bugs.python.org fields:

activity = <Date 2021-08-05.20:05:54.604>
actor = 'jack__d'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['asyncio']
creation = <Date 2021-07-21.13:46:19.757>
creator = 'seer'
dependencies = []
files = []
hgrepos = []
issue_num = 44697
keywords = ['patch']
message_count = 4.0
messages = ['397945', '397982', '398219', '398234']
nosy_count = 7.0
nosy_names = ['bquinlan', 'asvetlov', 'yselivanov', 'aeros', 'jack__d', 'seer', 'whoKilledLora']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'resource usage'
url = 'https://bugs.python.org/issue44697'
versions = ['Python 3.9']


@seer
Mannequin Author

seer mannequin commented Jul 21, 2021

I wrote a short example that reproduces the leak.

import resource
import asyncio


class B:
    def __init__(self, loop):
        self.loop = loop
        self.some_big_data = bytearray(1024 * 1024)  # 1 MiB, to make the memory growth visible

    async def doStuff(self):
        if not await self.connect():
            return
        print('Stuff done')

    async def connect(self) -> bool:
        try:
            _, writer = await asyncio.open_connection('127.0.0.1', 12345, loop=self.loop)
            writer.close()
            return True
        except OSError as e:
            pass
        return False


class A:
    def __init__(self, loop):
        self.loop = loop

    async def doBStuff(self):
        b = B(self.loop)
        await b.doStuff()

    async def work(self):
        print('Working...')
        for _ in range(1000):
            await self.loop.create_task(self.doBStuff())
        print('Done.')
        print(
            'Memory usage {}kb'.format(
                resource.getrusage(
                    resource.RUSAGE_SELF).ru_maxrss))


async def amain(loop):
    a = A(loop)
    await a.work()


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(amain(loop))

100 cycles
"Memory usage 41980kb"

1000 cycles
"Memory usage 55412kb"

10000 cycles
"Memory usage 82880kb"

And so on...

Does anyone know a workaround?

@seer seer mannequin added topic-asyncio performance Performance or resource usage labels Jul 21, 2021
@whoKilledLora
Mannequin

whoKilledLora mannequin commented Jul 22, 2021

Confirmed. I have the same problem. I suspect this is related to https://bugs.python.org/issue41699.

@seer
Mannequin Author

seer mannequin commented Jul 26, 2021

Checked on 3.9.6 - still leaking.

Strangely, if I write

except OSError as e:
    del self

instead of

except OSError as e:
    pass

the leak disappears.

@seer seer mannequin added 3.9 only security fixes labels Jul 26, 2021
@aeros
Contributor

aeros commented Jul 26, 2021

Thank you Arteem, that should help indicate where the memory leak is present.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@kumaraditya303 kumaraditya303 added 3.11 only security fixes 3.12 bugs and security fixes and removed 3.9 only security fixes labels May 28, 2022
@kumaraditya303
Contributor

This seems more like a gc issue as adding gc.collect() calls after each iteration fixes the leak.

import resource
import asyncio
import gc


class B:
    def __init__(self, loop):
        self.loop = loop
        self.some_big_data = bytearray(1024 * 1024)  # 1 MiB, to make the memory growth visible

    async def doStuff(self):
        if not await self.connect():
            return
        print('Stuff done')

    async def connect(self) -> bool:
        try:
            _, writer = await asyncio.open_connection('127.0.0.1', 12345)
            writer.close()
            return True
        except OSError as e:
            pass
        return False


class A:
    def __init__(self, loop):
        self.loop = loop

    async def doBStuff(self):
        b = B(self.loop)
        await b.doStuff()

    async def work(self):
        print('Working...')
        for i in range(1000):
            await self.loop.create_task(self.doBStuff())
            gc.collect()
            if i % 100 == 0:
                print('Memory usage {}kb {}'.format(resource.getrusage(
                    resource.RUSAGE_SELF).ru_maxrss, i))
        print('Memory usage {}kb '.format(resource.getrusage(
            resource.RUSAGE_SELF).ru_maxrss, ))
        print('Done.')


async def amain(loop):
    a = A(loop)
    await a.work()


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(amain(loop))

cc @pablogsal

@pablogsal
Member

> This seems more like a gc issue as adding gc.collect() calls after each iteration fixes the leak.

That doesn't mean it is a GC issue. It means the GC is doing its job: the problem is that there are reference cycles, and they won't go away on their own until the GC runs.

There may be something we can do to help here, but from this description the GC is working as expected.
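To illustrate the point about reference cycles, here is a minimal sketch that does not involve asyncio at all. `Payload` and `make_cycle` are hypothetical names introduced only for this demonstration. On CPython, two objects that reference each other are not released by reference counting and are reclaimed only once the cyclic collector runs.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for an object that holds a large buffer."""


def make_cycle():
    a = Payload()
    b = Payload()
    a.other = b
    b.other = a  # a and b now reference each other
    return weakref.ref(a)  # a weak reference does not keep `a` alive


gc.disable()  # keep the automatic collector out of the way for the demo
try:
    ref = make_cycle()
    alive_before = ref() is not None  # True: refcounting alone cannot free a cycle
    gc.collect()
    alive_after = ref() is not None   # False: the cyclic GC reclaimed it
finally:
    gc.enable()

print(alive_before, alive_after)
```

In a real program the automatic collector runs eventually, so such objects are freed late rather than never, which matches the growing peak memory figures reported above.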

@kumaraditya303
Contributor

This might be related to #81001

@kumaraditya303
Contributor

After some investigation, the issue is that the raised exception's traceback forms ref cycles. These are eventually cleared by the gc, but until then they cause increased memory usage. The following example does not form ref cycles, and hence memory usage is constant.

import resource
import asyncio


class B:
    def __init__(self, loop):
        self.loop = loop
        self.some_big_data = bytearray(1024 * 1024)  # 1 MiB, to make the memory growth visible

    async def doStuff(self):
        if not await self.connect():
            return
        print('Stuff done')

    async def connect(self) -> bool:
        try:
            _, writer = await asyncio.open_connection('127.0.0.1', 12345)
            writer.close()
            return True
        except OSError as e:
            e.__traceback__ = None # Clear ref cycle manually
        return False


class A:
    def __init__(self, loop):
        self.loop = loop

    async def doBStuff(self):
        b = B(self.loop)
        await b.doStuff()

    async def work(self):
        print('Working...')
        for i in range(1000):
            await self.loop.create_task(self.doBStuff())
            if i % 100 == 0:
                print('Memory usage {}kb {}'.format(resource.getrusage(
                    resource.RUSAGE_SELF).ru_maxrss, i))
        print('Memory usage {}kb '.format(resource.getrusage(
            resource.RUSAGE_SELF).ru_maxrss, ))
        print('Done.')


async def amain(loop):
    a = A(loop)
    await a.work()


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(amain(loop))
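The mechanism can be reproduced without a network call. The sketch below is a simplified stand-in, not the asyncio code path: `Payload` and `fake_connect` are hypothetical names, and the cycle is created deliberately by keeping the caught exception in a local variable. The frame then refers to the exception, the exception to its traceback, and the traceback back to the frame.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for the object holding the 1 MiB buffer."""


def fake_connect(clear_traceback):
    payload = Payload()
    ref = weakref.ref(payload)
    try:
        raise OSError("connection refused")
    except OSError as e:
        kept = e  # frame -> kept -> traceback -> frame: a reference cycle
        if clear_traceback:
            e.__traceback__ = None  # drop the traceback, so no cycle forms
    return ref


gc.disable()  # make the result independent of automatic collections
try:
    # With the traceback intact, the frame (and its locals) outlives the call.
    leaked = fake_connect(clear_traceback=False)() is not None
    # With the traceback cleared, the locals are freed as soon as the call returns.
    freed = fake_connect(clear_traceback=True)() is None
    gc.collect()  # clean up the cycle left by the first call
finally:
    gc.enable()

print(leaked, freed)
```

On CPython both values should be `True`: the payload survives the first call and is released immediately after the second.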

@frostbyte134
Contributor

frostbyte134 commented Aug 6, 2022

As kumaraditya303 said, it seems that some local variables (exception and future instances) create reference cycles inside the asyncio library.
I uploaded a PR that breaks those cycles explicitly.

After the fix, the numbers from tracemalloc.get_traced_memory() no longer grow linearly with the loop count.

MacBookPro cpython % ./python.exe ~/dev/leak.py --loop_count 10  
Working...
Done.
cur = 1311527, peak = 2365922 from tracemalloc.get_traced_memory()
------------------------------
MacBookPro cpython % ./python.exe ~/dev/leak.py --loop_count 100
Working...
Done.
cur = 1317149, peak = 2371520 from tracemalloc.get_traced_memory()
------------------------------
MacBookPro cpython % ./python.exe ~/dev/leak.py --loop_count 1000
Working...
Done.
cur = 1317050, peak = 2371421 from tracemalloc.get_traced_memory()

import asyncio
import argparse
import tracemalloc


class B:
    def __init__(self):
        self.some_big_data = bytearray(1024 * 1024)  # 1 MiB, to make the memory growth visible

    async def doStuff(self):
        if not await self.connect():
            return
        print('Stuff done')

    async def connect(self) -> bool:
        try:
            _, writer = await asyncio.open_connection('127.0.0.1', 12345)
            writer.close()
            return True
        except OSError as e:
            pass
        return False


class A:
    def __init__(self, loop_count):
        self.loop_count = loop_count

    async def doBStuff(self):
        b = B()
        await b.doStuff()

    async def work(self):
        print('Working...')
        for _ in range(self.loop_count):
            await self.doBStuff()
        print('Done.')
        # print(
        #     'Memory usage {}mb'.format(
        #         int(resource.getrusage(
        #             resource.RUSAGE_SELF).ru_maxrss)//1000))
        current, peak = tracemalloc.get_traced_memory()
        print(f'cur = {current}, peak = {peak} from tracemalloc.get_traced_memory()')
        print("-"*30)


async def amain(loop_count):
    a = A(loop_count)
    await a.work()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('-l', '--loop_count', type=int, nargs='?', default=10)
    args = parser.parse_args()

    tracemalloc.start()
    asyncio.run(amain(args.loop_count))

Checked on Windows 10, macOS Monterey, and CentOS 7.9.
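The idea of breaking the cycle explicitly, as described in this comment, can be sketched as follows. This is an illustration of the general pattern only, not the actual patch: `Payload`, `connect_with_cleanup` and `payload_survives` are hypothetical names. A function that stores a caught exception in a local variable and then re-raises it clears that local in a `finally` block, so the frame no longer refers back to the exception whose traceback refers to the frame.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for data held by the failing frame."""


def connect_with_cleanup(break_cycle, refs):
    payload = Payload()
    refs.append(weakref.ref(payload))
    exceptions = []
    try:
        raise OSError("connection refused")
    except OSError as exc:
        exceptions.append(exc)  # the local list now holds the exception
    try:
        raise exceptions[0]
    finally:
        if break_cycle:
            exceptions = None  # the frame no longer references the exception


def payload_survives(break_cycle):
    refs = []
    try:
        connect_with_cleanup(break_cycle, refs)
    except OSError:
        pass  # the caller handles the error and drops the exception
    return refs[0]() is not None


gc.disable()  # make the result independent of automatic collections
try:
    leaked_without_fix = payload_survives(break_cycle=False)
    leaked_with_fix = payload_survives(break_cycle=True)
    gc.collect()  # clean up the cycle left by the first call
finally:
    gc.enable()

print(leaked_without_fix, leaked_with_fix)
```

On CPython this should print `True False`: without the cleanup the payload lingers until a collection, and with it the payload is freed as soon as the caller has handled the exception.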

gvanrossum pushed a commit that referenced this issue Nov 22, 2022
…on raises (#95739)

Break reference cycles to resolve memory leak, by
removing local exception and future instances from the frame
frostbyte134 added a commit to frostbyte134/cpython that referenced this issue Nov 23, 2022
…open_connection raises (pythonGH-95739)

Break reference cycles to resolve memory leak, by
removing local exception and future instances from the frame.
(cherry picked from commit 995f617)

Co-authored-by: Dong Uk, Kang <nailbrainz@gmail.com>
gvanrossum pushed a commit that referenced this issue Nov 23, 2022
…onnection raises (GH-95739) (#99721)

Break reference cycles to resolve memory leak, by
removing local exception and future instances from the frame.
(cherry picked from commit 995f617)

Co-authored-by: Dong Uk, Kang <nailbrainz@gmail.com>
gvanrossum pushed a commit that referenced this issue Nov 23, 2022
…onnection raises (GH-95739) (#99722)

Break reference cycles to resolve memory leak, by
removing local exception and future instances from the frame.
(cherry picked from commit 995f617)

Co-authored-by: Dong Uk, Kang <nailbrainz@gmail.com>