Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in asyncio.Server.close #109564

Open
bmerry opened this issue Sep 19, 2023 · 8 comments
Open

Race condition in asyncio.Server.close #109564

bmerry opened this issue Sep 19, 2023 · 8 comments
Labels
topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@bmerry
Copy link
Contributor

bmerry commented Sep 19, 2023

Bug report

Bug description:

There is a race condition when closing an asyncio.Server. Accepting a connection takes several iterations of the event loop to complete, and a call to Server.close in the middle will lead to exceptions.

First, some demonstration code. Run both of these concurrently, running the server with python -X dev:

Client:

#!/usr/bin/env python3

import socket

while True:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.connect(("127.0.0.1", 4445))
        except IOError:
            pass

Server:

#!/usr/bin/env python3

import asyncio

def cb(reader, writer):
    writer.close()

def _my_attach(self):
    if self._sockets is None:
        print(self._sockets, self._active_count, flush=True)
    _orig_attach(self)

async def main():
    while True:
        server = await asyncio.start_server(cb, host="127.0.0.1", port=4445)
        await asyncio.sleep(0)
        server.close()
        await server.wait_closed()

asyncio.run(main())

It will repeatly output errors like this:

Error on transport creation for incoming connection
handle_traceback: Handle created at (most recent call last):
  File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
  File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/usr/lib/python3.12/asyncio/base_events.py", line 651, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.12/asyncio/base_events.py", line 618, in run_forever
    self._run_once()
  File "/usr/lib/python3.12/asyncio/base_events.py", line 1943, in _run_once
    handle._run()
  File "/usr/lib/python3.12/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 211, in _accept_connection
    self.create_task(accept)
  File "/usr/lib/python3.12/asyncio/base_events.py", line 436, in create_task
    task = tasks.Task(coro, loop=self, name=name, context=context)
protocol: <asyncio.streams.StreamReaderProtocol object at 0x7f7bb340b350>
Traceback (most recent call last):
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 230, in _accept_connection2
    transport = self._make_socket_transport(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 72, in _make_socket_transport
    return _SelectorSocketTransport(self, sock, protocol, waiter,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 936, in __init__
    super().__init__(loop, sock, protocol, extra, server)
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 800, in __init__
    self._server._attach()
  File "/usr/lib/python3.12/asyncio/base_events.py", line 294, in _attach
    assert self._sockets is not None
AssertionError

Additionally, stopping the server (with Ctrl-C) prints lots of ResourceWarnings about unclosed transports and sockets.

When a connection arrives on a socket (with selector_events), it goes through the following steps before reaching Server._attach:

  1. The socket is accepted: here
  2. _accept_connection2 is scheduled (will only run on next event loop iteration): here
  3. The transport is created: here
  4. Server._attach is called: here

It's possible for server.close() to be executed between steps 2 and 3, in which case Server._sockets has already been set to None, and the assertion is triggered. The client will presumably experience this as a connection that either closes immediately (if the garbage collector closes the socket) or is unresponsive.

This particular bug might be fixable by creating the transport synchronously in _accept_connection. I think there may be further potential race conditions because connection_made is also called asynchronously, which means it is possible for it to be called after Server.close has already returned. That might not break anything in asyncio itself but it will cause software that does something like the following to hang in 3.12 (related to #79033 / #104344):

server.close()
for (reader, writer) in my_connections:
    writer.close()
    await writer.closed()
await server.wait_closed()

CPython versions tested on:

3.11, 3.12

Operating systems tested on:

Linux

@bmerry bmerry added the type-bug An unexpected behavior, bug, or error label Sep 19, 2023
@Yaro1
Copy link

Yaro1 commented Nov 29, 2023

Hi, I would like to take that issue!

@gvanrossum
Copy link
Member

@Yaro1 Okay, first step, can you reproduce the issue locally using the main branch?

@Yaro1
Copy link

Yaro1 commented Dec 1, 2023

@gvanrossum Yes I reproduced

@gvanrossum
Copy link
Member

Okay go ahead and work on a fix then. Come back here if you are stuck.

@Yaro1
Copy link

Yaro1 commented Dec 2, 2023

okay, thank you

@Yaro1
Copy link

Yaro1 commented Dec 9, 2023

Well, I'm not sure about creating the transport synchronously in _accept_connection.

I tried to understand what is the reason of dividing accepting socket and creating a transport to _accept_connection and _accept_connection2.

I found commit which include creating _accept_connection2 2934262

But there is a problem - I guess there is not enough information in commit message:

Issue https://github.com/python/cpython/pull/23333: Fix BaseSelectorEventLoop._accept_connection(). Close the
  transport on error. In debug mode, log errors using call_exception_handler()

And the link for now indicates to another issue.

Could you help me please with an understanding of purpose _accept_connection and _accept_connection2? @gvanrossum

@gvanrossum
Copy link
Member

I'm sorry, I recommend that you try a simpler issue.

@Yaro1
Copy link

Yaro1 commented Dec 10, 2023

okay I will take simpler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Status: Todo
Development

No branches or pull requests

4 participants