Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6> (asyncio only) #825

Closed
JulienPalard opened this issue Feb 26, 2020 · 16 comments · Fixed by encode/httpcore#167
Labels
bug Something isn't working concurrency/asyncio Issues related to concurrency (asyncio-only) external Root cause pending resolution in an external dependency

Comments

@JulienPalard
Copy link

References: #550

On 0.11.1 and current HEAD 50d337e I'm getting:

$ ipython
Python 3.8.0 (default, Dec 12 2019, 09:56:28) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import httpx                                                                                                                                           

In [2]: async with httpx.AsyncClient() as client: 
   ...:     r = await client.get('https://www.example.org/')                                                                                                   

In [3]: r                                                                                                                                                      
Out[3]: <Response [200 OK]>

In [4]:                                                                                                                                                        
Do you really want to exit ([y]/n)? 
An open stream object is being garbage collected; call "stream.close()" explicitly.
source_traceback: Object created at (most recent call last):
[...]
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", line 595, in run_until_complete
    self.run_forever()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", line 563, in run_forever
    self._run_once()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", line 1836, in _run_once
    handle._run()
  File "/home/mdk/.local/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/home/mdk/.local/lib/python3.8/asyncio/streams.py", line 50, in open_connection
    reader = StreamReader(limit=limit, loop=loop)
$ 
@JulienPalard
Copy link
Author

Tried with HTTPX_LOG_LEVE=trace because tracemalloc don't tells me much, and Python 3.8.2:

$ cat reproducer.py 
import asyncio

import httpx

async def mdk():
    async with httpx.AsyncClient() as client:
        r = await client.get('https://www.example.org/')
    print(r.text)

asyncio.run(mdk())
 
$ python reproducer.py 
TRACE [2020-02-26 09:37:29] httpx._config - load_ssl_context verify=True cert=None trust_env=True http2=False
TRACE [2020-02-26 09:37:29] httpx._config - load_verify_locations cafile=/home/mdk/clones/python/docsbuild-scripts/.venv/lib/python3.8/site-packages/certifi/cacert.pem
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection_pool - acquire_connection origin=Origin(scheme='https' host='www.example.org' port=443)
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection_pool - new_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.example.org' port=443))
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection - start_connect tcp host='www.example.org' port=443 timeout=Timeout(timeout=5.0)
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection - connected http_version='HTTP/1.1'
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - send_headers method='GET' target='/' headers=Headers({'host': 'www.example.org', 'user-agent': 'python-httpx/0.11.1', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive'})
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - send_data data=Data(<0 bytes>)
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=NEED_DATA
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=Response(status_code=200, headers=[(b'content-encoding', b'gzip'), (b'accept-ranges', b'bytes'), (b'age', b'499617'), (b'cache-control', b'max-age=604800'), (b'content-type', b'text/html; charset=UTF-8'), (b'date', b'Wed, 26 Feb 2020 08:37:29 GMT'), (b'etag', b'"3147526947"'), (b'expires', b'Wed, 04 Mar 2020 08:37:29 GMT'), (b'last-modified', b'Thu, 17 Oct 2019 07:18:26 GMT'), (b'server', b'ECS (bsa/EB17)'), (b'vary', b'Accept-Encoding'), (b'x-cache', b'HIT'), (b'content-length', b'648')], http_version=b'1.1', reason=b'OK')
DEBUG [2020-02-26 09:37:29] httpx._client - HTTP Request: GET https://www.example.org/ "HTTP/1.1 200 OK"
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=Data(<648 bytes>)
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=EndOfMessage(headers=[])
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - response_closed our_state=DONE their_state=DONE
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.example.org' port=443))
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection - close_connection
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - send_event event=ConnectionClosed()
<!doctype html>
[... redacted (useless html) ...]

/home/mdk/.local/lib/python3.8/asyncio/selector_events.py:694: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
Object allocated at (most recent call last):
  File "reproducer.py", lineno 10
    asyncio.run(mdk())
  File "/home/mdk/.local/lib/python3.8/asyncio/runners.py", lineno 43
    return loop.run_until_complete(main)
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 603
    self.run_forever()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 570
    self._run_once()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 1859
    handle._run()
  File "/home/mdk/.local/lib/python3.8/asyncio/events.py", lineno 81
    self._context.run(self._callback, *self._args)
  File "/home/mdk/.local/lib/python3.8/asyncio/streams.py", lineno 52
    transport, _ = await loop.create_connection(
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 1050
    transport, protocol = await self._create_connection_transport(
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 1072
    transport = self._make_ssl_transport(
  File "/home/mdk/.local/lib/python3.8/asyncio/selector_events.py", lineno 78
    _SelectorSocketTransport(self, rawsock, ssl_protocol,

@JulienPalard
Copy link
Author

So this is related to #634, _backends/asyncio.py, SocketStream's close method:

    async def close(self) -> None:
        # NOTE: StreamWriter instances expose a '.wait_closed()' coroutine function,
        # but using it has caused compatibility issues with certain sites in
        # the past (see https://github.com/encode/httpx/issues/634), which is
        # why we don't call it here.
        # This is fine, though, because '.close()' schedules the actual closing of the
        # stream, meaning that at best it will happen during the next event loop
        # iteration, and at worst asyncio will take care of it on program exit.
        async with self.write_lock:
            with as_network_error(OSError):
                self.stream_writer.close()

Obviously adding a await self.stream_writer.wait_closed() fixes #825 probably while reopening #634

So I'm going to read #634 now.

@florimondmanca
Copy link
Member

Hi @JulienPalard,

This is very strange to me. It's a straight-on request/response flow, no redirects, no special cases that surface in the TRACE logs…

Also, I am not able to reproduce on Python 3.8.1/HTTPX 0.11.1.

@florimondmanca
Copy link
Member

Huhh… I'm sick with issues we've been having with asyncio's .close() vs .wait_closed()

You can also look at the end of #714. Our conclusion back then was to close off as "this is a Python bug". @tomchristie also noted:

Turns out it only replicates on Python 3.8.0. Does not replicate on 3.7.1 or 3.8.1.

which matches my experience on 3.8.1. The bug may have come back on 3.8.2…? 🤷‍♂

@florimondmanca
Copy link
Member

@JulienPalard I'm curious — what happens if you await asyncio.sleep(0) at the end of main()? Does this give asyncio the chance to execute the callback that closes the stream?

import asyncio

import httpx

async def mdk():
    async with httpx.AsyncClient() as client:
        r = await client.get('https://www.example.org/')
    print(r.text)
    await asyncio.sleep(0)  # <--

asyncio.run(mdk())

@tomchristie
Copy link
Member

tomchristie commented Feb 26, 2020

Feasible that it could be a duplicate of #824 if there happens to be proxies setup in the environment.

Few questions here @JulienPalard:

  • Does the issue still replicate for you when using async with httpx.AsyncClient(trust_env=False)?
  • Does the issue still replicate if you perform the same request in an example.py script, and run with asyncio.run(main()), rather than using a REPL?
  • Does the issue still replicate if you perform the same request in an example.py script, and run with trio.run(main), rather than using a REPL?

@JulienPalard
Copy link
Author

JulienPalard commented Feb 26, 2020

@florimondmanca No, adding asyncio.sleep(0) don't help but asyncio.sleep(.1) fixes it (probably randomly).

@JulienPalard
Copy link
Author

JulienPalard commented Feb 26, 2020

Checks done using httpx on current HEAD (50d337e), ipython==7.12.0, on Debian bullseye, trio==0.13.0:

Sorry I don't know how to design 3d tables in markdown. So, trust_env has no effect, (I tried True and False for each tests) cpython 3.8.1 fixed an issue that did not reappeared in 3.8.2.

Python 3.8.2

  • ipython, trust_env=*: No issue
  • script asyncio, trust_env=*: ResourceWarning
  • script trio, trust_env=*: No issue

Python 3.8.1

  • ipython, trust_env=*: No issue
  • script asyncio, trust_env=*: ResourceWarning
  • script trio, trust_env=*: No issue

Python 3.8.0

  • ipython, trust_env=*: An open stream object is being garbage collected
  • script asyncio, trust_env=*: ResourceWarning
  • script trio, trust_env=*: No issue

[1]:

import asyncio # or import trio

import httpx

async def mdk():
    async with httpx.AsyncClient(trust_env=False) as client:
        r = await client.get('https://www.example.org/')
    print(r.text)

asyncio.run(mdk()) # or trio.run(mdk)

@JulienPalard JulienPalard changed the title An open stream object is being garbage collected; call "stream.close()" explicitly. ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6> Feb 26, 2020
@JulienPalard
Copy link
Author

I'm changing the title appropriately, as the "An open stream ..." has been fixed in Python 3.8.1, this discussion is about the ResourceWarning.

@tomchristie
Copy link
Member

Gotcha, thanks @JulienPalard.

We'd be able to resolve that if we were doing what we're supposed to and calling .wait_closed(), but we can't because it appears to have an issue where it can hang #634 (comment) which trio is particularly careful to work around https://github.com/python-trio/trio/blob/master/trio/_ssl.py#L782-L785

I guess things we ought to figure out are:

  • Can we narrow down the asyncio .wait_closed() issue to a simple enough reproducible case that we can constructively raise it as a Python bug?
  • Can we monkeypatch the issue ourselves, adding trio's careful behavior?

@tomchristie tomchristie changed the title ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6> ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6> (asyncio only) Feb 26, 2020
@tomchristie
Copy link
Member

Also worth investigating - how are the aiohttp team dealing with this, and is there an equivelent issue on their tracker?

@tomchristie
Copy link
Member

Here's a minimal reproducing case...

import asyncio
import ssl
import certifi

hostname = 'login.microsoftonline.com'
context = ssl.create_default_context()
context.load_verify_locations(cafile=certifi.where())

async def main():
    reader, writer = await asyncio.open_connection(hostname, 443, ssl=context)
    print('opened')
    writer.close()
    print('close started')
    await writer.wait_closed()
    print('close completed')

asyncio.run(main())

@tomchristie
Copy link
Member

Raised https://bugs.python.org/issue39758

@florimondmanca florimondmanca added the concurrency Issues related to concurrency and usage of async libraries label Feb 28, 2020
@florimondmanca florimondmanca added the bug Something isn't working label Mar 13, 2020
@tomchristie
Copy link
Member

Haven't looked into how awkward it is to do, but really I think we'd want to monkeypatch asyncio's socket unwrapping behaviour, so that we can reliably call .wait_closed(...) without it potentially hanging.

@yeraydiazdiaz yeraydiazdiaz added the external Root cause pending resolution in an external dependency label May 13, 2020
@florimondmanca florimondmanca added concurrency/asyncio Issues related to concurrency (asyncio-only) and removed concurrency Issues related to concurrency and usage of async libraries labels Aug 1, 2020
@tomchristie
Copy link
Member

Okay, so reviewed this one today, and looks like we can close it off.

Our asyncio backend has this comment...

        # NOTE: StreamWriter instances expose a '.wait_closed()' coroutine function,
        # but using it has caused compatibility issues with certain sites in
        # the past (see https://github.com/encode/httpx/issues/634), which is
        # why we don't call it here.
        # This is fine, though, because '.aclose()' schedules the actual closing of the
        # stream, meaning that at best it will happen during the next event loop
        # iteration, and at worst asyncio will take care of it on program exit.
        async with self.write_lock:
            with map_exceptions({OSError: CloseError}):
                self.stream_writer.close()

Also, this does not reproduce any warnings...

import httpx
import asyncio

async def main():
    async with httpx.AsyncClient() as client:
        r = await client.get("https://login.microsoftonline.com/")
        print(r)

asyncio.run(main(), debug=True)

@tomchristie
Copy link
Member

tomchristie commented Aug 20, 2020

Oh right, ResourceWarning is ignored by default, but this replicates when using eg. python -Wall

I think we'll need to look into a fix that allows us to call .wait_closed that doesn't ever hang if it can't unwrap the SSL connection cleanly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working concurrency/asyncio Issues related to concurrency (asyncio-only) external Root cause pending resolution in an external dependency
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants