Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken HTTP request parsing: Upgrade: h2c header leads to discarded body #8414

Closed
1 task done
BlobbyBob opened this issue May 14, 2024 · 8 comments · Fixed by #8597
Closed
1 task done

Broken HTTP request parsing: Upgrade: h2c header leads to discarded body #8414

BlobbyBob opened this issue May 14, 2024 · 8 comments · Fixed by #8597
Labels

Comments

@BlobbyBob
Copy link

Describe the bug

When sending two TCP segments, the first containing all headers including the final \r\n\r\n and the second containing a JSON body, the server parses them as two request. The first has an empty body while the second fails with aiohttp.http_exceptions.BadStatusLine as the body is interpreted as HTTP method

To Reproduce

Use the following as client:

docker run --rm -it --net=host ghcr.io/tls-attacker/tlsanvil worker -controller localhost:5001

Use the following as server:

import aiohttp.web


class AnvilWorkerAPI:
    def __init__(self):
        self.app = aiohttp.web.Application()
        self.setup_routes()

    def setup_routes(self):
        self.app.router.add_post("/api/v2/worker/register", self.register)
        self.app.router.add_post("/api/v2/worker/fetch", self.fetch)

    async def register(self, request: aiohttp.web.Request):
        body = await request.read()
        print(request.path, body)
        return aiohttp.web.json_response({})

    async def fetch(self, request):
        body = await request.read()
        print(request.path, body)
        return aiohttp.web.json_response({"command": "OK", "job": None})

    def run(self, **kwargs):
        aiohttp.web.run_app(self.app, **kwargs)


api = AnvilWorkerAPI()
api.run(port=5001)

Expected behavior

HTTP Requests are decoded correctly

Logs/tracebacks

======== Running on http://0.0.0.0:5001 ========
(Press CTRL+C to quit)
/api/v2/worker/register b''
Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/aiohttp/web_protocol.py", line 350, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 557, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadStatusLine: 400, message:
  Invalid method encountered:

    b'{"name":"worker 418"}'
      ^
/api/v2/worker/fetch b''
Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/aiohttp/web_protocol.py", line 350, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 557, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadStatusLine: 400, message:
  Invalid method encountered:

    b'{"id":null,"status":"IDLE","logs":"10:13:09 INFO : WorkerClient starting\\n10:13:09 INFO : Connecting to backend...\\n10:13:09 INFO : Connected\\n"}'
      ^


### Python Version

```console
$ python --version

aiohttp Version

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.9.5
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: None
Author-email: None
License: Apache 2
Location: /usr/local/lib/python3.9/dist-packages
Requires: yarl, multidict, async-timeout, frozenlist, attrs, aiosignal
Required-by:

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.5
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /usr/local/lib/python3.9/dist-packages
Requires: 
Required-by: yarl, aiohttp

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.9.4
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /usr/local/lib/python3.9/dist-packages
Requires: idna, multidict
Required-by: aiohttp

OS

Reproducible on both Arch Linux and openjdk:11 docker image (debian based)

Related component

Server

Additional context

Packet capture:
test.pcap.gz

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@BlobbyBob BlobbyBob added the bug label May 14, 2024
@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented May 14, 2024

Would be better if we can add a test to test_parser.py reproducing the problem. Also, does it work without C parser (AIOHTTP_NO_EXTENSIONS=1)?

@BlobbyBob
Copy link
Author

With AIOHTTP_NO_EXTENSIONS=1 there seems to be no issue

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented May 15, 2024

Then it likely needs a bug report at https://github.com/nodejs/llhttp/ with the actual message that trips the parser (they will need to reproduce it in isolation, without aiohttp or tlsanvil).

@BlobbyBob
Copy link
Author

I've narrowed the issue a bit down. It has to do with the Upgrade: h2c header in the request. A more minimal reproduction example:

Server

import aiohttp.web

async def printer(request):
    print(request.path, await request.read())
    return aiohttp.web.json_response({})

app = aiohttp.web.Application()
app.router.add_post("/", printer)
    
aiohttp.web.run_app(app, port=8000)

Client

import socket
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 0)
sock.connect(("localhost", 8000))
sock.send(b"POST / HTTP/1.1\r\n"
          b"Connection: Upgrade\r\n"
          b"Content-Length: 2\r\n"
          b"Upgrade: h2c\r\n"
          b"Content-Type: application/json\r\n\r\n"
          b"{}")

print(sock.recv(1024))

sock.close()

@BlobbyBob
Copy link
Author

BlobbyBob commented May 16, 2024

If you still think it is an issue with llhttp, could you open an issue over there? I'm not familiar with that project, so the bug report would probably be not too precise and helpful.

@BlobbyBob BlobbyBob changed the title Broken HTTP Request Parsing when Body is contained in second TCP segment Broken HTTP request parsing: Upgrade: h2c header leads to discarded body May 16, 2024
@doublex
Copy link

doublex commented May 16, 2024

We are faced with the same problem: http-body gets mangled.
aiohttp 3.9.3 works.

@Dreamsorcerer
Copy link
Member

I see the same result in 3.8. Test reproducer is:

def test_http_request_parser_upgrade(parser: Any) -> None:
    text = b"POST / HTTP/1.1\r\nConnection: Upgrade\r\nContent-Length: 2\r\nUpgrade: h2c\r\nContent-Type: application/json\r\n\r\n"
    msg = parser.feed_data(text)[0][0][0]
    assert parser.feed_data(b"{}") == ([], False, b"")

    assert msg.method == "POST"
    assert msg.path == "/"
    assert msg.url.path == "/"
    assert msg.version == (1, 1)
    assert not msg.should_close
    assert msg.compression is None
    assert msg.upgrade
    assert not msg.chunked

It only happens when the message is split over 2 feed_data() calls though.

@Dreamsorcerer
Copy link
Member

@BlobbyBob Would you be able to try out #8597? You'll need to follow the instructions to build the C parser: https://github.com/aio-libs/aiohttp/blob/fix-fail-upgrade/vendor/README.rst

The issue seems to be caused by the condition we used to decide when the request has been upgraded.

bdraco pushed a commit that referenced this issue Aug 7, 2024
…an upgrade request (#8630)

**This is a backport of PR #8597 as merged into master
(c99a1e2).**

Fixes #8414.

Co-authored-by: Sam Bull <git@sambull.org>
bdraco pushed a commit that referenced this issue Aug 7, 2024
…an upgrade request (#8629)

Co-authored-by: Sam Bull <git@sambull.org>
Co-authored-by: J. Nick Koston <nick@koston.org>
Fixes #8414.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants