Add HTTP3 support #829

karpetrosyan · 2023-10-18T14:09:17Z

This pull request tries to add HTTP/3 support.

As we know, the HTTP/2 and HTTP/3 protocols are very similar, except for the transport they run over (TCP for HTTP/2, QUIC over UDP for HTTP/3).
This PR simply follows the steps described below.

Add the connect_udp method to "httpcore._backends.base.NetworkBackend".
Implement connect_udp for the synchronous backend only (for now).
Add an http3 extra to pyproject.toml.
Create the httpcore/_http3.py file.
Implement HTTP/3 in that file, keeping the logic and flow as similar as possible to what we use in _http2.py.
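
To make the first two steps concrete, here is a rough sketch of what a synchronous connect_udp could look like using only the standard library. The `UDPStream` class and the module-level `connect_udp` function are hypothetical names chosen for illustration; they are not the code in this PR and only mimic the read/write shape of a network stream.

```python
import socket
from typing import Any, Optional, Tuple


class UDPStream:
    """Hypothetical datagram wrapper with a read/write interface."""

    def __init__(self, sock: socket.socket, addr: Tuple[Any, ...]) -> None:
        self._sock = sock
        self._addr = addr  # resolved peer address, reused for every datagram

    def read(self, max_bytes: int, timeout: Optional[float] = None) -> bytes:
        # Block until one datagram arrives or the timeout expires.
        self._sock.settimeout(timeout)
        data, _ = self._sock.recvfrom(max_bytes)
        return data

    def write(self, buffer: bytes, timeout: Optional[float] = None) -> None:
        # Each call sends exactly one datagram to the peer.
        self._sock.settimeout(timeout)
        self._sock.sendto(buffer, self._addr)

    def close(self) -> None:
        self._sock.close()


def connect_udp(host: str, port: int) -> UDPStream:
    # Resolve the peer first so the socket family matches (IPv4 or IPv6).
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_DGRAM
    )[0]
    sock = socket.socket(family, socktype, proto)
    return UDPStream(sock, sockaddr)
```

UDP has no handshake, so "connecting" here only resolves the address and creates the socket; the QUIC handshake happens later, on top of this stream.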

To support the HTTP/3 protocol, we need the aioquic package, which is a well-tested and well-designed implementation for the HTTP/3 and QUIC protocols.

For more details, see the issue in HTTPX, where the author of aioquic provides basic HTTP/3 integration for httpx.

Here is a very basic example of how you can use HTTP/3 with httpcore.

from httpcore import Origin
from httpcore import Request
from httpcore import HTTP3Connection
from httpcore import SyncBackend

host = "www.youtube.com"
port = 443

stream = SyncBackend().connect_udp(host=host, port=port)
conn = HTTP3Connection(
    origin=Origin(b"https", host.encode(), port), stream=stream
)

request = Request(
    method=b"GET",
    url=f"https://{host}",
    headers=[("host", host)],
    extensions={"timeout": {"read": 5, "write": 5}},
)


response = conn.handle_request(request=request)
print(response)   # <Response [200]>
print(response.extensions["http_version"])  # b'HTTP/3'
print(response.read())  # ...

Or with the high-level API:

import httpcore

pool = httpcore.ConnectionPool(http1=False, http3=True)

response = pool.request(
    "GET", "https://www.youtube.com", extensions={"timeout": {"read": 5, "write": 5}}
)

print(response)  # <Response [200]>
print(response.extensions["http_version"])  # b'HTTP/3'
print(response.read())  # ...

karpetrosyan · 2023-10-20T09:40:44Z

This is how I see the http3 implementation in httpcore.

The goals here are:

Make the event-based http3 implementation very similar to the http2 implementation, so that maintenance does not become too complicated.
Fully cover the http3 implementation with tests, using event-based mocking: rather than mocking the network stream, we mock the underlying IO-less http3 connection, which gives us all the http3-related events.
Add an HTTP3 section to the documentation, like the HTTP2 section.
Last but most important, deliver HTTP3 support to HTTPX!

I want to keep this pull request as simple as possible, but I'm also thinking about including alt-svc support as @tomchristie suggested in encode/httpx#275 in this pull request.

That is:

It's looking to me like httpx should never end up making an HTTP/3 request on an initial outgoing request, because either:

We see the upgrade in an Alt-Svc response headers, in which case we've already sent the request, and started receiving the response, not much point in tearing the connection down.

We might potentially see an ALTSVC HTTP/2 frame, but we don't want to block on waiting for that before starting to send a request (since it may not exist).
So I think the best we'll be able to do is storing altsvc information whenever it comes through, and potentially making subsequent requests over HTTP/3 using that information.

karpetrosyan · 2023-11-02T12:44:57Z

Any thoughts or ideas, @encode/maintainers?

tomchristie · 2023-11-02T14:58:10Z

Thanks @karpetrosyan!

Any thoughts or ideas

Here's my initial high level thoughts...

Review of the current landscape... Which sites currently use HTTP/3 and which browsers can you demonstrate using it? How can someone else observe this?
What's the use-case for HTTP/3 in httpx - are there conditions under which it's beneficial to the user?
How do we intend to maintain the HTTP/3 work alongside the existing HTTP/2 work with a minimal maintenance load?
What discovery mechanism are browsers currently using for HTTP/3 detection? Is detection over DNS records currently deployed and used?

karpetrosyan · 2023-11-03T09:35:10Z

Thank you for reviewing, Tom.

Excellent questions; here are my thoughts on that.

Review of the current landscape... Which sites currently use HTTP/3 and which browsers can you demonstrate using it? How can someone else observe this?

As an example, here are a few large corporations that support HTTP/3:

Using this script, you can already test it with httpcore.

import httpcore
import logging

logging.basicConfig(level=1)
pool = httpcore.ConnectionPool(http1=False, http3=True)

websites = [
    "https://google.com",
    "https://youtube.com",
    "https://instagram.com",
    "https://spotify.com",
    "https://cloudflare.com",
]


for website in websites:
    # The extension key is "timeout", as in the earlier examples.
    response = pool.request("GET", website, extensions={"timeout": {"connect": 2}})
    print(response)

which browsers can you demonstrate using it

According to Wikipedia, all major browsers support the HTTP/3 protocol.

HTTP/3 is (at least partially) supported by 94% of tracked web browser installations (96% of "tracked mobile" and 94% of "tracked desktop" web browsers),[7] and 26% of the top 10 million websites.[8] It has been supported by Chromium (and derived projects including Google Chrome, Microsoft Edge, Samsung Internet, and Opera)[9] since April 2020 and by Mozilla Firefox since May 2021.[7][10] Safari 14 implemented the protocol but it remains disabled by default.[11]

You can also use this website to determine whether the request was sent over HTTP/3 or HTTP/1.1, and then open the dev tool to view the schema and headers that were sent over the network.

To learn more about HTTP/3 state in 2023, visit https://blog.cloudflare.com/http3-usage-one-year-on/.

What's the use-case for HTTP/3 in httpx - are there conditions under which it's beneficial to the user?

Here are some of the reasons why we should add HTTP/3 support.

It's very simple to support with our existing httpcore design. As you can see, I implemented HTTP/3 by adding a single _http3.py file and making only minor changes to other files.
It makes HTTPX a more appealing library for newcomers, who can debug websites over HTTP/3 or simply experiment with it for fun.
It is a more recent and improved successor to HTTP/2. The usefulness of HTTP/3 was also discussed in the relevant issue of the httpx project, which can be found here.

How do we intend to maintain the HTTP/3 work alongside the existing HTTP/2 work with a minimal maintenance load?

Yes, this is an important question.

One of the goals was to implement HTTP/3 with as few differences as possible from HTTP/2.
I even copied the _http2.py file and only made http3-related changes, so the _http2.py and _http3.py files are 95% identical.

We can treat _http2.py and _http3.py as essentially the same file, except that _http3.py uses aioquic instead of the h2 library.
That keeps maintenance about as simple as it is today.

What discovery mechanism are browsers currently using for HTTP/3 detection? Is detection over DNS records currently deployed and used?

This question has already been discussed in encode/httpx#275.
We can simply use a special header (Alt-Svc) and then use http3 in subsequent requests if we know the server supports HTTP/3.

There is also a section in the RFC that describes the connection setup process, so you can find more detailed information there.

seidnerj · 2023-11-17T16:00:01Z

Would love to see HTTP3 support in httpx

T-256 · 2023-12-17T10:12:35Z

httpcore/_async/connection.py

IMO, when multiple HTTP versions are enabled, http3 could take precedence over the other two, since it uses UDP, while still being able to fall back to TCP.

An alternative example for this change:

can_connect_tcp = True

if self._http3:
    try:
        # Lazy import, matching how the HTTP/2 connection is imported below.
        from .http3 import AsyncHTTP3Connection

        stream = await self._connect_http3(request)
        self._connection = AsyncHTTP3Connection(
            origin=self._origin,
            stream=stream,
            keepalive_expiry=self._keepalive_expiry,
        )

        can_connect_tcp = False
    except Exception:
        # Only propagate the failure when there is no TCP protocol to fall back to.
        if not (self._http1 or self._http2):
            raise

if can_connect_tcp:
    stream = await self._connect(request)

    ssl_object = stream.get_extra_info("ssl_object")
    http2_negotiated = (
        ssl_object is not None
        and ssl_object.selected_alpn_protocol() == "h2"
    )
    if http2_negotiated or (self._http2 and not self._http1):
        from .http2 import AsyncHTTP2Connection

        self._connection = AsyncHTTP2Connection(
            origin=self._origin,
            stream=stream,
            keepalive_expiry=self._keepalive_expiry,
        )
    else:
        self._connection = AsyncHTTP11Connection(
            origin=self._origin,
            stream=stream,
            keepalive_expiry=self._keepalive_expiry,
        )

karpetrosyan · 2023-12-18T08:41:52Z

Ugh, let's consider next steps here.
I believe we should clarify some points, particularly how http3 negotiation should be implemented.

HTTP/3 Negotiation

There are at least three approaches we could take to solve this problem:

Alt-Svc header
HTTP/3 first, then HTTP/1 and HTTP/2.
HTTPS DNS records

Let's go over each one and provide some useful links so you can dig deeper.

Alt-Svc

Alt-Svc is an HTTP header that indicates that there are alternative services located on some port that use some protocol, and that the client can switch to that service if the protocol provided by that service is preferred.

HTTP servers, for example, frequently use this Alt-Svc header to inform browsers that they support the HTTP/3 protocol.

Alt-Svc: h3-25=":443";

In the world of HTTPX, we could potentially store information about the origin's alternative services and send subsequent requests based on supported protocols.

Also, the server may provide additional information with the alternative service, such as an expiration time, which we must respect so that we avoid using stale information about the alternative service.

Here is an example of an Alt-Svc header that could be interpreted as "I support HTTP/3 protocol on port 443, but you should not rely on this information after an hour."

Alt-Svc: h3-25=":443"; ma=3600

This also complicates the use of this approach, so we should account for it.
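
To illustrate the bookkeeping this would require, here is a minimal sketch of parsing an Alt-Svc value and tracking its freshness. The names `AltService`, `parse_alt_svc`, and `is_fresh` are hypothetical and not part of httpcore or httpx. The parser is deliberately naive (it splits on commas and semicolons and ignores parameters other than `ma`); the 24-hour default for a missing `ma` follows RFC 7838.

```python
import time
from typing import List, NamedTuple, Optional


class AltService(NamedTuple):
    protocol: str      # ALPN token, e.g. "h3" or "h3-25"
    authority: str     # e.g. ":443" or "alt.example.com:443"
    expires_at: float  # deadline on the same clock as `now`


DEFAULT_MAX_AGE = 86400  # RFC 7838: fresh for 24 hours when "ma" is absent


def parse_alt_svc(value: str, now: Optional[float] = None) -> List[AltService]:
    now = time.monotonic() if now is None else now
    if value.strip() == "clear":
        # "clear" tells the client to forget all alternatives for the origin.
        return []
    services = []
    for entry in value.split(","):
        parts = [p.strip() for p in entry.split(";") if p.strip()]
        if not parts or "=" not in parts[0]:
            continue
        protocol, authority = parts[0].split("=", 1)
        max_age = DEFAULT_MAX_AGE
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "ma" and raw.strip().isdigit():
                max_age = int(raw.strip())
        services.append(
            AltService(protocol.strip(), authority.strip().strip('"'), now + max_age)
        )
    return services


def is_fresh(service: AltService, now: Optional[float] = None) -> bool:
    # An entry must not be used once its max-age has elapsed.
    now = time.monotonic() if now is None else now
    return now < service.expires_at
```

A connection pool could keep such entries per origin and consult `is_fresh` before choosing HTTP/3 for a subsequent request.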

See also: https://http3-explained.haxx.se/en/h3/h3-altsvc

`HTTP/3` first, then `HTTP/1` and `HTTP/2`

The idea behind this approach is to always try HTTP/3 and, if that fails, fall back to HTTP/1 or HTTP/2 over TCP.

It appears that browsers do not use this approach, not least because it complicates the connection process and makes sending a request even slower when the connection falls back to TCP after attempting UDP.

In HTTPX, we can try HTTP/3 first if all other protocols are disabled, that is, when the client has indicated that it only wants to use HTTP/3 connections.

You can already send such requests by specifying that you only want to use the HTTP/3 protocol, as in:

pool = httpcore.ConnectionPool(http1=False, http2=False, http3=True)
response = pool.request("GET", "https://cloudflare.com")

`HTTPS` DNS records

HTTPS RRs (HTTPS Resource Records) are relatively new DNS records that deliver configuration information and parameters for how to access a service via HTTPS.

An HTTPS RR can be used to optimize the process of connecting to a service using HTTPS.
Clients can use this record to negotiate protocols at the DNS layer rather than at the TLS layer, as we do with HTTP/1 and HTTP/2.

You can think of HTTPS records as TLS ALPN for the DNS layer.
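
As a rough illustration of that idea, the sketch below extracts the `alpn` parameter from the text form of an HTTPS record's RDATA (priority, target, then key=value parameters) and checks whether `h3` is advertised. The function names are hypothetical, the record strings in the usage are made up, and fetching the record from DNS is out of scope here.

```python
import shlex
from typing import List


def alpn_from_https_rdata(rdata: str) -> List[str]:
    # shlex handles quoted values such as alpn="h3,h2".
    tokens = shlex.split(rdata)
    if len(tokens) < 2:
        raise ValueError("expected '<priority> <target> [params...]'")
    priority = int(tokens[0])
    if priority == 0:
        # Priority 0 is alias mode, which carries no service parameters.
        return []
    for token in tokens[2:]:
        key, _, value = token.partition("=")
        if key == "alpn":
            return [proto for proto in value.split(",") if proto]
    return []


def supports_h3(rdata: str) -> bool:
    return "h3" in alpn_from_https_rdata(rdata)
```

For example, `supports_h3('1 . alpn="h3,h2"')` is true, so a client could attempt QUIC on the first request without waiting for an Alt-Svc header.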

Here are some useful resources on this subject.

https://developer.mozilla.org/en-US/docs/Glossary/HTTPS_RR
https://datatracker.ietf.org/doc/draft-ietf-dnsop-svcb-https/00/
https://blog.cloudflare.com/speeding-up-https-and-http-3-negotiation-with-dns
https://emilymstark.com/2020/10/24/strict-transport-security-vs-https-resource-records-the-showdown.html

karpetrosyan · 2023-12-22T13:02:10Z

I'll leave some key differences between our http3 and http2 implementations here to make it easier to review.

Notes

If you are unfamiliar with HTTP/3 and HTTP/2, I recommend the following resources:

In a nutshell, HTTP3 uses the QUIC protocol, which is based on UDP and implements all the necessary logic, for example, re-transmissions.

In HTTP2, we have a single connection object provided by h2: we feed it incoming data and ask it for data to send over the wire. In the HTTP3 implementation this process is a little more complicated, because now we have two such objects.

One is the quic connection itself, which handles all the data flow, what we should send, and what we have received. The second is the h3 connection, which is the HTTP/3 connection state, which can receive a quic event and understand what it should do next.

This separation can somewhat help the developer to distinguish the connection layer and the HTTP layer, so we have, for example, StreamReset and ConnectionTerminated events that are QUIC events, and we have DataReceived and ResponseReceived that are H3 events.
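
The two-layer flow can be sketched as a small helper that drains the QUIC connection and forwards each event to the H3 connection. The helper name `pump_events` is hypothetical. It assumes the QUIC object exposes `next_event()` (returning `None` once drained) and the H3 object exposes `handle_event()` (returning a list of HTTP-level events), which is how I understand the aioquic objects to behave. It is duck-typed, so it does not import aioquic itself.

```python
from typing import Any, Iterator, Tuple


def pump_events(quic_conn: Any, h3_conn: Any) -> Iterator[Tuple[str, Any]]:
    """Yield ("quic", event) and ("h3", event) pairs in arrival order."""
    while True:
        quic_event = quic_conn.next_event()
        if quic_event is None:
            # Nothing more buffered at the QUIC layer.
            break
        # Connection-level events (e.g. termination, stream reset) live here.
        yield ("quic", quic_event)
        # The H3 layer turns QUIC stream data into HTTP events
        # (e.g. headers received, data received).
        for h3_event in h3_conn.handle_event(quic_event):
            yield ("h3", h3_event)
```

Tagging each event with its layer keeps the dispatch code close to the single-loop structure used for h2 events.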

In h2, we do not have such a separation because everything is implemented directly on top of TCP, whereas now we have an additional layer where stream handling happens.

The additional layer is also the reason why there are two "connection" objects in the aioquic package, because unlike in the http2 implementation, where we care only about tcp and http, now we should think about udp, quic, and http.

The QUIC layer also lets us get rid of flow-control window handling, connection setup, and related things, which are now handled by the QUIC protocol itself.

Changed

Events

First, here is how we import HTTP2 events and HTTP3 events

import h2.events
from aioquic.h3 import events as h3_events
from aioquic.quic import events as quic_events

We handle five events in HTTP2 implementation; here is how those events look in HTTP3.

h2.events.ResponseReceived -> h3_events.ResponseReceived
h2.events.DataReceived -> h3_events.DataReceived
h2.events.ConnectionTerminated -> quic_events.ConnectionTerminated
h2.events.StreamReset -> quic_events.StreamReset
h2.events.StreamEnded -> None

Here is the code reference for that part in the http3.py and http2.py

Removed

Method `_send_connection_init`.

Because aioquic handles all connection establishment, this method is unnecessary in the http3 implementation.

Method `_receive_remote_settings_change`.

In the http3 implementation, this is handled by aioquic.

Flow control window

In http3.py, we use QUIC, which handles all flow control for the entire abstract stream, so we do not handle this in our application.

rthalley · 2024-01-02T21:13:26Z

Re DNS HTTPS records, dnspython has good support for them, but dnspython also uses httpx (and thus httpcore) for DNS-over-HTTPS, so I'm not completely sure how to deal with the chicken-and-egg mutual module dependency issues, but I'm happy to work with the httpcore team.

mborsetti · 2024-09-19T01:18:27Z

Where is HTTP/3 support on the release schedule?

graingert · 2024-09-19T16:33:50Z

httpcore/_async/http3.py

+                },
+            )
+        except BaseException as exc:  # noqa: PIE786
+            with AsyncShieldCancellation():


This might need changing wrt #927

karpetrosyan · 2024-09-19T16:40:04Z

I’m not sure why the pipeline failed, but the implementation works. I would like to continue working on this, and we need to cover the implementation with tests. What do you think, @encode/maintainers? Do we have any blockers? I would also appreciate a review from @jlaine, if possible.

karpetrosyan · 2024-09-19T16:46:29Z

I can already see a 10-20% speed boost on my machine compared to our HTTP/2 implementation as well.

graingert · 2024-09-19T17:05:44Z

I'm a bit concerned about how the pyopenssl context is configured. I think this would break httpx.get(..., verify=cafile)

Generally, SSL in httpcore is configured by passing in an SSLContext, but this PR seems to bypass that and pass certifi.where().

graingert · 2024-09-19T20:26:21Z

I think the way to do it is to move httpx.create_ssl_context() into httpcore, then add an http3 kwarg that makes it return a dataclass with both an ssl context and a pyopenssl context as private fields.

graingert · 2024-09-23T17:54:17Z

I've been thinking about this for a while, and using two different contexts for the same httpx session is cryptographically fishy (and probably slow, since it loads the cert store twice). I've had a quick look at the anyio, trio, and sync ssl streams, and I'm happy to make a PR to make httpcore support either ssl or pyopenssl contexts; then we can require a pyopenssl context for http3=True.

graingert · 2024-09-25T08:51:07Z

I misunderstood how TLS in aioquic works: I saw the dependency on pyopenssl and assumed it was used for TLS. However, aioquic uses its own TLS 1.3 implementation, which requires cadata, cafile, capath, etc. to be passed in, so we will need to create our own context that wraps the stdlib ssl context and the parameters aioquic requires.

tomchristie · 2024-09-25T10:41:28Z

Pointers?

graingert · 2024-09-26T17:25:31Z

httpcore/_async/http3.py

+
+    async def _do_handshake(self, request: Request) -> None:
+        assert hasattr(self._network_stream, "_addr")
+        self._quic_conn.connect(addr=self._network_stream._addr, now=monotonic())


This should use the event loop time, so that trio can use auto jump clock

graingert · 2024-09-26T17:28:51Z

httpcore/_async/http3.py

+
+        return events
+
+    async def _write_outgoing_data(self, request: Request) -> None:


This needs to call quic.get_timer() and schedule a timer so that quic can queue lost datagrams
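
One way to picture the timer handling being asked for, as a sketch only: the helper below is hypothetical and duck-typed. It assumes the QUIC object exposes `get_timer()`, `handle_timer(now=...)`, and `datagrams_to_send(now=...)` returning `(data, addr)` pairs, which is my understanding of the aioquic connection API. The caller would write the returned datagrams to the network stream and sleep until the returned deadline.

```python
from typing import Any, List, Optional, Tuple


def service_timer(quic_conn: Any, now: float) -> Tuple[List[bytes], Optional[float]]:
    """Fire the QUIC timer if it is due and collect datagrams to (re)send.

    Returns the datagram payloads to write and the next timer deadline
    (None when no timer is armed).
    """
    deadline = quic_conn.get_timer()
    if deadline is not None and deadline <= now:
        # Lets the QUIC layer detect loss and queue retransmissions.
        quic_conn.handle_timer(now=now)
    datagrams = [data for data, _addr in quic_conn.datagrams_to_send(now=now)]
    return datagrams, quic_conn.get_timer()
```

In an async backend, `now` would come from the event loop clock, and the returned deadline would be used to arm a sleep or timeout around the next read.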

graingert · 2024-09-26T17:32:51Z

httpcore/_async/connection.py

+                    ):  # pragma: no cover
+                        from .http3 import AsyncHTTP3Connection
+
+                        stream = await self._connect_http3(request)


This should be doing happy eyeballs

graingert · 2024-09-26T17:58:52Z

httpcore/_async/http3.py

+            raise self._read_exception  # pragma: nocover
+
+        try:
+            data = await self._network_stream.read(self.READ_NUM_BYTES, timeout)


I think you need a background task that's constantly reading any datagrams from the server as they can be sent unsolicited

karpetrosyan marked this pull request as draft October 18, 2023 14:10

karpetrosyan added the enhancement New feature or request label Oct 18, 2023

karpetrosyan force-pushed the support-http3 branch 2 times, most recently from e2af16a to 8fc5ff3 Compare October 19, 2023 08:32

Support HTTP/3

bd31869

karpetrosyan force-pushed the support-http3 branch from 8fc5ff3 to bd31869 Compare October 19, 2023 08:34

karpetrosyan added 4 commits October 19, 2023 16:30

Add http3 argument to ConnectionPool and HTTPConnection classes

383d4ce

Typo

acce83a

Fix the docstring

0d5bf66

Support both IPv4 and IPv6

37cb6b4

karpetrosyan marked this pull request as ready for review October 20, 2023 10:38

karpetrosyan requested review from tomchristie and a team October 20, 2023 10:39

karpetrosyan self-assigned this Oct 20, 2023

Merge branch 'master' into support-http3

bcd3934

zanieb self-requested a review November 2, 2023 14:12

Merge branch 'master' into support-http3

d21d675

karpetrosyan added 3 commits November 8, 2023 12:48

Merge branch 'master' into support-http3

306c3b4

Merge branch 'master' into support-http3

08c6271

Merge branch 'master' into support-http3

c00b2f5

Merge branch 'master' into support-http3

c29901b

karpetrosyan mentioned this pull request Nov 28, 2023

HTTP/3 support. encode/httpx#275

Open

karpetrosyan added 2 commits December 4, 2023 09:11

Merge branch 'master' into support-http3

94d4c22

Merge branch 'master' into support-http3

ca5e5ad

T-256 reviewed Dec 17, 2023

Merge branch 'master' into support-http3

120c8d1

Merge branch 'master' of https://github.com/encode/httpcore into supp…

6baead4

…ort-http3

graingert reviewed Sep 19, 2024

graingert reviewed Sep 26, 2024
