HTTPS proxies support #1434

Closed
florimondmanca opened this issue Dec 21, 2020 · 18 comments · Fixed by #2845
Labels
enhancement New feature or request httpcore Issues related to HTTPCore (core HTTP networking layer) - https://github.com/encode/httpcore proxies Issues related to HTTP and SOCKS proxies requests-compat Issues related to Requests backwards compatibility tls+pki Issues and PRs related to TLS and PKI

@florimondmanca
Member

florimondmanca commented Dec 21, 2020

Refs #1424, #1428

Terminology

  • HTTP proxy: a proxy server which supports connecting to it via HTTP. HTTP requests are forwarded, HTTPS requests are tunneled (via HTTP CONNECT). — HTTPX has good support for those, no questions asked. ✅
  • HTTPS proxy: a proxy server that supports connecting to it via HTTPS. Only HTTPS requests are supported, and must (?) be tunneled (TLS-in-TLS). — This is what seems to be missing still. ❌
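To make the forwarded/tunneled distinction concrete, here is a minimal sketch of the request line a client would send to a proxy in each case: a plain HTTP request is forwarded using the absolute URL, while an HTTPS request is tunneled with CONNECT. The helper name `proxy_request_line` is an assumption made for illustration; it is not part of HTTPX or httpcore.

```python
from urllib.parse import urlsplit


def proxy_request_line(url: str) -> str:
    """Return the first line a client would send to a proxy for `url`.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Forwarded: the proxy sees the full request and the absolute URL.
        return f"GET {url} HTTP/1.1"
    if parts.scheme == "https":
        # Tunneled: the proxy is only told which host and port to connect to.
        port = parts.port or 443
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")


if __name__ == "__main__":
    print(proxy_request_line("http://example.org/"))
    print(proxy_request_line("https://example.org/"))
```

The same two request forms apply whether the connection to the proxy itself is plain HTTP or HTTPS; what changes with an HTTPS proxy is that these bytes travel inside a TLS session with the proxy.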

Problem statement

We've seen reports of issues recently such as #1424 and #1428 that reveal that our proxies implementation does not properly support HTTPS proxies yet.

My understanding right now is that supporting this requires implementing a technique known as "TLS-in-TLS" (or perhaps "nested TLS"). Here's how that works:

  1. HTTPX issues a CONNECT request to the proxy, at https://proxy.org. This may use a dedicated proxy_ssl_context with proxy-specific certs, that I'll mark as TLS(p) ("p" as "proxy").
  2. The proxy establishes a TCP tunnel to the target. The HTTPX-proxy "half" of the TCP connection is over TLS(p), and the other one is not TLS-enabled yet.
  3. HTTPX must perform a TLS handshake with the target server, so that the proxy-server "half" of the TCP connection becomes encrypted over TLS(t) ("t" as "target"). I'm not 100% certain I understand how that works. Right now I assume the TLS handshake packets would be sent TLS(p)-encrypted to the proxy, which "decrypts" them and sends them to the server for us. The server responds with its ServerHello. The proxy "encrypts" it back over TLS(p) and HTTPX sees it. The second pass of the handshake follows the same pattern. (I've put quotation marks because this is done without the proxy being able to actually intercept those packets. Can anyone confirm?)
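The three steps above can be sketched with only the Python standard library. This is a rough illustration under stated assumptions, not how httpcore implements it: the function names (`build_connect_request`, `pump_handshake`, `open_tls_in_tls`) are hypothetical, the sockets are blocking, and error handling is minimal. The inner TLS(t) session uses `ssl.MemoryBIO` and `SSLContext.wrap_bio`, so its records can be carried as ordinary application data over the outer TLS(p) socket. The proxy only strips the TLS(p) layer and relays the bytes inside it, and those bytes are already TLS(t) records.

```python
import socket
import ssl


def build_connect_request(host: str, port: int) -> bytes:
    """Return the CONNECT request asking the proxy to open a tunnel."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def pump_handshake(inner, incoming, outgoing, outer) -> None:
    """Drive the TLS(t) handshake, carrying its records inside TLS(p)."""
    while True:
        try:
            inner.do_handshake()
            done = True
        except ssl.SSLWantReadError:
            done = False
        # Anything TLS(t) wants to send is written to the outer connection.
        pending = outgoing.read()
        if pending:
            outer.sendall(pending)
        if done:
            return
        data = outer.recv(65536)
        if not data:
            raise ConnectionError("tunnel closed during the TLS(t) handshake")
        incoming.write(data)


def open_tls_in_tls(proxy_host, proxy_port, target_host, target_port,
                    proxy_ssl_context, ssl_context):
    # Step 1: TLS(p) to the proxy, then CONNECT over it.
    raw = socket.create_connection((proxy_host, proxy_port))
    outer = proxy_ssl_context.wrap_socket(raw, server_hostname=proxy_host)
    outer.sendall(build_connect_request(target_host, target_port))

    # Step 2: wait for the proxy to confirm that the tunnel is open.
    response = b""
    while b"\r\n\r\n" not in response:
        chunk = outer.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed the connection during CONNECT")
        response += chunk
    head, _, leftover = response.partition(b"\r\n\r\n")
    status = head.split(b"\r\n", 1)[0].split(b" ")
    if len(status) < 2 or not status[1].startswith(b"2"):
        raise ConnectionError(f"proxy refused CONNECT: {head[:80]!r}")

    # Step 3: TLS(t) handshake with the target, through the tunnel.
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    if leftover:
        incoming.write(leftover)
    inner = ssl_context.wrap_bio(incoming, outgoing, server_hostname=target_host)
    pump_handshake(inner, incoming, outgoing, outer)
    return outer, inner, incoming, outgoing
```

After the handshake, application data follows the same pattern: `inner.write(...)` fills `outgoing`, whose bytes are sent with `outer.sendall(...)`, and bytes read with `outer.recv(...)` are fed into `incoming` before calling `inner.read(...)`.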

✅ Right now we can do steps 1/ and 2/, with the nuance that we have a single ssl_context option that's used for both proxy CONNECT and the handshake (we'd want to have proxy_ssl_context and ssl_context).
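As a small illustration of why two separate options are useful, the sketch below builds one context per TLS layer with the standard library. The helper name `make_ssl_contexts` and its `proxy_ca_file` parameter are assumptions for illustration; the idea is only that the proxy may be signed by a private CA that should not be trusted for the target server.

```python
import ssl
from typing import Optional, Tuple


def make_ssl_contexts(
    proxy_ca_file: Optional[str] = None,
) -> Tuple[ssl.SSLContext, ssl.SSLContext]:
    """Return (proxy_ssl_context, ssl_context). Hypothetical helper."""
    # TLS(p): used only for the connection to the proxy.
    proxy_ssl_context = ssl.create_default_context()
    if proxy_ca_file is not None:
        # Trust the proxy's private CA on the proxy connection only.
        proxy_ssl_context.load_verify_locations(cafile=proxy_ca_file)

    # TLS(t): used for the handshake with the target server.
    ssl_context = ssl.create_default_context()
    return proxy_ssl_context, ssl_context
```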

❌ What is definitely missing is step 3/.

Right now we attempt to do start_tls(), as if we were tunneling over a standard HTTP connection with the proxy, and generally that fails in a variety of ways depending on sync / async, async library, custom certs, HTTP/1.1 vs HTTP/2, proxy server implementation, etc.

To reproduce

Right now the following would fail:

import httpx

proxies = {"https": "https://proxy.org:443"}
with httpx.Client(proxies=proxies) as client:
    response = client.get("https://example.org")

TODO: full pproxy setup (or perhaps proxy.py, which seems to support HTTPS proxying fully), full sample tracebacks.

Additional context

Marked this as "requests-compat" because this is now supported in urllib3 as of 1.26. It landed via this pull request: urllib3/urllib3#1923. AFAICT they had to implement TLS-in-TLS themselves, overriding the standard http.client connection implementation because that one doesn't support TLS-in-TLS.

Marked this as "httpcore" because our proxy implementation lives there: https://github.com/encode/httpcore

@florimondmanca florimondmanca added enhancement New feature or request tls+pki Issues and PRs related to TLS and PKI requests-compat Issues related to Requests backwards compatibility proxies Issues related to HTTP and SOCKS proxies httpcore Issues related to HTTPCore (core HTTP networking layer) - https://github.com/encode/httpcore labels Dec 21, 2020
@ech0-py

ech0-py commented Dec 21, 2020

Hi there. In #1428 I had a problem with a third type of proxy not listed here: a "reverse proxy" with SSL termination.
It works like this:

  1. The client establishes a TLS connection with the proxy and sends a "message" over the socket (I'll provide HTTP/1.1 examples, but my problem was with HTTP/2):
GET /index.html HTTP/1.1
Host: example.com
  2. The proxy server decrypts this "message", chooses the destination server, makes the same request as the client but without TLS, and responds to the client with the answer it received.

It's very useful for load balancing, for example. The client doesn't even know that it is speaking to a proxy.
Maybe this is redundant information, but I just want to be clear.

@florimondmanca
Member Author

florimondmanca commented Dec 22, 2020

@ech0-py Well, then it seems like you don't want to use proxies in that case?

Just send a request to the target host, as you would if there was no reverse proxy in place. I assume things must be set up in your infra so that the DNS hostname resolves to the reverse proxy IP so that traffic is directed there, but the reverse proxy isn't really a "proxy" in the sense that we're discussing in this issue.

Reverse proxies aren't supposed to be passed as proxies specifically because they're not really web proxies, but servers that defer requests to other servers, eg for the sake of load balancing, as you mentioned. Or am I missing something?

@ech0-py

ech0-py commented Dec 22, 2020

@florimondmanca I think you're right. But in that case I need to set up NSS (nsswitch.conf) at least, which requires sudo.
Is it OK to continue the discussion about this in #1428? I feel my messages aren't related to this topic.

@epiccucumber15

Hi. Any update on this?

@tomchristie
Member

Note to self that requests doesn't support connecting to an HTTPS proxy. Note eg. that there's simply no API for specifying the proxy cert.

This project README has a useful set of past issues referencing this... https://github.com/phuslu/requests_httpsproxy

However, urllib3 does support HTTPS proxies, and the TLS-in-TLS required to connect to an HTTPS website through an HTTPS-secured proxy. Their docs on this are better than anything else I've seen... https://urllib3.readthedocs.io/en/latest/advanced-usage.html#http-and-https-proxies

Am currently doing some digging into this, and looking into what it'll take in order for us to support tls-in-tls across all our backends.

@tomchristie
Member

There's a PR to add support for this to requests, since urllib3 now supports it... psf/requests#5665

I'm going to document an example of what's needed in order to demo urllib3's support for this...

Generate keys/certs for the proxy itself to use, with trustme:

$ venv/bin/trustme-cli 
Generated a certificate for 'localhost', '127.0.0.1', '::1'
Configure your server to use the following files:
  cert=/Users/tomchristie/GitHub/encode/httpx/server.pem
  key=/Users/tomchristie/GitHub/encode/httpx/server.key
Configure your client to use the following files:
  cert=/Users/tomchristie/GitHub/encode/httpx/client.pem

Start a secure proxy with proxy.py:

$ venv/bin/proxy --port 6000 --hostname 127.0.0.1 --cert-file server.pem --key-file server.key 
2021-05-24 09:51:05,093 - pid:25180 [I] load_plugins:334 - Loaded plugin proxy.http.proxy.HttpProxyPlugin
2021-05-24 09:51:05,093 - pid:25180 [I] listen:115 - Listening on 127.0.0.1:6000
2021-05-24 09:51:05,103 - pid:25180 [I] start_workers:136 - Started 6 workers

Send the HTTP request, using urllib3:

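import certifi
import urllib3
from urllib3.util.ssl_ import create_urllib3_context


# TLS context for the client-to-proxy connection: trust the CA generated by trustme.
proxy_ssl_context = create_urllib3_context()
proxy_ssl_context.load_verify_locations("client.pem")
http = urllib3.ProxyManager(
    'https://127.0.0.1:6000/',
    # The target server is verified against the certifi CA bundle.
    ca_certs=certifi.where(),
    proxy_ssl_context=proxy_ssl_context
)
# retries=False surfaces connection errors immediately instead of retrying.
r = http.request('GET', 'https://example.com/', retries=False)
print(r.status)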

@blagodatov

@florimondmanca, do you have a schedule, or at least a general idea of when HTTPS proxy mode (TLS-in-TLS) will be implemented? We have a client that needs it because of their security measures, and we are in a bit of trouble, not being able to provide them with software that can connect to APNs via their proxy server.

@thisisamardeep

I am working on this. I've made fair progress and will raise a PR in the next 1-2 weeks for sure. The TLS-in-TLS concept is roughly working.

@romis2012

If it helps in any way, we have added secure proxies support in the latest version of httpx-socks

@blagodatov

If it helps in any way, we have added secure proxies support in the latest version of httpx-socks

Awesome!!

@stale

stale bot commented Feb 24, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Feb 24, 2022
@tomchristie
Member

Needs an update, but still valid. Thanks, bot.

@stale stale bot removed the wontfix label Feb 24, 2022
@stale

stale bot commented Mar 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 27, 2022
@tomchristie
Member

Steady on, bot.

@martinka

Would love to see this implemented.

@tomchristie
Member

tomchristie commented Sep 1, 2023

We have support for this in httpcore now.
See encode/httpcore#745 and encode/httpcore#786.

We should extend this into httpx, with an API like...

proxy = httpx.Proxy("https://", ssl_context=...)
client = httpx.Client(proxies=proxy)

Aside...

Do I dislike the proxies=... API? Yes I do.
Is adding proxy_ssl_context to the existing API sufficient for this ticket? Yes it is.

@karpetrosyan
Member

karpetrosyan commented Sep 8, 2023

Should we provide low-level proxy_ssl_context access rather than high-level verify and cert arguments?

@tomchristie
Member

tomchristie commented Sep 9, 2023

We should add an ssl_context=... parameter to the httpx.Proxy(...) configuration class. We don't want anything more than that.
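One way to read this decision: anything that high-level verify and cert arguments could express can be folded into a single ssl.SSLContext before it is handed to the proxy configuration. The sketch below shows such a mapping using only the standard library. The helper name `ssl_context_from_verify_and_cert` and the exact meaning given to `verify` and `cert` are assumptions for illustration, not HTTPX's implementation.

```python
import ssl
from typing import Optional, Tuple, Union


def ssl_context_from_verify_and_cert(
    verify: Union[bool, str] = True,
    cert: Optional[Tuple[str, str]] = None,
) -> ssl.SSLContext:
    """Build one SSLContext from high-level options. Hypothetical helper."""
    context = ssl.create_default_context()
    if verify is False:
        # check_hostname must be disabled before verification is turned off.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    elif isinstance(verify, str):
        # Treat a string as the path to a CA bundle to trust in addition.
        context.load_verify_locations(cafile=verify)
    if cert is not None:
        # Client certificate as a (certfile, keyfile) pair.
        certfile, keyfile = cert
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context
```

The resulting context could then be passed wherever a proxy ssl_context is accepted, which keeps the proxy API down to one parameter.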

10 participants