Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Remove Trailing Slashes from Hostname (#1254)
Hostnames can be passed in with trailing slashes. Since we blindly concat a path with a leading slash, paths like http://hostname//$rpc/path (note the double slash before $rpc) are possible. The duplicated slash can cause issues down the line, especially if the path is separated from the host in requests. Browsers/servers generally allow double slashes, and while they are valid syntactically under the RFC spec, there is no specified semantic meaning. So a server could interpret a path like http://hostname//$rpc/path as equivalent to http://hostname/$rpc/path (most common), as a separate path with an empty path component before /$rpc, or as something else. The problem is several interpretations are equally valid under the spec and it's implementation-dependent which is used. Additionally, the path can be separate from the host in a request depending on whether it's in origin form versus absolute form (see https://datatracker.ietf.org/doc/html/rfc7230#section-5.3). In absolute form (GET http://hostname//$rpc/path), the path and host are clear. In origin form (GET //$rpc/path, Host: hostname), it is still clear to us looking at it, but may not be clear to a server. For example the server may think the //$rpc/path is in absolute form due to the double slash, and thus parse it as host: $rpc, path: /path. This is especially problematic as https://datatracker.ietf.org/doc/html/rfc7230#section-5.4 says: "When a proxy receives a request with an absolute-form of request-target, the proxy MUST ignore the received Host header field (if any) and instead replace it with the host information of the request-target. A proxy that forwards such a request MUST generate a new Host field-value based on the received request-target rather than forward the received Host field-value." Which means a proxy would corrupt the request when in this format by overwriting the old host value with $rpc This parsing concern is not idle speculation either. Java's main URI class (https://docs.oracle.com/javase/8/docs/api/java/net/URI.html), parses a URI of the form //$rpc/path as having host/authority $rpc and path /path. Real bugs are observed from usage of this in commonly used libraries such as Netty, and various proxies. Basically, this doesn't cause issues in the most common cases, but is a bug waiting to happen in many other edge cases, and affects widely used infrastructure.
- Loading branch information