Netflix has discovered several resource exhaustion vectors affecting a variety of third-party HTTP/2 implementations. These attack vectors can be used to launch DoS attacks against servers that support HTTP/2 communication.
Netflix worked with Google and CERT/CC to coordinate disclosure to the Internet community.
Today, a number of vendors have announced patches to correct this suboptimal behavior. While we haven’t detected these vulnerabilities in our open source packages, we are issuing this security advisory to document our findings and to further assist the Internet security community in remediating these issues.
There are three broad areas of information security: confidentiality (information can’t be read by unauthorized people), integrity (information can’t be changed by unauthorized people), and availability (information and systems are available when you want them). All of the changes announced today are in the “availability” category. These HTTP/2 vulnerabilities do not allow an attacker to leak or modify information.
Rather, they allow a small number of low bandwidth malicious sessions to prevent connection participants from doing additional work. These attacks are likely to exhaust resources such that other connections or processes on the same machine may also be impacted or crash.
HTTP/2 (defined in RFCs 7540 and 7541) represents a significant change from HTTP/1.1. There are several new capabilities, including header compression and multiplexing of data from multiple streams, which make this attractive to the user community. To support these new features, HTTP/2 has grown to encompass some of the complexity of a Layer 3 transport protocol:
- Data is now carried in binary frames;
- There are both per-connection and per-stream windows that define how much data can be sent;
- There are several ICMP-like control messages (ping, reset, and settings frames, for example) which operate at the HTTP/2 connection layer; and,
- This is a fairly robust concept of stream prioritization.
While this added complexity enables some exciting new features, it also raises implementation questions. When implementations run on the internet and are exposed to malicious users, implementers may wonder:
- Should I limit any of the control messages?
- How do I implement the priority queueing scheme in a computationally efficient way?
- How do I implement the flow-control algorithms in a computationally efficient way?
- How could an attacker manipulate the flow-control algorithm at the HTTP/2 layer to cause unintended results? (And, can they manipulate the flow-control algorithms at both the HTTP/2 and TCP layers together to cause unintended results?)
The Security Considerations section of RFC 7540 (see Section 10.5) addresses some of this in a general way. However, unlike the expected “normal” behavior—which is well-documented and which implementations seem to follow very closely—the algorithms and mechanisms for detecting and mitigating “abnormal” behavior are significantly more vague and left as an exercise for the implementer. From a review of various software packages, it appears that this has led to a variety of implementations with a variety of good ideas, but also some weaknesses.
Most of these attacks work at the HTTP/2 transport layer. As illustrated in the diagram below, this layer sits above the TLS transport, but below the concept of a request. In fact, many of these attacks involve either 0 or 1 requests.
Since the early days of HTTP, tooling has been oriented around requests: logs often indicate requests (rather than connections); rate-limiting may occur at the request level; and, traffic controls may be triggered by requests.
By contrast, there is not as much tooling that looks at HTTP/2 connections to log, rate-limit, and trigger remediation based on a client’s behavior at the HTTP/2 connection layer. Therefore, organizations may find it more difficult to discover and block malicious HTTP/2 connections and may need to add additional tooling to handle these situations.
These attack vectors allow a remote attacker to consume excessive system resources. Some are efficient enough that a single end-system could potentially cause havoc on multiple servers. Other attacks are less efficient; however, even less efficient attacks can open the door for DDoS attacks which are difficult to detect and block.
Many of the attack vectors we found (and which were fixed today) are variants on a theme: a malicious client asks the server to do something which generates a response, but the client refuses to read the response. This exercises the server’s queue management code. Depending on how the server handles its queues, the client can force it to consume excess memory and CPU while processing its requests.
These are the attacks which are being disclosed today, all discovered by Jonathan Looney of Netflix, except for CVE-2019-9518 which was discovered by Piotr Sikora of Google:
- CVE-2019-9511 “Data Dribble”: The attacker requests a large amount of data from a specified resource over multiple streams. They manipulate window size and stream priority to force the server to queue the data in 1-byte chunks. Depending on how efficiently this data is queued, this can consume excess CPU, memory, or both, potentially leading to a denial of service.
- CVE-2019-9512 “Ping Flood”: The attacker sends continual pings to an HTTP/2 peer, causing the peer to build an internal queue of responses. Depending on how efficiently this data is queued, this can consume excess CPU, memory, or both, potentially leading to a denial of service.
- CVE-2019-9513 “Resource Loop”: The attacker creates multiple request streams and continually shuffles the priority of the streams in a way that causes substantial churn to the priority tree. This can consume excess CPU, potentially leading to a denial of service.
- CVE-2019-9514 “Reset Flood”: The attacker opens a number of streams and sends an invalid request over each stream that should solicit a stream of RST_STREAM frames from the peer. Depending on how the peer queues the RST_STREAM frames, this can consume excess memory, CPU, or both, potentially leading to a denial of service.
- CVE-2019-9515 “Settings Flood”: The attacker sends a stream of SETTINGS frames to the peer. Since the RFC requires that the peer reply with one acknowledgement per SETTINGS frame, an empty SETTINGS frame is almost equivalent in behavior to a ping. Depending on how efficiently this data is queued, this can consume excess CPU, memory, or both, potentially leading to a denial of service.
- CVE-2019-9516 “0-Length Headers Leak”: The attacker sends a stream of headers with a 0-length header name and 0-length header value, optionally Huffman encoded into 1-byte or greater headers. Some implementations allocate memory for these headers and keep the allocation alive until the session dies. This can consume excess memory, potentially leading to a denial of service.
- CVE-2019-9517 “Internal Data Buffering”: The attacker opens the HTTP/2 window so the peer can send without constraint; however, they leave the TCP window closed so the peer cannot actually write (many of) the bytes on the wire. The attacker then sends a stream of requests for a large response object. Depending on how the servers queue the responses, this can consume excess memory, CPU, or both, potentially leading to a denial of service.
- CVE-2019-9518 “Empty Frames Flood”: The attacker sends a stream of frames with an empty payload and without the end-of-stream flag. These frames can be DATA, HEADERS, CONTINUATION and/or PUSH_PROMISE. The peer spends time processing each frame disproportionate to attack bandwidth. This can consume excess CPU, potentially leading to a denial of service. (Discovered by Piotr Sikora of Google)
In most cases, an immediate workaround is to disable HTTP/2 support. However, this may cause performance degradation, and it might not be possible in all cases. To obtain software fixes, please contact your software vendor. More information can also be found in the CERT/CC vulnerability note.