Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Balsa HTTP/1 codec #21245

Open
bencebeky opened this issue May 11, 2022 · 23 comments
Open

Use Balsa HTTP/1 codec #21245

bencebeky opened this issue May 11, 2022 · 23 comments
Assignees
Labels
area/http enhancement Feature requests. Not bugs or questions. no stalebot Disables stalebot from closing an issue

Comments

@bencebeky
Copy link
Contributor

bencebeky commented May 11, 2022

The current HTTP/1 codec is http-parser. The plan is to transition Envoy to use Balsa instead. [edit: fix link]

@bencebeky bencebeky added enhancement Feature requests. Not bugs or questions. triage Issue requires triage labels May 11, 2022
@alyssawilk alyssawilk added area/http and removed triage Issue requires triage labels May 11, 2022
@alyssawilk
Copy link
Contributor

cc @envoyproxy/envoy-maintainers in case this wasn't on everyone's radar already

@daixiang0
Copy link
Member

@bencebeky hi, could you compare Balsa with http-parser to show the advantage of transition?

@wbpcode
Copy link
Member

wbpcode commented May 12, 2022

Yeah, is there any reason to make this transition?

@adisuissa
Copy link
Contributor

This is related to #10988.
The current http_parser is no longer maintained and adding features such as #18819 is non-trivial.

@alyssawilk
Copy link
Contributor

Ah, I should have cross-linked when I was triaging. The decision points largely documented at https://docs.google.com/document/d/1Z8neuR6GTL61EU8jC88SD2as6Nmq15B-BNWC-SBVKns/edit after a couple of rounds of discussion at the community meeting (notes here https://docs.google.com/document/d/1Yc4zkV-A_cC_R3C0u6D15WQAsRSxADs8p2l8UMMa4P4/edit# as it's not convenient for all time zones)

@kanongil
Copy link
Contributor

FYI, the balsa link is broken. They just moved it to: https://github.com/google/quiche/tree/main/quiche/balsa.

@bencebeky
Copy link
Contributor Author

FYI, the balsa link is broken. They just moved it to: https://github.com/google/quiche/tree/main/quiche/balsa.

Thank you. I edited to OP for convenience.

mattklein123 pushed a commit that referenced this issue May 12, 2022
* Simplify Parser::execute() and Parser::errnoName().

The motivation is to not expose integer error codes internal to
http-parser.  This will make it easier to add another Parser
implementation based on a different library.

LegacyHttpParserImpl::execute() returns HTTP_PARSER_ERRNO.  This serves
two purposes at the sole call site in ConnectionImpl::dispatchSlice():
first, it is converted to a ParserStatus and compared to Success and
Pause.  This PR calls getStatus() directly instead.  Then the errno is
fed back to LegacyHttpParserImpl::errnoName(), which converts it to a
string.  This PR removes the errnoName() argument and makes this method
call HTTP_PARSER_ERRNO instead (via Impl::getErrno()).  This allows the
error code to be removed from the return struct of Parser::execute()
altogether.

Also, errnoName() is renamed to the more generic errorMessage(), to
reflect that other Parser implementations might not call internal error
codes "errno".

Note that comparing an int to statusToInt(Success) (or Pause) is
equivalent to feeding that int to intToStatus and comparting the result
to Success (or Pause).

Tracking issue: #21245

Signed-off-by: Bence Béky <bnc@google.com>

* Restore accidentally deleted destructor.

Signed-off-by: Bence Béky <bnc@google.com>

* Update Parser::errorMessage() comment.

Signed-off-by: Bence Béky <bnc@google.com>

* Add const to two local variables.

Signed-off-by: Bence Béky <bnc@google.com>

Co-authored-by: Bence Béky <bnc@google.com>
KBaichoo pushed a commit that referenced this issue May 16, 2022
The motivation is that the int errno is implementation-specific (in this
case defined by http-parser).  Removing this method from the abstract
Parser interface and changing
ParserCallbacks::setAndCheckCallbackStatus* methods to return
ParserStatus instead of this implementation-specific int will make it
simpler to add another Parser implementation, one not based on
http-parser.

Instead of applying statusToInt() within setAndCheckCallbackStatus() and
setAndCheckCallbackStatusOr() before returning in ConnectionImpl, return
ParserStatus, and apply statusToInt() within LegacyHttpParserImpl.  This
allows statusToInt() to be moved to legacy_parser_impl.cc anonymous
namespace, right next to its counterpart intToStatus().

Tracking issue: #21245

Signed-off-by: Bence Béky <bnc@google.com>
ravenblackx pushed a commit to ravenblackx/envoy that referenced this issue Jun 8, 2022
* Simplify Parser::execute() and Parser::errnoName().

The motivation is to not expose integer error codes internal to
http-parser.  This will make it easier to add another Parser
implementation based on a different library.

LegacyHttpParserImpl::execute() returns HTTP_PARSER_ERRNO.  This serves
two purposes at the sole call site in ConnectionImpl::dispatchSlice():
first, it is converted to a ParserStatus and compared to Success and
Pause.  This PR calls getStatus() directly instead.  Then the errno is
fed back to LegacyHttpParserImpl::errnoName(), which converts it to a
string.  This PR removes the errnoName() argument and makes this method
call HTTP_PARSER_ERRNO instead (via Impl::getErrno()).  This allows the
error code to be removed from the return struct of Parser::execute()
altogether.

Also, errnoName() is renamed to the more generic errorMessage(), to
reflect that other Parser implementations might not call internal error
codes "errno".

Note that comparing an int to statusToInt(Success) (or Pause) is
equivalent to feeding that int to intToStatus and comparting the result
to Success (or Pause).

Tracking issue: envoyproxy#21245

Signed-off-by: Bence Béky <bnc@google.com>

* Restore accidentally deleted destructor.

Signed-off-by: Bence Béky <bnc@google.com>

* Update Parser::errorMessage() comment.

Signed-off-by: Bence Béky <bnc@google.com>

* Add const to two local variables.

Signed-off-by: Bence Béky <bnc@google.com>

Co-authored-by: Bence Béky <bnc@google.com>
ravenblackx pushed a commit to ravenblackx/envoy that referenced this issue Jun 8, 2022
The motivation is that the int errno is implementation-specific (in this
case defined by http-parser).  Removing this method from the abstract
Parser interface and changing
ParserCallbacks::setAndCheckCallbackStatus* methods to return
ParserStatus instead of this implementation-specific int will make it
simpler to add another Parser implementation, one not based on
http-parser.

Instead of applying statusToInt() within setAndCheckCallbackStatus() and
setAndCheckCallbackStatusOr() before returning in ConnectionImpl, return
ParserStatus, and apply statusToInt() within LegacyHttpParserImpl.  This
allows statusToInt() to be moved to legacy_parser_impl.cc anonymous
namespace, right next to its counterpart intToStatus().

Tracking issue: envoyproxy#21245

Signed-off-by: Bence Béky <bnc@google.com>
@github-actions
Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Jun 11, 2022
@bencebeky
Copy link
Contributor Author

no stalebot

@KBaichoo KBaichoo added no stalebot Disables stalebot from closing an issue and removed stale stalebot believes this issue/PR has not been touched recently labels Jun 14, 2022
RyanTheOptimist pushed a commit that referenced this issue Jun 30, 2022
Add unused BalsaParser.

Add first draft of BalsaParser implementation, which is currently
unused.  This fails a large number of tests and will require a lot of
work until it can be wired up to ConnectionImpl.

Also make Parser::getStatus() const.  That change is too trivial for a
separate PR.

Tracking issue: #21245

Signed-off-by: Bence Béky <bnc@google.com>
reskin89 pushed a commit to reskin89/envoy that referenced this issue Jul 11, 2023
* [balsa] Add config field to enable custom methods.

This is no behavioral change by default: only methods from a hard-coded
list (that matches the list hard-coded in http-parser, and is slightly
different from the one that will be used by UHV) are accepted.

Then the new knob is true, BalsaParser does the exact same validation as
UHV will by default: method has to be non-empty and only contain allowed
characters.

When UHV method validation logic is turned on in the future, all
validation can be removed from BalsaParser. When non-UHV mode is
deprecated, this new proto field can be removed.

Tracking issue: envoyproxy#21245

Signed-off-by: Bence Béky <bnc@google.com>
Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
reskin89 pushed a commit to reskin89/envoy that referenced this issue Jul 11, 2023
…nvoyproxy#28070)

* Add test to document broken behavior.

* [balsa] Do not signal error on POST method with no body.

This is for consistency with http-parser behavior.

Tracking issue: envoyproxy#21245

* Fix two existing tests.

Also remove newly added test, which turns out to be redundant with
existing tests.

---------

Signed-off-by: Bence Béky <bnc@google.com>
Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
yanavlasov pushed a commit that referenced this issue Jul 12, 2023
Tracking issue: #21245

Signed-off-by: Bence Béky <bnc@google.com>

Commit Message: [balsa] Disallow invalid character in response header names.
Additional Description: This is for consistency with http-parser behavior.
Risk Level: low, changed code protected by existing default-false runtime flag.
Testing: //test/common/http/http1:codec_impl_test
Release Notes: n/a
Platform Specific Features: n/a
Runtime guard: envoy.reloadable_features.http1_use_balsa_parser

Co-authored-by: Bence Béky <bnc@google.com>
alyssawilk pushed a commit that referenced this issue Sep 6, 2023
This feature has been widely deployed by two large projects at different companies without any production issues. A third project has enabled it in staging and in end-to-end tests. Manual load testing does not show significant change in CPU or memory usage compared to the legacy HTTP/1 parser.

Tracking issue: #21245

Commit Message: Default-enable BalsaParser.
Additional Description:
Risk Level: medium
Testing: unit tests and e2e test within Envoy and by multiple embedders
Docs Changes: n/a
Release Notes: Default-enable BalsaParser.
Platform Specific Features: n/a
Signed-off-by: Bence Béky <bnc@google.com>
Co-authored-by: Bence Béky <bnc@google.com>
@andreycpp
Copy link

andreycpp commented Nov 5, 2023

Hi @bencebeky. I've recently bumped into the lack of custom methods support issue #18819 in one of our prod environments. Interestingly http-parser does support some custom methods but they are all hardcoded.

It looks like the solution to custom methods is to migrate to this Balsa library. Could you please clarify where is this work, and when is it expected to land? Is there possibly a way to use Balsa even today (maybe with some limitations/corner cases)? Thanks!

@wbpcode
Copy link
Member

wbpcode commented Nov 6, 2023

Iirc, balsa is enabled by default after envoy 1.28.

@bencebeky
Copy link
Contributor Author

Hi, as noted above, Balsa is default enabled. However, this is only the first step towards allowing arbitrary methods. BalsaParser has the list of allowed methods copied over verbatim from http-parser, see

static constexpr absl::string_view kValidMethods[] = {
"ACL", "BIND", "CHECKOUT", "CONNECT", "COPY", "DELETE", "GET",
"HEAD", "LINK", "LOCK", "MERGE", "MKACTIVITY", "MKCALENDAR", "MKCOL",
"MOVE", "MSEARCH", "NOTIFY", "OPTIONS", "PATCH", "POST", "PROPFIND",
"PROPPATCH", "PURGE", "PUT", "REBIND", "REPORT", "SEARCH", "SOURCE",
"SUBSCRIBE", "TRACE", "UNBIND", "UNLINK", "UNLOCK", "UNSUBSCRIBE"};
. The two parsers behave identically to minimize friction due to migrating from one parser to the other.

There's a related project called UHV ("universal header validator" or "unified header validation") that moves validation logic, including method validation, from the individual HTTP/1, HTTP/2, HTTP/3 codec implementations to a centralized place. It will provide a configuration setting for enabling custom methods.

@andreycpp
Copy link

Thanks @bencebeky!

Would envoy community accept addition of new methods to Balsa? The method we need right now is the BITS_POST which is used by BITS Upload Protocol. It acts exactly like POST, just has different name.

The UHV work - is it tracked via #10646?

@alyssawilk
Copy link
Contributor

Yep, that's the correct issue for UHV. You should be able to test with UHV on using envoy.reloadable_features.enable_universal_header_validator though there may be some minor differences not yet runtime guarded. I believe once you're using balsa and uhv you can follow thorugh on the TODO to complete defer method validation ot UHV and extend UHV to allow custom methods. Happy to do the review along with @yanavlasov

alyssawilk pushed a commit that referenced this issue May 15, 2024
Delay resetting BalsaFrame until the first byte of the next message is parsed. This is necessary for being able to access information about the parsed headers after the message is fully parsed, for example, response status code, and content-length and transfer-encoding headers.

Fixes #34096

Balsa implementation tracking issue: #21245

Commit Message: [balsa] Delay resetting BalsaFrame.
Additional Description:
Risk Level: low
Testing: test/extensions/transport_sockets/http_11_proxy:connect_test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
Runtime guard: yes

Signed-off-by: Bence Béky <bnc@google.com>
Co-authored-by: Bence Béky <bnc@google.com>
phlax pushed a commit to phlax/envoy that referenced this issue Nov 18, 2024
Delay resetting BalsaFrame until the first byte of the next message is parsed. This is necessary for being able to access information about the parsed headers after the message is fully parsed, for example, response status code, and content-length and transfer-encoding headers.

Fixes envoyproxy#34096

Balsa implementation tracking issue: envoyproxy#21245

Commit Message: [balsa] Delay resetting BalsaFrame.
Additional Description:
Risk Level: low
Testing: test/extensions/transport_sockets/http_11_proxy:connect_test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
Runtime guard: yes

Signed-off-by: Bence Béky <bnc@google.com>
Co-authored-by: Bence Béky <bnc@google.com>

Signed-off-by: Bence Béky <bencebeky@users.noreply.github.com>

Signed-off-by: Ryan Northey <ryan@synca.io>
phlax pushed a commit to phlax/envoy that referenced this issue Nov 18, 2024
Delay resetting BalsaFrame until the first byte of the next message is parsed. This is necessary for being able to access information about the parsed headers after the message is fully parsed, for example, response status code, and content-length and transfer-encoding headers.

Fixes envoyproxy#34096

Balsa implementation tracking issue: envoyproxy#21245

Commit Message: [balsa] Delay resetting BalsaFrame.
Additional Description:
Risk Level: low
Testing: test/extensions/transport_sockets/http_11_proxy:connect_test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
Runtime guard: yes

Signed-off-by: Bence Béky <bnc@google.com>
Co-authored-by: Bence Béky <bnc@google.com>

Signed-off-by: Bence Béky <bencebeky@users.noreply.github.com>

Signed-off-by: Ryan Northey <ryan@synca.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/http enhancement Feature requests. Not bugs or questions. no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

No branches or pull requests

9 participants