Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http codecs: considering forking #10988

Closed
mattklein123 opened this issue Apr 28, 2020 · 19 comments
Closed

http codecs: considering forking #10988

mattklein123 opened this issue Apr 28, 2020 · 19 comments
Assignees
Labels
area/community area/http area/security no stalebot Disables stalebot from closing an issue

Comments

@mattklein123
Copy link
Member

mattklein123 commented Apr 28, 2020

This is a tracking issue to discuss whether we want to fork llhttp (once #5155 is complete) and nghttp2 into the Envoy project. The goal of the fork would be to:

  1. Very tightly audit and control what codec commits we bring in. Given the codecs themselves see very little change, it would probably not be a big deal to have Envoy maintainers and the security team code review and merge upstream commits.
  2. We need to move to a model in which all user visible changes are either feature flagged or configuration guarded. See the discussion and updates here: docs: updating runtime guard policy #10983. This may require additional patches not found in upstream, at least temporarily.

As a community we should discuss how to proceed forward with balancing forking against making sure we have the control that we need. cc @envoyproxy/security-team @alyssawilk

@mattklein123
Copy link
Member Author

cc @danzh2010 we also need to have a good procedure for Quiche/HTTP3 to make sure all relevant feature flags are exposed and documented as Envoy feature flags once HTTP3 is considered production ready.

@alyssawilk
Copy link
Contributor

Thankfully QUICHE should have functional changes flag guarded per internal policy. I think for QUICHE it boils down to integrating the quic flags into Envoy runtime (which we've got on our list) as well as making sure any which guard "behavioral changes" (whatever we come up with in #10983) stick around for 3-6 months where now I think they may only last for 1-2.
cc @ianswett for lifetime discussion + potential process changes

@alyssawilk
Copy link
Contributor

http parser should be pretty easy - if nothing else they've been averaging 1-2 commits a year so the overhead will be minimal. I think http/2 may end up more onerous as it's a faster moving code base and not much easier to read. I wonder if we want to start out with something a bit lighter, and have build options where one can easily pull the old or the new dependency so folks could try things out. For example with the identity breakage, I wonder if for http_parser we could just add support for what Lyft needed upstream, and support the old version until the newest was backwards compatible?

@mattklein123
Copy link
Member Author

I think http/2 may end up more onerous as it's a faster moving code base and not much easier to read.

One thing to keep in mind about nghttp2 is we only care about lib/ which doesn't see very many commits: https://github.com/nghttp2/nghttp2/commits/master/lib

@mattklein123
Copy link
Member Author

I wonder if we want to start out with something a bit lighter, and have build options where one can easily pull the old or the new dependency so folks could try things out. For example with the identity breakage, I wonder if for http_parser we could just add support for what Lyft needed upstream, and support the old version until the newest was backwards compatible?

Yes we can try this. We probably would need to do this anyway for the http_parser -> llhttp transition.

@antoniovicente
Copy link
Contributor

I assume we can runtime guard creation of H1 codec based on http_parser or llhttp. I assume we should expect some behavior changes as the parser is replaced.

@mattklein123
Copy link
Member Author

I assume we can runtime guard creation of H1 codec based on http_parser or llhttp. I assume we should expect some behavior changes as the parser is replaced.

s/can/must. :)

@antoniovicente
Copy link
Contributor

I assume we can runtime guard creation of H1 codec based on http_parser or llhttp. I assume we should expect some behavior changes as the parser is replaced.

s/can/must. :)

Of course. I just hadn't considered supporting both parsers for some time, seems necessary when making such a big change.

@yanavlasov
Copy link
Contributor

How would we manage our own changes to forked versions? Patch stack? Direct changes?
What does "...control what codec commits we bring in." mean in practice? Does it mean we will not move some commits, we deem risky or unnecessary?

@mattklein123
Copy link
Member Author

How would we manage our own changes to forked versions?

The easiest way would be to fork the repos into the envoyproxy org and repoint the Envoy build to those repos. Then we can have a process for reviewing all commits that go in with our own maintainers. We can decide to fast forward to upstream, possibly modify with temporary feature flags, etc.

@yanavlasov
Copy link
Contributor

So assuming forked nghttp2, would the process of pulling in new change look something like this?
Assuming that forked nghttp2 repo does not have any modifications and mirrors some commit A in nghttp2 source of truth. There is new commit B in nghttp2 SoT that we want to import and flag protect.

  1. import B and create PR-1 for it
  2. PR-1 is modified to include flag protection.
  3. PR-1 is reviewed and committed
  4. After deprecation interval PR-2 is created that removes flag protection.
  5. PR-2 is reviewed and committed

Or do we want to have two PRs PR-1 with SoT change and PR-2 on top of it with flag changes (which may undo large portions of PR-1)?

I have not seen llhttp code, but it may be more complicated since it is generated from javascript. The diff may be less obvious if there are also changes to the code generator and more difficult to flag protect. Unless of course we will do it in javascript.

@mattklein123
Copy link
Member Author

@yanavlasov I think what you propose would work. There are probably a number of ways we could do this. Honestly I think we should probably first decide if we want to fork, and then if so we can try it and see how it goes in terms of the workflows. We will probably need to iterate.

@antoniovicente
Copy link
Contributor

Envoy forking libraries like nghttp2 may cause some headaches for those of us that choose to compile multiple pieces of software against a single canonical version of each third-party library.

@alyssawilk
Copy link
Contributor

alyssawilk commented May 20, 2020 via email

@mattklein123
Copy link
Member Author

Envoy forking libraries like nghttp2 may cause some headaches for those of us that choose to compile multiple pieces of software against a single canonical version of each third-party library.

Monorepo driven development. ;) But yes, understood this may be an issue which we will need to discuss. One option would be to manually vendor both libraries within Envoy and cut the dependency, and then bring in patches manually. This is potentially more painful but would fix the monorepo issue potentially.

@ahedberg
Copy link
Contributor

What's the motivation for wanting to audit/control the codec commits? Concern about introduction of security vulnerabilities upstream, concern about changes in user-facing behavior, both, something else?

If security vulnerabilities, I'd be equally (if not more) concerned about losing security patches if we fork. This isn't that different from delaying bumping a dependency, I suppose, but the toil involved in merging a patch is probably higher than updating the repositories file.

Keeping a fork compatible with the canonical repo is probably hard without an automated test that ensures the libraries behave the same way.

@mattklein123
Copy link
Member Author

What's the motivation for wanting to audit/control the codec commits? Concern about introduction of security vulnerabilities upstream, concern about changes in user-facing behavior, both, something else?

Primarily concerns about changes to user-facing behavior that we might need to feature flag.

In an optimal world we would have maintainers also be maintainers of the codecs, and perhaps we should try that first. I mostly opened this issue from various discussions with @alyssawilk to discuss our options.

@yanavlasov
Copy link
Contributor

I think if we take protection of changes with feature flags as a requirement, then forking codecs is the right way to go. I think however that adding flag protection into codec's code would not be trivial as a result codec version upgrades would be costly. I have not seen llhtp generated source code, it may be even more of a challenge given is it generated. But I guess there is only one way to find out.

@mattklein123 mattklein123 self-assigned this Sep 2, 2020
@mattklein123 mattklein123 assigned htuch and unassigned mattklein123 Oct 29, 2020
@mattklein123
Copy link
Member Author

I think we have a plan forward here with a Google provided h2 codec in QUICHE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/community area/http area/security no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

No branches or pull requests

6 participants