-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: updating runtime guard policy #10983
Conversation
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
cc @envoyproxy/maintainers I'll say we aren't super strict about this today - for example #10957 changes connect behavior fixing bugs for HTTP/1.0 (which barely anyone is using) and where someone sends "Connection: close, foo" which is both unlikely and was buggy before. At some point we may need to guard even the weird corner cases and bugfixes but I think we're not there yet. cc @AndresGuedez who may disagree :-P |
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point we may need to guard even the weird corner cases and bugfixes but I think we're not there yet. cc @AndresGuedez who may disagree :-P
TBH I was surprised you didn't guard those changes. Honestly, I would be fine just saying that we have to do it for all such changes at this point given the widespread usage and chance of breakage. (I mentioned to @alyssawilk offline that removing transfer-encoding: identity from HTTP/1 broke a team at Lyft.)
CONTRIBUTING.md
Outdated
change). | ||
change). Generally as a community we try to guard both high risk changes (major | ||
refactors such as replacing Envoy's buffer implementation) and most user-visible | ||
non-config-guarded changes to HTTP processing (for example additions or changes to HTTP headers or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably s/HTTP/protocol and make the thing in parenthesis an example, but would be curious to hear what others think. At a high level I think we should be config guarding any user visible non-config-guarded change for any protocol extension that is not alpha. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I was trying to (possibly unsuccessfully) differentiate between changes like adding and removing headers, to changes like chunk coalescing or fixing infinite buffering. do you think the memory fixes for chunk encoding and the gRPC infinite buffering fix should be guarded as well? They're user-visible insofar as the user could be running a VM and consuming those chunks, and those logs. I don't object to us getting that zealous but I wasn't arguing for it yet.
If you think it's time to guard all L7 changes I can retroactively guard the chunk PR - it's a mild bummer to keep the literally buggy cruft around for 6 months but such is life for an increasingly widely used proxy. :-P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think the memory fixes for chunk encoding and the gRPC infinite buffering fix should be guarded as well?
No (unless we view them as high enough regression risk). I was mainly saying that we should apply the same rigor for non-HTTP protocols that are not alpha, such as Redis, etc.
If you think it's time to guard all L7 changes I can retroactively guard the chunk PR - it's a mild bummer to keep the literally buggy cruft around for 6 months but such is life for an increasingly widely used proxy. :-P
IMO we can still apply maintainer discretion, but we should probably at least have the conversation for any such change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to make the change you suggested - I'm trying to get clarity around "any user visible"
I tend towards "user visible changes likely to cause problems" where for HTTP header changes are often going to, things like changing hashing is definitely going to cause rollout pain, but chunking the body differently is very unlikely to be problematic.
Not sure how much we want to spell out and how much we want to say "maintainers believe will be problematic"
I am also honestly curious about the gRPC thing. I think if we limit log failure to flush size + buffer size it might be OK, but realistically we're going from never dropping to sometimes dropping and it's dicey :-P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also changing to protocol because I agree with your overall sentiment. Might be good to have a !http example but I don't know enough about redis/kafka etc to suggest a likely to happen and user visible change. Ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how much we want to spell out and how much we want to say "maintainers believe will be problematic"
Yeah agreed. I think you can soften the language however you feel is appropriate.
I am also honestly curious about the gRPC thing. I think if we limit log failure to flush size + buffer size it might be OK, but realistically we're going from never dropping to sometimes dropping and it's dicey :-P
Sorry I wasn't clear which fix you were talking about above. Yeah I agree it's dicey and could potentially benefit from a flag, though this is one of the cases that I don't feel strongly about.
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo remaining thoughts. I can't think of a specific non-HTTP example right now but will add one if I think of it.
CONTRIBUTING.md
Outdated
change). Generally as a community we try to guard both high risk changes (major | ||
refactors such as replacing Envoy's buffer implementation) and most user-visible | ||
non-config-guarded changes to protocol processing (for example additions or changes to HTTP headers or | ||
how HTTP is serialized out) for non-alpha features. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of other thoughts:
- Should we have some blurb about maintainer discretion / consulting maintainers as needed?
- Should we update the PR template with a "feature flag / runtime" field to get people thinking about this? It wouldn't be filled out that often but it would require people to add "N/A" explicitly? WDYT?
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
How about having it optional for now to have folks thinking about it but not requiring n/a on the bulk of PRs? |
Sure happy to start there for sure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with small comment, thank you!
/wait
@@ -7,5 +7,6 @@ Risk Level: | |||
Testing: | |||
Docs Changes: | |||
Release Notes: | |||
[Optional Runtime guard:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document more fully in https://github.com/envoyproxy/envoy/blob/master/PULL_REQUESTS.md with links to the contributing guide, etc.?
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks.
Codifying that most L7 changes should be runtime guarded, updating deprecation timeline as we changed it a while back.
Risk Level: n/a
Testing: n/a
Docs Changes: yes
Release Notes: n/a