Use Any for opaque extension encoding #4475
#1680 also has a long discussion to be recalled in this context. For existing opaque configurations, we carefully hid |
This plan SGTM as long as we can do it without breaking any existing filters. |
@lizan that sounds reasonable, @snowp do you want to do what @lizan describes for the retry reselect extensions? Would love to find someone who can own end-to-end delivery of this issue. This is becoming a big issue for Istio (also CC @louiscryan). |
Yeah I'd be happy to update them, I was only using Just so I understand: is the benefit of using |
@snowp there's also (de)serialization costs. To be clear, we're not asking to switch to |
Got it, I'll take inspiration from |
I should point out to convert between json and proto, you need to distribute proto descriptors to Envoy. I don't think this is an issue for xDS, since we should always distribute proto binary, not json text. For API proxy, this could be a challenge to handle dynamically loaded APIs. |
@wora yep, thanks for weighing in. We can actually provide folks the flexibility by allowing them to embed We're having some internal discussions on whether we can find an owner for this at Google, if anyone is interested in stepping up in the wider community to own this that'd also be rad. |
`google.protobuf.Struct` is a proto message. It can be embedded into `Any`
just like other proto messages. Is there any problem with this approach?
A common way to settle this type of discussions is to create sample code
for critical user journeys. Evaluate the results visually, sometimes by
asking people around you.
|
@wora I think we're in exact agreement here. |
From a performance standpoint we know that what we have right now is very inefficient. That is because proto to struct translation is not a first class operation. We have an issue open with gogo proto to address that. Any will imminently bypass this perf issue. The other part is how does it affect the human readability of the yaml rendered form. Does proto -> encoded as struct -> encoded as Any have a sane yaml representation? |
@mandarjog the only difference is you need to include a type URL in the YAML representation, see https://developers.google.com/protocol-buffers/docs/proto3#json. See the existing xDS resource objects which are represented as Any as an example here (e.g. https://github.com/envoyproxy/envoy/blob/master/test/config/integration/server_xds.cds.yaml). I think we're all in agreement on what needs to be done and its benefits. Now we need to find an owner who can drive the above steps end-to-end. |
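To make the type URL point concrete, here is a minimal sketch of the proto3 JSON shape of an `Any`, which the YAML form mirrors. It uses plain Python dictionaries rather than the protobuf library, and `example.FooConfig`, the `retries` field, and the helper names are hypothetical, chosen only for illustration.

```python
import json

TYPE_URL_PREFIX = "type.googleapis.com/"
STRUCT_TYPE = "google.protobuf.Struct"


def any_from_message(type_name, fields):
    """JSON form of an Any wrapping an ordinary message.

    The message's own fields sit next to the "@type" key.
    """
    out = {"@type": TYPE_URL_PREFIX + type_name}
    out.update(fields)
    return out


def any_from_struct(struct_fields):
    """JSON form of an Any wrapping a google.protobuf.Struct.

    Well-known types with a special JSON mapping (Struct is one of them)
    are placed under a "value" key instead of being inlined.
    """
    return {"@type": TYPE_URL_PREFIX + STRUCT_TYPE, "value": dict(struct_fields)}


# Hypothetical extension config, once as a typed message and once as a Struct.
typed = any_from_message("example.FooConfig", {"retries": 3})
untyped = any_from_struct({"retries": 3})

print(json.dumps(typed, indent=2))
print(json.dumps(untyped, indent=2))
```

In both cases the only addition relative to a bare Struct is the `@type` entry, so the rendered form should stay readable.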
API for #4475. Risk Level: Low (not implemented) Testing: CI Docs Changes: Added but hidden Release Notes: N/A, will add when adding impl. Signed-off-by: Lizan Zhou <lizan@tetrate.io>
Another PR for #4475. Refactor tracers config: move interface to include/ Introduce FactoryBase to reduce boiler plate Use Config::Utility to convert opaque config Risk Level: Low Testing: CI Docs Changes: N/A Release Notes: N/A Signed-off-by: Lizan Zhou <lizan@tetrate.io>
/cc watch |
Add support of Any as opaque config for extensions. Deprecates Struct configs. Fixes #4475. Risk Level: Low Testing: CI Docs Changes: Added. Release Notes: Added. Signed-off-by: Lizan Zhou <lizan@tetrate.io>
Another PR for envoyproxy#4475. Refactor tracers config: move interface to include/ Introduce FactoryBase to reduce boiler plate Use Config::Utility to convert opaque config Risk Level: Low Testing: CI Docs Changes: N/A Release Notes: N/A Signed-off-by: Lizan Zhou <lizan@tetrate.io> Signed-off-by: Fred Douglas <fredlas@google.com>
Add support of Any as opaque config for extensions. Deprecates Struct configs. Fixes envoyproxy#4475. Risk Level: Low Testing: CI Docs Changes: Added. Release Notes: Added. Signed-off-by: Lizan Zhou <lizan@tetrate.io> Signed-off-by: Fred Douglas <fredlas@google.com>
One other data point here, some clients for xDS might not have proto descriptors available for all opaque extensions, as their extension architecture is JSON oriented rather than protobuf. This is certainly going to be the case for folks who use the REST-JSON API, but is also true of systems such as Google gRPC core library. We will need to provide some method to have the client indicate this for gRPC endpoints, while it is implicit in REST-JSON endpoints. Should we continue to always support parallel Struct delivery to Any opaque fields? Should we require management servers to be capable of Struct-in-Any embedding? Reopening for further discussion. CC @markdroth. |
FWIW I think it's fine to continue to support struct/any in all cases. The code needed to support both is not large. |
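As a rough illustration of why supporting both forms need not take much code, here is a sketch of a resolver that accepts either an `Any` or a legacy Struct. This is not Envoy's actual conversion utility: `OpaqueAny`, `resolve_config`, and the parameter names are hypothetical, and the payload is modeled as an already-decoded dictionary instead of serialized bytes.

```python
from dataclasses import dataclass
from typing import Any as PyAny, Dict, Optional

STRUCT_TYPE_URL = "type.googleapis.com/google.protobuf.Struct"


@dataclass
class OpaqueAny:
    """Stand-in for google.protobuf.Any; payload holds decoded fields."""
    type_url: str
    payload: Dict[str, PyAny]


def resolve_config(expected_type_url: str,
                   any_config: Optional[OpaqueAny] = None,
                   struct_config: Optional[Dict[str, PyAny]] = None) -> Dict[str, PyAny]:
    """Return the extension's config fields from whichever form was supplied."""
    if any_config is not None and struct_config is not None:
        raise ValueError("set only one of any_config or struct_config")
    if any_config is not None:
        if any_config.type_url == STRUCT_TYPE_URL:
            # Struct embedded inside an Any: treat it like the legacy form.
            return dict(any_config.payload)
        if any_config.type_url != expected_type_url:
            raise ValueError(f"unexpected type URL: {any_config.type_url}")
        return dict(any_config.payload)
    if struct_config is not None:
        # Legacy bare Struct field.
        return dict(struct_config)
    # Neither form set: fall back to an empty config.
    return {}
```

All three inputs (typed `Any`, Struct-in-`Any`, bare Struct) end up on the same path, which is the property the thread is asking for.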
We've been having a lot of internal discussion about this lately. Here's a proposal for discussion (please feel free to shoot down) for a general-purpose solution (can be applied to all places in the xDS APIs where we need extensibility):
I can think of two basic ways the management server could do this:
As an example, let's say that we support configurable intra-locality LB policies using something like this:
Now let's say that there's a custom LB policy called "foo" whose config is expressed as a Under approach (1), the data is stored in the management server as both a Under approach (2), the data is stored in the management server as a Note that approach (2) has one notable caveat, which is that in order to convert a custom type to a Note that this restriction only applies in cases where the management server needs to support clients that cannot handle arbitrary custom types inside of an This approach seems like it provides a reasonable menu of options for implementations to choose between all of the various desirable properties here:
Thoughts...? |
I think fundamentally, it's reasonable to expect a maximal management server to possess the proto descriptors, as these are necessary for REST-JSON support. At the same time, as you point out, we want to support the management servers + clients that can work in a pure Any world without any knowledge of the descriptors at the management server. At this point, I think the main question is whether we want the client capability to indicate:
I think (1) is the proposal above. (2) would force a management server to either have descriptors or a parallel Then, we have the situation where some management servers won't support this capability. In this scenario, we need to reject the client in #6271 after it presents the |
Good question. I think this raises the issue of how clients react to errors (not just for this particular issue, but actually any time the streaming call fails, regardless of why). In cases like this where there is a streaming call that the client is trying to basically always have open, what we generally recommend is that the client automatically retry the call whenever it fails, with appropriate exponential backoff to avoid hammering the server. If we're taking that approach, then the status code returned by the server probably doesn't really matter much, because the client behavior will be the same regardless.
Note, however, that this means that the xDS server will need to expect additional load from clients periodically retrying the call. Exponential backoff means that the server won't get hammered, but there can still be a non-trivial amount of load if there are a large number of clients. If this is something that we need to optimize for, then we can consider having some sort of special case here where a particular status code (e.g., It's also worth noting that at least initially, clients who start using the new capability will need to be prepared to handle servers that don't know anything about it, in which case the server would not fail the stream; it would just send responses that the client can't deal with. But that's purely a short-term thing to deal with backward compatibility; in the long term, we will eventually get to some version of the API in which the capabilities are required to be understood by servers, at which point we'll still need to define this behavior.
Another way we could approach this would be to have the server send back a list of capabilities that the client requested that the server is willing to comply with. If the client requests a capability but the server doesn't repeat it back, the client can decide how to handle it.
Although ultimately, if the client requires the capability, then the only thing it can really do is cancel the call and retry it, at which point it's basically the same as if the server failed the call. I think the error-handling scenarios here probably need more discussion. |
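The retry-with-backoff behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the behavior of any particular xDS client: the base delay, multiplier, cap, and jitter values are made up, `open_stream` is a hypothetical callable that raises `ConnectionError` when the stream fails, and `max_attempts` exists only so the example terminates (a real client would keep retrying).

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, jitter=0.2, rng=None):
    """Yield an endless sequence of retry delays in seconds."""
    rng = rng or random.Random()
    delay = base
    while True:
        # Randomize each delay a little so that many clients that failed
        # at the same time do not all reconnect in lockstep.
        spread = delay * jitter
        yield max(0.0, delay + rng.uniform(-spread, spread))
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(cap, delay * factor)


def run_with_retry(open_stream, max_attempts, sleep, delays=None):
    """Call open_stream() until it succeeds, sleeping between attempts.

    Returns the number of attempts made. The same delay policy is applied
    whatever the failure was, matching the point that the status code does
    not change client behavior.
    """
    delays = delays or backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            open_stream()
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(next(delays))
```

Passing `time.sleep` as `sleep` gives real waiting; tests can pass a recording function instead. Even with the cap and jitter, each client still retries about once per `cap` seconds in steady state, which is the residual server load the comment warns about.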
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions. |
As discussed in various previous issues, e.g. #3155, and my post at https://blog.envoyproxy.io/dynamic-extensibility-and-protocol-buffers-dcd0bf0b8801, we want to head towards using `Any` as the canonical extension type. This is more efficient on the wire and we're now seeing use cases (e.g. Istio CC @costinm) where `Any` would bring a real pay-off.
We will continue to support `Struct` embedded inside an `Any` for those folks who prefer working with such types.
Rather than ad hoc switching to `Any`, we should aim to maintain a consistent API theme around opaque extension configuration.
My recommendation is this:
`Any`.
With the right use of templated conversion utils, I think this can be done without a lot of boilerplate.