ext_proc: metadata and attributes #29069

Closed
wants to merge 11 commits into from

Conversation

jbohanon
Contributor

@jbohanon jbohanon commented Aug 16, 2023

Commit Message:
Introduce the ability to send dynamic metadata and attributes in the External Processing Request. Also implements the API for returning dynamic metadata as part of the External Processing Response
Additional Description:
Risk Level:
Low
Testing:

  • Unit tests for processing request/response dynamic metadata
  • Integration tests for request/response attributes

Docs Changes: TODO
Release Notes: N/A
Platform Specific Features: N/A
Fixes #19881

@repokitteh-read-only

Hi @jbohanon, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #29069 was opened by jbohanon.

see: more, trace.

@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).


@jbohanon jbohanon marked this pull request as draft August 16, 2023 17:29
@tyxia
Member

tyxia commented Aug 17, 2023

/assign @tyxia

//
// It works in a way similar to ``metadata_context_namespaces`` but allows envoy and external processing server to share the protobuf message definition
// in order to do a safe parsing.
repeated string typed_metadata_context_namespaces = 17;
Member

@rshriram rshriram Aug 17, 2023

Updated comment after really reading the comments..

We could collapse both into a single field, couldn't we?
metadata_namespaces_to_forward
untyped:
- namespaces
typed:
- namespaces

Contributor Author

I've updated the proto changes to enable route-level overrides and in the process cleaned up this naming a bit.

encoder_callbacks_->connection()->streamInfo().dynamicMetadata().filter_metadata();
const auto& request_encoder_metadata =
encoder_callbacks_->streamInfo().dynamicMetadata().filter_metadata();
for (const auto& context_key : config_->metadataContextNamespaces()) {
Contributor

So we have four sources for {typed, }metadata.

What if the context_key is found in multiple sources?

Could we consider having some sort of finer-grained filtering list? That way you could also save unnecessary dict lookups.

Contributor Author

I have cut the lookups at least in half by dynamically determining from which callbacks we should fetch the stream info to access the metadata. Currently, as stated in the comments, the request metadata takes precedence over the connection metadata.
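
The precedence rule described here can be sketched roughly as follows. This is a standalone illustration, not the PR's code: `MetadataMap` and `collectForwardedMetadata` are hypothetical names, and a plain string map stands in for Envoy's `filter_metadata()`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a filter_metadata map: namespace -> serialized value.
using MetadataMap = std::map<std::string, std::string>;

// Collects the configured namespaces from two sources. A namespace present in
// the request-level map is taken from there; the connection-level map is
// consulted only as a fallback.
MetadataMap collectForwardedMetadata(const std::vector<std::string>& namespaces,
                                     const MetadataMap& request_metadata,
                                     const MetadataMap& connection_metadata) {
  MetadataMap forwarded;
  for (const auto& ns : namespaces) {
    auto req_it = request_metadata.find(ns);
    if (req_it != request_metadata.end()) {
      forwarded[ns] = req_it->second;
      continue;
    }
    auto conn_it = connection_metadata.find(ns);
    if (conn_it != connection_metadata.end()) {
      forwarded[ns] = conn_it->second;
    }
  }
  return forwarded;
}
```

Namespaces that appear in neither source are simply omitted, so only the configured namespaces are ever looked up.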

Contributor

@stevenzzzz stevenzzzz left a comment

Thanks for working on it!

*req.mutable_metadata_context() = metadata_context;
}

void setDynamicMetadata(std::string ns, Http::StreamFilterCallbacks* cb,
Contributor

This assumes we fully trust whatever is sent from the external server. Can we add knobs to control (disable, filter) what can be added into Envoy?

Contributor Author

I don't think this is necessary for a few reasons:

  1. The external processing filter is listed as unknown security posture, only to be used with both trusted upstreams and downstreams
  2. The external processing server has nearly arbitrary control over the request/response as it is, so I think it is safe to assume that it should be a trusted piece of code, at least as trusted as a custom Envoy filter which would be able to write dynamic metadata
  3. The returned metadata is only able to be written into dynamic metadata in the ext_proc filter's own namespace.

Member

  1. It's going to change soon. We have been vetting it, and @yanjunxiang-google is going to flip this to a trusted filter soon.
  2. Not true. If you look at all the knobs we have added recently, we are really trying to limit what an external server can and cannot do, so that those who want to use it against trusted systems can do so without knobs, while those who need to use it against untrusted systems can rely on the knobs to tightly control what can and cannot be done.
  3. Follows from 2 above in the case of untrusted servers.

Contributor Author

I've added the ability to disable writing the returned metadata. I think this is a good level of control for this initial implementation, but if more granularity is required for this work to merge, can you please point me to another similar implementation of the filtering?

Contributor

Thanks for adding the control knob. Yes, that's right. We are going to flip this filter to be robust to untrusted downstream/upstream soon.

@@ -158,7 +161,6 @@ message ProcessingResponse {
ImmediateResponse immediate_response = 7;
}

// [#not-implemented-hide:]
// Optional metadata that will be emitted as dynamic metadata to be consumed by the next
// filter. This metadata will be placed in the namespace ``envoy.filters.http.ext_proc``.
google.protobuf.Struct dynamic_metadata = 8;
Member

Could we please add a knob for disabling this on the Envoy side, just like the one for disabling immediate responses?

Given the growing list of knobs, I wonder if this is an opportunity to consolidate all these enable/disable toggles into a message called serverCapabilities that has header mutation rules, allow_immediate_response, allow_dynamic_metadata, mode overrides, etc. This capability proto should technically be advertised in the ProcessingRequest as well, so that the ext_proc server knows what is and is not allowed by the server.

Even if there is no appetite for this capability message, we really need a boolean guard for every new piece of data that we allow Envoy to accept from the ext_proc server. Some of our products use this filter to talk to untrusted services, and we would like to be strict about what our data plane ingests from a remote ext_proc server. Hence the toggles for immediate response, mode overrides, etc.
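
One possible shape for such a consolidated set of guards, as a minimal standalone sketch rather than anything in the Envoy API: `ServerCapabilities`, `ResponseFeature`, `isAllowed`, and the `allow_mode_override` field are hypothetical names, and defaulting every capability to off is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical consolidation of the per-feature toggles; all names are
// illustrative. Every capability defaults to off, so accepting anything from
// the external server is opt-in.
struct ServerCapabilities {
  bool allow_immediate_response = false;
  bool allow_dynamic_metadata = false;
  bool allow_mode_override = false;
};

// The kinds of data a processing response might carry.
enum class ResponseFeature { ImmediateResponse, DynamicMetadata, ModeOverride };

// Returns true only if the operator explicitly enabled the feature.
bool isAllowed(const ServerCapabilities& caps, ResponseFeature feature) {
  switch (feature) {
  case ResponseFeature::ImmediateResponse:
    return caps.allow_immediate_response;
  case ResponseFeature::DynamicMetadata:
    return caps.allow_dynamic_metadata;
  case ResponseFeature::ModeOverride:
    return caps.allow_mode_override;
  }
  return false;
}
```

A single check such as `isAllowed(caps, ResponseFeature::DynamicMetadata)` would then gate each piece of server-supplied data before the filter acts on it.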

Contributor Author

I am happy to add a setting that disables writing the returned metadata. I think that adding a capabilities message is outside the scope of this piece of work, though.

Member

Given that this is already live in our systems, could you please turn this into an "enable_foo" knob rather than a "disable_foo" one?

Contributor Author

Yes, I have changed it to opt-in.
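
A minimal sketch of what an opt-in guard of this kind might look like, assuming a simplified flat key/value model instead of `google.protobuf.Struct`. The names `Struct`, `DynamicMetadata`, and `applyReturnedMetadata` are hypothetical; only the namespace string and the `enable_returned_metadata` flag come from the proto comments quoted in this thread.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins: a flat key/value struct and a namespace-keyed store.
using Struct = std::map<std::string, std::string>;
using DynamicMetadata = std::map<std::string, Struct>;

// Namespace named in the ProcessingResponse proto comment.
const char kExtProcNamespace[] = "envoy.filters.http.ext_proc";

// Writes returned metadata only when the opt-in flag is set, and only under
// the ext_proc namespace. Returns true if anything was written.
bool applyReturnedMetadata(bool enable_returned_metadata, const Struct& returned,
                           DynamicMetadata& stream_metadata) {
  if (!enable_returned_metadata || returned.empty()) {
    return false;
  }
  Struct& target = stream_metadata[kExtProcNamespace];
  for (const auto& kv : returned) {
    target[kv.first] = kv.second;
  }
  return true;
}
```

Because the flag defaults to off in this sketch, an existing deployment that never sets it keeps ignoring server-supplied metadata, and other filters' namespaces are never touched.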

@jbohanon jbohanon force-pushed the ext-proc-metadata-attributes branch 4 times, most recently from 90c4b39 to e7fb495 Compare August 22, 2023 14:38
@jbohanon jbohanon marked this pull request as ready for review August 22, 2023 14:38
@jbohanon jbohanon requested a review from htuch as a code owner August 22, 2023 14:38
@jbohanon jbohanon force-pushed the ext-proc-metadata-attributes branch from e7fb495 to f0324c3 Compare August 22, 2023 14:59
@RyanTheOptimist
Contributor

/assign @htuch


// If set to true, metadata returned from the external processing server will
// be written into the stream's dynamic metadata.
bool enable_returned_metadata = 2;
Member

Why can't this be namespaces_to_forward, namespaces_to_accept?

Contributor Author

Is it desired to allow writing to arbitrary metadata namespaces?

Contributor Author

Is there a specific use-case you're thinking of? It seems to me that it is safer and more consistent to keep all of the returned metadata under the ext_proc namespace to prevent the external service from overwriting internal filters' data

@nrjpoddar

@rshriram long time!

Can we meet so we can move this forward and make sure we are not stepping on each other's toes? It looks like both of our teams are working in this area.

@phlax
Member

phlax commented Aug 29, 2023

seems this has some design decisions to work through

in the meantime @jbohanon the CI fails look real

/wait

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
…in the codebase

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
@jbohanon
Contributor Author

/retest

@jbohanon
Contributor Author

@tyxia @yanjunxiang-google can you please take a first pass on this?

@tyxia @yanjunxiang-google bumping this request.

@rshriram
Member

The API changes LGTM. Thanks for accommodating our requests.

@tyxia
Member

tyxia commented Sep 14, 2023

Apologies for the delay. I will take a look this week if that is OK.

I think introducing CEL is OK, but I think there is a better way of introducing it to ext_proc. I will comment with more details once I get the chance to take a close look at the full PR.

Member

@tyxia tyxia left a comment

I just looked at the matching part. I think we should introduce xds.type.matcher.v3.Matcher (i.e. the Unified Matcher API) to ext_proc rather than the CEL-only approach here, because:

  1. It will be more powerful and flexible: the matching API supports not only CEL but also other matchers such as stringMatcher. It also provides linear and sublinear matching. Thus, the matching mechanism can be customized by the ext_proc end user based on their use case.
  2. It avoids expensive operations like CEL parsing (and checking, if needed) in the data plane. The CEL team recommends doing this in the control plane.
  3. It will be a much cleaner implementation. I think the CEL-related functionality in this PR is already available in the shared cel_input and cel_match extensions. It can be reused in ext_proc with a few lines of code.

You can refer to match_delegate and rate_limit_quota for examples. They are http filters using CEL already.

Furthermore, I think the matching-related work is technically separate from the main objective of this PR: sending and receiving dynamic metadata and attributes. Therefore, I would recommend making this PR smaller and more self-contained by splitting the matching into a separate PR, if possible. This will also help the robustness and review of this PR, which helps move it forward.

@rshriram
Member

@jbohanon may I suggest that you split this PR into two parts? The first forwards/accepts metadata. This seems uncontroversial, and we have a use case for it internally as well. The second is the attribute forwarding, where @tyxia has higher-level questions on unified matcher vs. CEL.

@jbohanon
Contributor Author

@rshriram I am on parental leave for a couple weeks more but I will ping my team to check on our priority for this

@markdroth
Contributor

/lgtm api

@jbohanon
Contributor Author

jbohanon commented Nov 6, 2023

Per suggestion from @rshriram and others this PR will be closed in favor of smaller, self-contained PRs for the constituent features.

Dynamic Metadata: #30747
Matching: N/A
Request/Response Attributes: #30781

@jbohanon jbohanon closed this Nov 6, 2023
@jbohanon
Contributor Author

jbohanon commented Nov 8, 2023

@tyxia

> I just looked at the matching part. I think we should introduce xds.type.matcher.v3.Matcher (i.e. Unified Matcher API) to ext_proc rather than CEL-only approach here, because:

I don't agree that this would be an appropriate place to introduce the Unified Matcher API, primarily because we are not taking user-provided predicates and performing a disparate action based on the outcome. Rather, we are using CEL to look up the values represented by these attributes and send them to the processing server.

> 1. It will be more powerful and flexible: the matching API supports not only CEL but also other matchers such as stringMatcher. It also provides linear and sublinear matching. Thus, the matching mechanism can be customized by the ext_proc end user based on their use case.

Supporting alternate methods of matching is not relevant here, because attributes are implemented explicitly in CEL and nowhere else.

> 2. It avoids expensive operations like CEL parsing (and checking, if needed) in the data plane. The CEL team recommends doing this in the control plane.

The expression parsing as implemented in this PR is done at config-time rather than at request-time. If absolutely necessary, we could look into parsing on the control plane and passing a parsed expression in via config, but this seems like it would add unnecessary readability issues into the config provided to Envoy. I admit the added complexity of the USE_CEL_PARSER ifdef is less than desirable, but it is not without precedent in the project (see access_loggers and rate_limit_descriptors).

> 3. It will be a much cleaner implementation. I think the CEL-related functionality in this PR is already available in the shared [cel_input](https://github.com/envoyproxy/envoy/tree/main/source/extensions/matching/http/cel_input) and cel_match extensions. It can be reused in ext_proc with a few lines of code.

If we were to pass in parsed expressions, it is possible that we could use the cel_input's HttpCelDataInput::get method to create the activation, but that seems to be jumping through a lot of unnecessary hoops to get just the activation, which still needs to be evaluated against the parsed expression(s). cel_match is not useful to us because, again, it only outputs a boolean, whereas we need the value of the attribute itself.

> You can refer to match_delegate and rate_limit_quota for examples. They are http filters using CEL already.

Each of these filters uses CEL to achieve a different result than we need here. Again, the goal here is to access the actual data which the attribute represents, not to assess it against a pre-determined predicate.
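
To illustrate the distinction drawn above between attribute lookup and predicate matching, here is a minimal standalone sketch. It does not use CEL or any Envoy API: plain functions stand in for parsed expressions, and `RequestView`, `buildResolvers`, `resolveAttributes`, and `pathMatches` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-request state an expression is evaluated against.
struct RequestView {
  std::string path;
  std::string method;
};

// An attribute resolver returns the attribute's value, not a match verdict.
using AttributeResolver = std::function<std::string(const RequestView&)>;

// Built once at config time: keeps a resolver for each requested attribute
// name that is known, and silently skips unknown names.
std::map<std::string, AttributeResolver> buildResolvers(const std::vector<std::string>& names) {
  static const std::map<std::string, AttributeResolver> known = {
      {"request.path", [](const RequestView& r) { return r.path; }},
      {"request.method", [](const RequestView& r) { return r.method; }},
  };
  std::map<std::string, AttributeResolver> selected;
  for (const auto& name : names) {
    auto it = known.find(name);
    if (it != known.end()) {
      selected.insert(*it);
    }
  }
  return selected;
}

// Evaluated per request: produces the values to send to the processing server.
std::map<std::string, std::string> resolveAttributes(
    const std::map<std::string, AttributeResolver>& resolvers, const RequestView& request) {
  std::map<std::string, std::string> values;
  for (const auto& entry : resolvers) {
    values[entry.first] = entry.second(request);
  }
  return values;
}

// For contrast, a matcher-style predicate only yields a boolean.
bool pathMatches(const RequestView& request, const std::string& prefix) {
  return request.path.compare(0, prefix.size(), prefix) == 0;
}
```

The resolver table is built once and reused for every request, mirroring the config-time versus request-time split mentioned earlier in this comment, while `pathMatches` shows why a boolean-only result cannot carry the attribute value itself.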

jbohanon added a commit to jbohanon/envoy that referenced this pull request Nov 8, 2023
Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
jbohanon added a commit to jbohanon/envoy that referenced this pull request Nov 9, 2023
Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
@tyxia
Member

tyxia commented Nov 22, 2023

@jbohanon Sorry, I have been busy with other projects and haven't had a chance to really look into this PR. But a few things from a quick scan regarding your last comment:
---> "The expression parsing as implemented in this PR is done at config-time rather than at request-time."
I know it is config time, but it is data plane config time, which is still different from the control plane.

---> "this seems like it would add unnecessary readability issues into the config provided to Envoy"
There is no readability concern, as you can always pass the config with the simple original CEL string.

Back to the motivation for using CEL: are you using CEL only because attributes are implemented explicitly in CEL, and you use CEL for attribute lookup?

nfuden pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023
Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
ashishb-solo pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023
Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>
ashishb-solo added a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023

Successfully merging this pull request may close these issues.

Dynamic Metadata for External Processing Filters Similar to Dynamic Metadata for External Authorization