ext_proc: metadata and attributes #29069

jbohanon · 2023-08-16T17:28:33Z

Commit Message:
Introduce the ability to send dynamic metadata and attributes in the External Processing Request. Also implement the API for returning dynamic metadata as part of the External Processing Response.
Additional Description:
Risk Level:
Low
Testing:

Unit tests for processing request/response dynamic metadata
Integration tests for request/response attributes

Docs Changes: TODO
Release Notes: N/A
Platform Specific Features: N/A
Fixes #19881

repokitteh-read-only · 2023-08-16T17:28:36Z

Hi @jbohanon, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #29069 was opened by jbohanon.

see: more, trace.

repokitteh-read-only · 2023-08-16T17:28:42Z

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #29069 was opened by jbohanon.

see: more, trace.

tyxia · 2023-08-17T19:35:40Z

/assign @tyxia

rshriram · 2023-08-17T19:36:02Z

api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto

+  //
+  // It works in a way similar to ``metadata_context_namespaces`` but allows envoy and external processing server to share the protobuf message definition
+  // in order to do a safe parsing.
+  repeated string typed_metadata_context_namespaces = 17;


Updated comment after actually reading the comments.

We could collapse both into a single field, couldn't we?
metadata_namespaces_to_forward:
  untyped:
  - namespaces
  typed:
  - namespaces

I've updated the proto changes to enable route-level overrides and in the process cleaned up this naming a bit.

stevenzzzz · 2023-08-17T20:27:02Z

source/extensions/filters/http/ext_proc/ext_proc.cc

+      encoder_callbacks_->connection()->streamInfo().dynamicMetadata().filter_metadata();
+  const auto& request_encoder_metadata =
+      encoder_callbacks_->streamInfo().dynamicMetadata().filter_metadata();
+  for (const auto& context_key : config_->metadataContextNamespaces()) {


So we have four sources for {typed, }metadata.

What if the context_key is found in multiple sources?

Can we consider having some sort of finer-grained filtering list? That way you could also save unnecessary dict lookups.

I have cut the lookups at least in half by dynamically determining from which callbacks we should be fetching the stream info to access the metadata. Currently, as stated in the comments, the request metadata takes precedence over the connection metadata.
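The precedence rule described in this reply can be sketched roughly as follows. This is a standalone illustration, not the code from the PR: `MetadataMap`, `collectNamespaces`, and `demoPrecedence` are hypothetical names, and a plain `std::map` of strings stands in for the protobuf Struct values the filter actually handles.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a filter_metadata map (namespace -> value).
// The real filter stores protobuf Struct values, not strings.
using MetadataMap = std::map<std::string, std::string>;

// Gather the configured namespaces, letting the request (stream) entry win
// when a namespace is present in both the request and connection metadata.
MetadataMap collectNamespaces(const std::vector<std::string>& namespaces,
                              const MetadataMap& connection_metadata,
                              const MetadataMap& request_metadata) {
  MetadataMap out;
  for (const std::string& ns : namespaces) {
    const auto req_it = request_metadata.find(ns);
    if (req_it != request_metadata.end()) {
      out[ns] = req_it->second; // Request metadata takes precedence.
      continue;
    }
    const auto conn_it = connection_metadata.find(ns);
    if (conn_it != connection_metadata.end()) {
      out[ns] = conn_it->second; // Fall back to connection metadata.
    }
    // A namespace found in neither source is omitted from the result.
  }
  return out;
}

// Small usage check: "a" exists in both sources, "b" only on the connection.
void demoPrecedence() {
  const MetadataMap connection = {{"a", "from-connection"}, {"b", "from-connection"}};
  const MetadataMap request = {{"a", "from-request"}};
  const MetadataMap out = collectNamespaces({"a", "b", "c"}, connection, request);
  assert(out.at("a") == "from-request");
  assert(out.at("b") == "from-connection");
  assert(out.count("c") == 0);
}
```

Checking the request source first and stopping on a hit means each configured namespace costs at most two lookups, which is one way to read the "cut the lookups at least in half" remark.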

stevenzzzz

Thanks for working on it!

stevenzzzz · 2023-08-17T20:34:15Z

source/extensions/filters/http/ext_proc/ext_proc.cc

+  *req.mutable_metadata_context() = metadata_context;
+}
+
+void setDynamicMetadata(std::string ns, Http::StreamFilterCallbacks* cb,


This assumes we fully trust whatever is sent from the external server. Can we add knobs to control (disable, filter) what could be added into Envoy?

I don't think this is necessary for a few reasons:

1. The external processing filter is listed as having an unknown security posture, only to be used with both trusted upstreams and downstreams.

2. The external processing server has nearly arbitrary control over the request/response as it is, so I think it is safe to assume that it should be a trusted piece of code, at least as trusted as a custom Envoy filter, which would be able to write dynamic metadata.

3. The returned metadata can only be written into dynamic metadata in the ext_proc filter's own namespace.

It's going to change soon. We have been vetting it, and @yanjunxiang-google is going to flip this to a trusted filter soon.

Not true. If you look at all the knobs we have added recently, we are really trying to limit what an external server can and cannot do, so that those who want to use it against trusted systems can do so without knobs, while those who need to use it against untrusted systems can rely on the knobs to tightly control what can and cannot be done.

Follows from 2 above in the case of untrusted servers.

I've added the ability to disable writing the returned metadata. I think this is a good level of control for this initial implementation, but if more granularity is required for this work to merge, can you please point me to another similar implementation of the filtering?

Thanks for adding the control knob. Yes, that's right. We are going to flip this filter to be robust to untrusted downstream/upstream soon.

rshriram · 2023-08-18T13:02:21Z

api/envoy/service/ext_proc/v3/external_processor.proto

@@ -158,7 +161,6 @@ message ProcessingResponse {
    ImmediateResponse immediate_response = 7;
  }

-  // [#not-implemented-hide:]
  // Optional metadata that will be emitted as dynamic metadata to be consumed by the next
  // filter. This metadata will be placed in the namespace ``envoy.filters.http.ext_proc``.
  google.protobuf.Struct dynamic_metadata = 8;


Could we please add a knob for disabling this on the Envoy side, just like disabling immediate response, etc.?

Given the growing list of knobs, I wonder if this is an opportunity to consolidate all these enable/disable toggles into a message called serverCapabilities that has header mutation rules, allow_immediate_response, allow_dynamic_metadata, mode overrides, etc. And this capability proto should technically be advertised in the ProcessingRequest as well so that the ext_proc server knows what is and what is not allowed by the server.

Even if there is no appetite to do this capability thingy, we really need a boolean guard for every new piece of data that we will allow Envoy to accept from the Ext proc server. Some of our products use this filter to talk to untrusted services and we would like to be strict about what things our data plane ingests from a remote ext_proc server. Hence the reason for toggles like immediate response, mode overrides, etc.

I am happy to add a setting to allow disabling the writing of returned metadata. I think that adding a capabilities message is outside the scope of this piece of work, though.

Given that this is already live in our systems, could you please turn this into an "enable_foo" knob rather than "disable_foo"?

Yes, I have changed it to opt-in.

RyanTheOptimist · 2023-08-23T13:51:59Z

/assign @htuch

rshriram · 2023-08-23T19:10:36Z

api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto

+
+  // If set to true, metadata returned from the external processing server will
+  // be written into the stream's dynamic metadata.
+  bool enable_returned_metadata = 2;


Why can't this be namespaces_to_forward, namespaces_to_accept?

Is it desired to allow writing to arbitrary metadata namespaces?

Is there a specific use case you're thinking of? It seems to me that it is safer and more consistent to keep all of the returned metadata under the ext_proc namespace to prevent the external service from overwriting internal filters' data.
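The opt-in behavior discussed in this thread can be sketched as below. It is a minimal standalone illustration rather than the PR's implementation: `Struct`, `DynamicMetadata`, `applyReturnedMetadata`, and `demoOptIn` are hypothetical names, string maps stand in for protobuf Structs, and overwriting existing keys on repeated responses is an assumption of this sketch. The flag name `enable_returned_metadata` and the namespace `envoy.filters.http.ext_proc` are taken from the proto snippets quoted in this review.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; the real filter stores protobuf Struct values.
using Struct = std::map<std::string, std::string>;
using DynamicMetadata = std::map<std::string, Struct>;

// Namespace named in the ProcessingResponse proto comment.
const char kExtProcNamespace[] = "envoy.filters.http.ext_proc";

// Write metadata returned by the external processor only when the operator
// has opted in. Returns true if anything was written.
bool applyReturnedMetadata(bool enable_returned_metadata, const Struct& returned,
                           DynamicMetadata& stream_metadata) {
  if (!enable_returned_metadata || returned.empty()) {
    return false; // With the flag unset, whatever the server sent is ignored.
  }
  // Everything lands under the filter's own namespace, never an arbitrary one.
  Struct& target = stream_metadata[kExtProcNamespace];
  for (const auto& kv : returned) {
    target[kv.first] = kv.second; // Assumption: later values overwrite earlier keys.
  }
  return true;
}

// Small usage check covering the disabled and enabled cases.
void demoOptIn() {
  const Struct returned = {{"verdict", "allow"}};
  DynamicMetadata stream_metadata;
  assert(!applyReturnedMetadata(false, returned, stream_metadata));
  assert(stream_metadata.empty());
  assert(applyReturnedMetadata(true, returned, stream_metadata));
  assert(stream_metadata.at(kExtProcNamespace).at("verdict") == "allow");
}
```

Because a proto3 bool defaults to false, an "enable" flag leaves existing deployments unchanged unless they explicitly turn the feature on, which matches the request to make the knob opt-in.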

nrjpoddar · 2023-08-25T13:22:21Z

@rshriram long time!

Can we meet so we can move this forward and make sure we are not stepping on each other's toes? It looks like both of our teams are working in this area.

phlax · 2023-08-29T10:14:11Z

seems this has some design decisions to work through

in the meantime @jbohanon the CI fails look real

/wait

jbohanon · 2023-09-13T18:27:24Z

/retest

jbohanon · 2023-09-13T19:42:42Z

@tyxia @yanjunxiang-google can you please take a first pass on this?

@tyxia @yanjunxiang-google bumping this request.

rshriram · 2023-09-13T21:02:10Z

The API changes LGTM. Thanks for accommodating our requests.

tyxia · 2023-09-14T02:52:18Z

Apologies for the delay. I will take a look this week, if that is OK.

I think introducing CEL is OK, but I think there is a better way of introducing it to ext_proc. I will comment with more details once I get the chance to take a close look at the full PR.

tyxia

I just looked at the matching part. I think we should introduce xds.type.matcher.v3.Matcher (i.e. the Unified Matcher API) to ext_proc rather than the CEL-only approach here, because:

1. It will be more powerful and flexible: the matching API supports not only CEL but also various other matchers such as stringMatcher. It also provides linear and sublinear matching. Thus, the matching mechanism can be customized by the ext_proc end user based on their use case.
2. It avoids expensive operations like CEL parsing (and checking, if needed) in the data plane. The CEL team recommends doing this in the control plane.
3. It will be a much cleaner implementation. I think the CEL-related functionality in this PR is already available in the shared cel_input and cel_match extensions. It can be reused in ext_proc with several lines of code.

You can refer to match_delegate and rate_limit_quota for examples. They are HTTP filters that already use CEL.

Furthermore, I think the matching-related work is technically separate from the main objective of this PR: sending and receiving dynamic metadata and attributes. Therefore, I would recommend making this PR smaller and more self-contained by splitting the matching into a separate PR, if possible. This will also help the robustness and review of this PR, which will help move it forward.

rshriram · 2023-10-24T13:07:50Z

@jbohanon May I suggest that you split this PR into two parts? The first forwards/accepts metadata. This seems uncontroversial, and we have a use case for it internally as well. The second is the attribute forwarding, where @tyxia has higher-level questions about the unified matcher vs. CEL.

jbohanon · 2023-10-24T13:37:10Z

@rshriram I am on parental leave for a couple more weeks, but I will ping my team to check on our priority for this.

markdroth · 2023-10-25T17:55:29Z

/lgtm api

jbohanon · 2023-11-06T13:42:33Z

Per suggestion from @rshriram and others this PR will be closed in favor of smaller, self-contained PRs for the constituent features.

Dynamic Metadata: #30747
Matching: N/A
Request/Response Attributes: #30781

jbohanon · 2023-11-08T15:29:00Z

@tyxia

"I just looked at the matching part. I think we should introduce xds.type.matcher.v3.Matcher (i.e. the Unified Matcher API) to ext_proc rather than the CEL-only approach here, because:"

I don't agree that this would be an appropriate place to introduce the Unified Matcher API, primarily because we are not taking user-provided predicates and performing a disparate action based on the outcome. Rather, we are utilizing CEL to look up the values represented by these attributes and send them to the processing server.

"It will be more powerful and flexible: the matching API supports not only CEL but also various other matchers such as stringMatcher. It also provides linear and sublinear matching. Thus, the matching mechanism can be customized by the ext_proc end user based on their use case."

It is not relevant to support alternate methods of matching, because Attributes are implemented explicitly in CEL and not anywhere else.

"It avoids expensive operations like CEL parsing (and checking, if needed) in the data plane. The CEL team recommends doing this in the control plane."

The expression parsing as implemented in this PR is done at config-time rather than at request-time. If absolutely necessary, we could look into parsing on the control plane and passing a parsed expression in via config, but this seems like it would add unnecessary readability issues into the config provided to Envoy. I admit the added complexity of the USE_CEL_PARSER ifdef is less than desirable, but it is not without precedent in the project (see access_loggers and rate_limit_descriptors).

"It will be a much cleaner implementation. I think the CEL-related functionality in this PR is already available in the shared [cel_input](https://github.com/envoyproxy/envoy/tree/main/source/extensions/matching/http/cel_input) and cel_match extensions. It can be reused in ext_proc with several lines of code."

If we were to pass in parsed expressions, it is possible that we could use the cel_input's HttpCelDataInput::get method to create the activation, but that seems to be jumping through a lot of unnecessary hoops to get just the activation, which still needs to be evaluated against the parsed expression(s). cel_match is not useful to us because, again, it only outputs a boolean, whereas we need the value of the attribute itself.

"You can refer to match_delegate and rate_limit_quota for examples. They are HTTP filters that already use CEL."

Each of these filters is using CEL to achieve a different result than we need here. Again, the goal here is to access the actual data which the attribute represents, not to assess it against a pre-determined predicate.
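The distinction drawn in this reply, between evaluating a predicate and retrieving a value, can be illustrated with a small standalone sketch. No CEL is involved here: a plain map lookup stands in for evaluating an attribute expression, and `Attributes`, `matchesAttribute`, `lookupAttributes`, and `demoLookupVersusMatch` are hypothetical names. Omitting attributes that have no value is also an assumption of this sketch, not a statement about the PR.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for evaluated attributes (name -> value). In the PR the
// values come from evaluating CEL expressions; a map lookup replaces that here.
using Attributes = std::map<std::string, std::string>;

// Predicate-style matching: the caller learns only whether the value matched.
bool matchesAttribute(const Attributes& available, const std::string& name,
                      const std::string& expected) {
  const auto it = available.find(name);
  return it != available.end() && it->second == expected;
}

// Attribute forwarding: the caller needs the values themselves so they can be
// copied into the request sent to the external processing server.
Attributes lookupAttributes(const Attributes& available,
                            const std::vector<std::string>& requested) {
  Attributes out;
  for (const std::string& name : requested) {
    const auto it = available.find(name);
    if (it != available.end()) {
      out[name] = it->second; // Assumption: attributes without a value are skipped.
    }
  }
  return out;
}

// Small usage check contrasting the two shapes of result.
void demoLookupVersusMatch() {
  const Attributes available = {{"request.path", "/v1/items"}, {"request.method", "GET"}};
  assert(matchesAttribute(available, "request.method", "GET"));
  const Attributes sent = lookupAttributes(available, {"request.path", "missing.attribute"});
  assert(sent.size() == 1);
  assert(sent.at("request.path") == "/v1/items");
}
```

A matcher collapses each attribute to a yes/no outcome, while the forwarding path has to carry the values through unchanged, which is the crux of the objection above.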

tyxia · 2023-11-22T17:01:41Z

@jbohanon Sorry, I have been busy with other projects and haven't had a chance to really look into this PR. But a few things from a quick scan regarding your last comment:
--->"The expression parsing as implemented in this PR is done at config-time rather than at request-time."
I know it is config time, but it is data plane config time, which is still different from the control plane.

--->"this seems like it would add unnecessary readability issues into the config provided to Envoy"
There is no readability concern, as you can always pass the config with the simple original CEL string.

Back to the motivation for using CEL: are you using CEL only because Attributes are implemented explicitly in CEL, and you use CEL for attribute lookup?

jbohanon requested a review from mattklein123 as a code owner August 16, 2023 17:28

repokitteh-read-only bot added the api label Aug 16, 2023

repokitteh-read-only bot assigned markdroth Aug 16, 2023

jbohanon marked this pull request as draft August 16, 2023 17:29

repokitteh-read-only bot assigned tyxia Aug 17, 2023

rshriram reviewed Aug 17, 2023

stevenzzzz reviewed Aug 17, 2023

rshriram reviewed Aug 18, 2023

jbohanon force-pushed the ext-proc-metadata-attributes branch 4 times, most recently from 90c4b39 to e7fb495 Compare August 22, 2023 14:38

jbohanon marked this pull request as ready for review August 22, 2023 14:38

jbohanon requested a review from htuch as a code owner August 22, 2023 14:38

jbohanon requested review from stevenzzzz and rshriram August 22, 2023 14:46

jbohanon force-pushed the ext-proc-metadata-attributes branch from e7fb495 to f0324c3 Compare August 22, 2023 14:59

repokitteh-read-only bot assigned htuch Aug 23, 2023

jbohanon force-pushed the ext-proc-metadata-attributes branch from 3123bf4 to 1df3763 Compare August 23, 2023 15:03

rshriram reviewed Aug 23, 2023

repokitteh-read-only bot added the waiting label Aug 29, 2023

jbohanon force-pushed the ext-proc-metadata-attributes branch from 69385a5 to f17e7d7 Compare August 29, 2023 21:32

repokitteh-read-only bot removed the waiting label Aug 29, 2023

jbohanon added 3 commits September 13, 2023 15:08

test enhancements

7e05d8f

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

don't use CEL parser on windows; mirror other windows CEL exclusions …

3090fa2

…in the codebase Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

allow specifying returned metadata namespaces to write

a3eeec6

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

jbohanon force-pushed the ext-proc-metadata-attributes branch from 5899892 to a3eeec6 Compare September 13, 2023 15:15

jbohanon requested a review from yanavlasov as a code owner September 13, 2023 15:15

fix state construction call

6a00a96

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

tyxia reviewed Sep 18, 2023

ravenblackx added the waiting:any label Sep 26, 2023

repokitteh-read-only bot removed the waiting:any label Oct 24, 2023

repokitteh-read-only bot removed the api label Oct 25, 2023

jbohanon mentioned this pull request Nov 6, 2023

ext_proc: send and receive dynamic metadata #30747

Merged

jbohanon closed this Nov 6, 2023

jbohanon mentioned this pull request Nov 8, 2023

ext_proc: send attributes #30781

Merged

jbohanon added a commit to jbohanon/envoy that referenced this pull request Nov 8, 2023

extract attributes changes from envoyproxy#29069

87cf9ba

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

jbohanon added a commit to jbohanon/envoy that referenced this pull request Nov 9, 2023

extract metadata changes from envoyproxy#29069

8685df9

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

nfuden pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023

extract metadata changes from envoyproxy#29069

369bd16

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

ashishb-solo pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023

extract metadata changes from envoyproxy#29069

3d0c7f6

Signed-off-by: Jacob Bohanon <jacob.bohanon@solo.io>

ashishb-solo added a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023

Merge pull request #12 from solo-io/extproc-1.27-from-upstreammetadat…

dcf8bfb

…a-unmerged extract metadata changes from envoyproxy#29069
