Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inbound: Include server labels in tap responses #1239

Merged
merged 51 commits into from
Sep 3, 2021
Merged

inbound: Include server labels in tap responses #1239

merged 51 commits into from
Sep 3, 2021

Conversation

olix0r
Copy link
Member

@olix0r olix0r commented Sep 3, 2021

This change modifies the inbound tapping behavior so that server &
authorization labels are included in tap metadata for HTTP requests.

We recently introduced metrics to help surface inbound policy decisions,
but in practice these haven't been as useful as we might hope.
Specifically, error metrics do not include the `target_addr` label so
these metrics can't be correlated with servers, etc. This change
improves error metrics and also introduces new metrics to describe
authorization decisions: authorization denials shouldn't be classified
as errors, really, anyway.

This change also improves TCP forwarding authorization so that policy
changes can be honored at runtime: previously authorized connections may
dropped if the policy is updated so that the connection is no longer
authorized.

The gateway is also updated to enforce HTTP policies at runtime as well
so that policy changes can be honored after the connection has been
established.

This change introduces new metrics:

* `inbound_http_authz_allow_total`
* `inbound_http_authz_deny_total`
* `inbound_tcp_authz_allow_total`
* `inbound_tcp_authz_deny_total`
* `inbound_tcp_authz_terminate_total`

_allow_ metrics include `target_addr`, `srv_name`, and `saz_name`
labels. _deny_ and _terminate_ metics include only `target_addr` and
`srv_name` labels.

Authorization denials are no longer reflected in inbound_tcp_error or
inbound_http_error metrics.

A number of internal changes have been made to support this:

* The `inbound::policy::authorize` module includes middlewares for TCP
  and HTTP authorization, replacing the prior method of enforcing policy
  in the stack/router. This module ensures that metrics are recorded for
  policy decisions.
* The `error-metrics` crate has been removed. In its place a `monitor`
  type has been added to the `stack` crate, supporting a general way to
  observe errors, decoupled from the metrics registry.
* Inbound and outbound error metrics are now tracked in the inbound and
  outbound crates, respectively. Inbound- and outbound-specific error
  types are also moved into their rspective crates.
* The `app_core::errors` module has been updated to only define the
  types it needs to instrument the error response layer. Error responses
  are now primarily instrumented via the `HttpError` type so that errors
  that should be handled can be configured where the error is thrown.
  The error type now holds an underlying source error so that the error
  metrics layer can see through this wrapper type to track the
  underlying error cause.
* Server & Authorization labels are no longer handled as a free-form
  maps. We currently read only the `name` label from each; and this
  label is required.
This change modifies the inbound tapping behavior so that server &
authorization labels are included in tap metadata for HTTP requests.
@olix0r olix0r requested a review from a team September 3, 2021 00:21
Base automatically changed from ver/monitor to main September 3, 2021 02:26
@olix0r olix0r merged commit b5cfbf4 into main Sep 3, 2021
@olix0r olix0r deleted the ver/tap-labels branch September 3, 2021 02:44
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Sep 3, 2021
This release features some performance improvements: tokio has been
updated to pick up tokio-rs/tokio#4055, and link-time optimizations have
been enabled in release builds. These changes reduce CPU and memory
overhead in benchmarks.

Inbound policy enforcement has been updated so that TCP forwarding is
interrupted if a policy update revokes a previously-established
authorization. New metrics are exposed to reflect how policies are used
by the proxy: `inbound_http_authz_{allow,deny}_total` and
`inbound_tcp_authz_{allow,deny,terminate}_total`.

The proxy's error metrics, `{inbound,outbound}_{http,tcp}_errors_total`,
have been updated to include the traffic target. And the `traffic_addr`
metric label is augmented by `target_ip` and `target_port` labels to
support more flexible prometheus queries.

Inbound TCP metrics now only include a `srv_name` label, as it can't be
expected for all inbound connections to include authorization labels
(hence the new authz metrics). However, all inbound HTTP metrics--except
for the HTTP errors metric, which includes only a `srv_name`
label--include both `srv_name` and `saz_name` label.

Finally, the inbound and outbound proxies now only exports
Route-oriented metrics when a ServiceProfile is enabled, preventing
redundant metrics from being exported with no differentiating labels.

---

* profiles: Avoid creating a default route stack (linkerd/linkerd2-proxy#1223)
* build(deps): bump arbitrary from 1.0.1 to 1.0.2 (linkerd/linkerd2-proxy#1224)
* build(deps): bump trust-dns-resolver from `f08860c` to `3d0667a` (linkerd/linkerd2-proxy#1225)
* build(deps): bump libc from 0.2.100 to 0.2.101 (linkerd/linkerd2-proxy#1226)
* Enable link-time optimizations (linkerd/linkerd2-proxy#1227)
* build(deps): bump serde_json from 1.0.66 to 1.0.67 (linkerd/linkerd2-proxy#1228)
* build(deps): bump flate2 from 1.0.20 to 1.0.21 (linkerd/linkerd2-proxy#1230)
* build(deps): bump thiserror from 1.0.26 to 1.0.28 (linkerd/linkerd2-proxy#1231)
* build(deps): bump futures from 0.3.16 to 0.3.17 (linkerd/linkerd2-proxy#1232)
* build(deps): bump parking_lot from 0.11.1 to 0.11.2 (linkerd/linkerd2-proxy#1234)
* build(deps): bump trust-dns-resolver from `3d0667a` to `v0.21.0-alpha.2` (linkerd/linkerd2-proxy#1233)
* Rename push_on_response to push_on_service (linkerd/linkerd2-proxy#1235)
* build(deps): bump tokio from 1.10.1 to 1.11.0 (linkerd/linkerd2-proxy#1236)
* metrics: Add `target_ip` and `target_port` labels (linkerd/linkerd2-proxy#1238)
* inbound: Improve policy metrics (linkerd/linkerd2-proxy#1237)
* inbound: Include server labels in tap responses (linkerd/linkerd2-proxy#1239)
* Revert rustc update for release builds
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Sep 3, 2021
This release features some performance improvements: tokio has been
updated to pick up tokio-rs/tokio#4055, and link-time optimizations have
been enabled in release builds. These changes reduce CPU and memory
overhead in benchmarks.

Inbound policy enforcement has been updated so that TCP forwarding is
interrupted if a policy update revokes a previously-established
authorization. New metrics are exposed to reflect how policies are used
by the proxy: `inbound_http_authz_{allow,deny}_total` and
`inbound_tcp_authz_{allow,deny,terminate}_total`.

The proxy's error metrics, `{inbound,outbound}_{http,tcp}_errors_total`,
have been updated to include the traffic target. And the `traffic_addr`
metric label is augmented by `target_ip` and `target_port` labels to
support more flexible prometheus queries.

Inbound TCP metrics now only include a `srv_name` label, as it can't be
expected for all inbound connections to include authorization labels
(hence the new authz metrics). However, all inbound HTTP metrics--except
for the HTTP errors metric, which includes only a `srv_name`
label--include both `srv_name` and `saz_name` label.

Finally, the inbound and outbound proxies now only exports
Route-oriented metrics when a ServiceProfile is enabled, preventing
redundant metrics from being exported with no differentiating labels.

---

* profiles: Avoid creating a default route stack (linkerd/linkerd2-proxy#1223)
* build(deps): bump arbitrary from 1.0.1 to 1.0.2 (linkerd/linkerd2-proxy#1224)
* build(deps): bump trust-dns-resolver from `f08860c` to `3d0667a` (linkerd/linkerd2-proxy#1225)
* build(deps): bump libc from 0.2.100 to 0.2.101 (linkerd/linkerd2-proxy#1226)
* Enable link-time optimizations (linkerd/linkerd2-proxy#1227)
* build(deps): bump serde_json from 1.0.66 to 1.0.67 (linkerd/linkerd2-proxy#1228)
* build(deps): bump flate2 from 1.0.20 to 1.0.21 (linkerd/linkerd2-proxy#1230)
* build(deps): bump thiserror from 1.0.26 to 1.0.28 (linkerd/linkerd2-proxy#1231)
* build(deps): bump futures from 0.3.16 to 0.3.17 (linkerd/linkerd2-proxy#1232)
* build(deps): bump parking_lot from 0.11.1 to 0.11.2 (linkerd/linkerd2-proxy#1234)
* build(deps): bump trust-dns-resolver from `3d0667a` to `v0.21.0-alpha.2` (linkerd/linkerd2-proxy#1233)
* Rename push_on_response to push_on_service (linkerd/linkerd2-proxy#1235)
* build(deps): bump tokio from 1.10.1 to 1.11.0 (linkerd/linkerd2-proxy#1236)
* metrics: Add `target_ip` and `target_port` labels (linkerd/linkerd2-proxy#1238)
* inbound: Improve policy metrics (linkerd/linkerd2-proxy#1237)
* inbound: Include server labels in tap responses (linkerd/linkerd2-proxy#1239)
* Revert rustc update for release builds
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants