Design proposal: redacting sensitive user data from debug logs #27820

tsaarni · 2023-06-06T12:07:25Z

This issue is a design proposal for redacting sensitive user data that might be leaked via application logs when debug level logs are enabled in production.

Updates: #9652 and #27579

Goals

Troubleshooting may require collecting, distributing and storing debug-level application logs, even in production. Currently, HTTP headers are printed to the debug log when --log-level debug is enabled, which risks unintentionally leaking sensitive user information.

This proposal discusses a mechanism that users can (optionally) use to redact sensitive HTTP header information from application logs.

Design

Sensitive information shall be masked by substituting [redacted] for potentially sensitive values.

It shall be possible to redact values of common headers known to contain sensitive information, while preserving non-sensitive parts of header values. It shall also be possible to redact custom application specific headers that are used to pass sensitive information.

See "Example debug log entries" section for more details.

Alternative: Configurable list of headers to redact

In this alternative, redaction of HTTP headers is configurable, but no new extension point for adding custom "redactors" is introduced. The reasoning is as follows:

Only a couple of well-known headers require dedicated redaction logic. This logic can be predefined.
It is considered unlikely that new redactors will be added beyond the predefined ones. Any application-specific headers are redacted by removing the header value completely, which can be done with the generic redactor.

The following predefined redactors shall be implemented:

1. The query string redactor leaves the request path untouched and redacts the query string: /request-path?[redacted]
2. The cookie header redactor leaves the cookie names but redacts the values: key1=[redacted]; key2=[redacted]
3. The set-cookie header redactor leaves the cookie name and attributes but redacts the value: key=[redacted]; Path=/; Domain=example.com
4. The generic header value redactor removes the value but leaves the header name as is.
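The four redactors listed above could be sketched roughly as follows. This is only an illustration of the intended behavior, not Envoy code: all function names (redactQueryString, redactCookie, redactSetCookie, redactValue) are hypothetical, the parsing is deliberately simplistic, and masking a cookie piece that has no '=' entirely is an assumption made here.

```cpp
#include <string>
#include <string_view>

// Marker used in place of sensitive values, as described in the proposal.
constexpr std::string_view kRedacted = "[redacted]";

// Removes leading and trailing spaces and tabs.
std::string_view trim(std::string_view s) {
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string_view::npos) {
    return {};
  }
  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Keeps "name=" and masks the value. A piece without '=' is masked entirely,
// because it cannot be told apart from a bare value (assumption).
std::string redactPair(std::string_view pair) {
  const auto eq = pair.find('=');
  if (eq == std::string_view::npos) {
    return std::string(kRedacted);
  }
  std::string out(pair.substr(0, eq + 1));
  out += kRedacted;
  return out;
}

// (1) Query string redactor: keeps the path, masks everything after '?'.
std::string redactQueryString(std::string_view path) {
  const auto pos = path.find('?');
  if (pos == std::string_view::npos) {
    return std::string(path);
  }
  std::string out(path.substr(0, pos + 1));
  out += kRedacted;
  return out;
}

// (2) cookie redactor: keeps every cookie name, masks every value.
std::string redactCookie(std::string_view value) {
  std::string out;
  std::string_view::size_type start = 0;
  while (true) {
    const auto end = value.find(';', start);
    const auto len = (end == std::string_view::npos) ? end : end - start;
    const auto piece = trim(value.substr(start, len));
    if (!piece.empty()) {
      if (!out.empty()) {
        out += "; ";
      }
      out += redactPair(piece);
    }
    if (end == std::string_view::npos) {
      break;
    }
    start = end + 1;
  }
  return out;
}

// (3) set-cookie redactor: masks the cookie value, keeps attributes verbatim.
std::string redactSetCookie(std::string_view value) {
  const auto semi = value.find(';');
  std::string out = redactPair(trim(value.substr(0, semi)));
  if (semi != std::string_view::npos) {
    out += value.substr(semi);
  }
  return out;
}

// (4) Generic redactor: drops the whole value.
std::string redactValue(std::string_view /*value*/) {
  return std::string(kRedacted);
}
```

With these definitions, redactCookie("sessionid=secret; another=secret") yields sessionid=[redacted]; another=[redacted], and redactQueryString("/foo?secret") yields /foo?[redacted].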

Bootstrap config is used to select which headers are redacted:

application_log_config:
  redact_headers:
    - cookie
    - set-cookie
    - authorization
    - x-vault-token

Assuming that the user wants to redact all sensitive information, the first three headers are highly likely candidates in any deployment. The user also needs to identify any application-specific headers used by the upstream applications, such as x-vault-token in the example above. The generic header value redactor (4) is used for headers that have no specific redactor logic associated with them.
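The selection logic described above might look roughly like this. It is a sketch under assumptions: the HeaderRedaction class and the Redactor alias are hypothetical names, header names are assumed to be already lowercased, and the header-specific redactors are passed in as plain callables rather than being built in.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical callable type: takes a header value, returns the redacted value.
using Redactor = std::function<std::string(const std::string&)>;

class HeaderRedaction {
public:
  // `configured` mirrors the redact_headers list from the bootstrap config.
  // `specific` holds the predefined redactors keyed by header name,
  // for example "cookie" and "set-cookie".
  HeaderRedaction(std::set<std::string> configured, std::map<std::string, Redactor> specific)
      : configured_(std::move(configured)), specific_(std::move(specific)) {}

  // Returns the value to print for the given (lowercase) header name.
  std::string apply(const std::string& name, const std::string& value) const {
    if (configured_.count(name) == 0) {
      return value; // Not listed in redact_headers: print unchanged.
    }
    const auto it = specific_.find(name);
    if (it != specific_.end()) {
      return it->second(value); // Header-specific redactor.
    }
    return "[redacted]"; // Generic header value redactor.
  }

private:
  std::set<std::string> configured_;
  std::map<std::string, Redactor> specific_;
};
```

A header such as authorization, which is listed but has no dedicated redactor, falls through to the generic case, while a header that is not listed at all is printed unchanged.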

Alternative: Add new extension point for printing redacted headers into debug log

In this alternative, it is assumed that new redactors will be added over time. A new extension point shall be introduced to format header values (HeaderEntry instances in HeaderMap), before printing them into the debug log.

The HeaderEntryFormatter interface shall look like:

class HeaderEntryFormatter {
public:
  // Virtual destructor so that implementations can be destroyed via the interface.
  virtual ~HeaderEntryFormatter() = default;

  /**
   * Prints the header entry to the output stream.
   * The formatter can redact sensitive information before output.
   * @param os the output stream to print to.
   * @param header the header entry to print.
   */
  virtual void format(std::ostream& os, const Http::HeaderEntry& header) const PURE;
};
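To make the interface more concrete, here is a minimal, self-contained sketch of what a generic value-redacting formatter could look like. It does not use Envoy types: HeaderEntry below is a simplified stand-in for Http::HeaderEntry, PURE is written out as = 0, and RedactValueFormatter and formatToString are hypothetical names. The output format follows the 'name', 'value' style of the existing debug log.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-in for Http::HeaderEntry (assumption, not the Envoy type).
struct HeaderEntry {
  std::string key;
  std::string value;
};

// Same shape as the proposed interface, using the stand-in type above.
class HeaderEntryFormatter {
public:
  virtual ~HeaderEntryFormatter() = default;
  virtual void format(std::ostream& os, const HeaderEntry& header) const = 0;
};

// Hypothetical generic formatter: prints the header name, masks the value.
class RedactValueFormatter : public HeaderEntryFormatter {
public:
  void format(std::ostream& os, const HeaderEntry& header) const override {
    os << "'" << header.key << "', '[redacted]'";
  }
};

// Helper that captures the formatter output as a string.
std::string formatToString(const HeaderEntryFormatter& formatter, const HeaderEntry& header) {
  std::ostringstream os;
  formatter.format(os, header);
  return os.str();
}
```

Because the formatter writes directly to the stream, an implementation never has to build an unredacted copy of the value before printing it.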

The configuration shall look like:

application_log_config:
  redact_headers:
    - name: envoy.header_entry_formatter.redact_cookies
      header_names:
        - cookie
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.header_entry_formatter.redact_cookies.v3.RedactCookies
    - name: envoy.header_entry_formatter.redact_set_cookie
      header_names:
        - set-cookie
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.header_entry_formatter.redact_set_cookie.v3.RedactSetCookie
    - name: envoy.header_entry_formatter.redact_value
      header_names:
        - authorization
        - x-vault-token
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.header_entry_formatter.redact_value.v3.RedactValue

Each entry in the list defines header_names and the corresponding formatter to apply to those headers. The formatter receives its configuration as typed config; the config messages in this example are empty. If no entry matches a header present in the HTTP request or response, the header is printed as is.
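The lookup described above, where each configured entry maps several header names to one formatter and unmatched headers are printed unchanged, could be sketched as follows. As before, this is an assumption-laden illustration: HeaderEntry is a simplified stand-in for Http::HeaderEntry, and FormatterTable, addEntry and print are hypothetical names.

```cpp
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for Http::HeaderEntry (assumption, not the Envoy type).
struct HeaderEntry {
  std::string key;
  std::string value;
};

class HeaderEntryFormatter {
public:
  virtual ~HeaderEntryFormatter() = default;
  virtual void format(std::ostream& os, const HeaderEntry& header) const = 0;
};

// Hypothetical generic formatter: prints the header name, masks the value.
class RedactValueFormatter : public HeaderEntryFormatter {
public:
  void format(std::ostream& os, const HeaderEntry& header) const override {
    os << "'" << header.key << "', '[redacted]'";
  }
};

// Hypothetical table built from the redact_headers configuration list.
class FormatterTable {
public:
  // Registers one config entry: every name in header_names shares the formatter.
  void addEntry(const std::vector<std::string>& header_names,
                std::shared_ptr<const HeaderEntryFormatter> formatter) {
    for (const auto& name : header_names) {
      by_name_[name] = formatter;
    }
  }

  // Prints the header through its formatter, or unchanged if none is configured.
  void print(std::ostream& os, const HeaderEntry& header) const {
    const auto it = by_name_.find(header.key);
    if (it != by_name_.end()) {
      it->second->format(os, header);
    } else {
      os << "'" << header.key << "', '" << header.value << "'";
    }
  }

private:
  std::map<std::string, std::shared_ptr<const HeaderEntryFormatter>> by_name_;
};
```

Registering a single RedactValueFormatter for both authorization and x-vault-token corresponds to the third entry in the configuration example.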

Example debug log entries

Currently, headers are printed via the Http::HeaderMap::operator<< stream insertion operator, which is implemented by Http::HeaderMapImpl::dumpState().

The following output is written to the debug log for an HTTP request:

':authority', '127.0.0.1:8080'
':path', '/foo?secret'
':method', 'GET'
'user-agent', 'HTTPie/2.6.0'
'accept-encoding', 'gzip, deflate'
'accept', '*/*'
'connection', 'keep-alive'
'cookie', 'sessionid=secret; another=secret'
'authorization', 'Basic am9lOnBhc3N3b3Jk'

It includes the query string and request headers such as cookie and authorization, all of which can carry secrets.
Similarly, sensitive information can be revealed in HTTP response headers when cookies are set:

':status', '200'
'server', 'envoy'
'date', 'Mon, 05 Jun 2023 11:17:49 GMT'
'content-type', 'text/html'
'set-cookie', 'sessionid=secret; Path=/; Domain=example.com'
'set-cookie', 'another=secret'
'x-envoy-upstream-service-time', '0'

Some applications use custom headers to pass sensitive information. The following example is from Vault:

'x-vault-token', 'hvs.Cjlceb2odgPPYOXJ3IjiJwDZ'

Related information:

Here is a list of older issues where sensitive information was redacted from various printouts:


tsaarni · 2023-06-06T12:09:05Z

Hi @yanavlasov, as suggested in #27579 (comment) I'm submitting a design proposal issue. It includes some alternatives but I'm of course willing to adjust according to whatever is decided!

ravenblackx · 2023-06-06T15:01:39Z

Meant to cc @zuercher not assign. :)

tsaarni · 2023-06-14T06:55:53Z

Given the discussion in #27579 (comment), I understand that implementing filtering around the header logging code, which is what this design proposal is about, is discouraged. Should the situation change, I'm willing to work on this topic.

github-actions · 2023-07-14T08:01:23Z

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions · 2023-07-21T08:01:28Z

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

tsaarni added the triage Issue requires triage label Jun 6, 2023

ravenblackx added design proposal Needs design doc/proposal before implementation area/envoy_log and removed triage Issue requires triage labels Jun 6, 2023

ravenblackx assigned zuercher and unassigned zuercher Jun 6, 2023

This was referenced Jun 8, 2023

debug: redact sensitive info from debug logs #27579

Closed

Prevent sensitive request headers from being logged in debug level #9652

Open

kyessenov mentioned this issue Jul 5, 2023

Envoy can shield specified log content？ #28233

Closed

github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Jul 14, 2023

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 21, 2023
