Design proposal: redacting sensitive user data from debug logs #27820

Closed
tsaarni opened this issue Jun 6, 2023 · 5 comments
Labels
area/envoy_log, design proposal, stale

Comments

@tsaarni
Member

tsaarni commented Jun 6, 2023

This issue is a design proposal for redacting sensitive user data that might be leaked via application logs when debug level logs are enabled in production.

Updates: #9652 and #27579

Goals

During troubleshooting, it may be necessary to collect, distribute, and store debug-level application logs, even in production. Currently, HTTP headers are printed to the debug log when --log-level debug is enabled, which risks accidentally leaking sensitive user information.

This proposal discusses a mechanism that users can optionally enable to redact sensitive HTTP header information from application logs.

Design

Sensitive information shall be masked by substituting [redacted] for potentially sensitive values.

It shall be possible to redact the values of common headers known to contain sensitive information while preserving the non-sensitive parts of those values. It shall also be possible to redact custom, application-specific headers that are used to pass sensitive information.

See the "Example debug log entries" section for more details.

Alternative: Configurable list of headers to redact

In this alternative, redaction of HTTP headers is configurable, but no new extension point for adding custom "redactors" is introduced. The reasoning is as follows:

  • Only a couple of well-known headers require dedicated redaction logic. This logic can be predefined.
  • It is considered unlikely that new redactors will be added beyond the predefined ones. Any application-specific headers are redacted by removing the header value completely, which the generic redactor can do.

The following predefined redactors shall be implemented:

  1. The query string redactor leaves the request path untouched and redacts the query string: /request-path?[redacted]
  2. The cookie header redactor keeps the cookie names but redacts the values: key1=[redacted]; key2=[redacted]
  3. The set-cookie header redactor keeps the cookie name and attributes but redacts the value: key=[redacted]; Path=/; Domain=example.com
  4. The generic header value redactor redacts the entire value but leaves the header name as is.
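
As an illustration only, the four redactors listed above could be sketched as plain string functions. All function names below (redactQueryString, redactCookie, redactSetCookie, redactValue) and the helper functions are hypothetical and do not exist in Envoy. The sketch also assumes that a cookie pair without '=' is redacted entirely, which the proposal does not specify.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helpers for illustration; none of these names exist in Envoy.
constexpr std::string_view kRedacted = "[redacted]";

// Strips leading and trailing spaces and tabs.
std::string_view trim(std::string_view s) {
  const std::size_t first = s.find_first_not_of(" \t");
  if (first == std::string_view::npos) {
    return {};
  }
  const std::size_t last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Turns "name=value" into "name=[redacted]".
// Assumption: a pair without '=' is redacted completely.
std::string redactPair(std::string_view pair) {
  const std::size_t eq = pair.find('=');
  if (eq == std::string_view::npos) {
    return std::string(kRedacted);
  }
  std::string out(pair.substr(0, eq + 1));
  out += kRedacted;
  return out;
}

// 1. Keeps the request path and replaces everything after the first '?'.
std::string redactQueryString(std::string_view path) {
  const std::size_t pos = path.find('?');
  if (pos == std::string_view::npos) {
    return std::string(path); // No query string, nothing to redact.
  }
  std::string out(path.substr(0, pos + 1));
  out += kRedacted;
  return out;
}

// 2. Keeps every cookie name in a cookie header and redacts each value.
std::string redactCookie(std::string_view value) {
  std::string out;
  std::size_t start = 0;
  while (start <= value.size()) {
    std::size_t end = value.find(';', start);
    if (end == std::string_view::npos) {
      end = value.size();
    }
    const std::string_view pair = trim(value.substr(start, end - start));
    if (!pair.empty()) {
      if (!out.empty()) {
        out += "; ";
      }
      out += redactPair(pair);
    }
    start = end + 1;
  }
  return out;
}

// 3. Redacts the cookie value in a set-cookie header, keeps attributes verbatim.
std::string redactSetCookie(std::string_view value) {
  const std::size_t semi = value.find(';');
  if (semi == std::string_view::npos) {
    return redactPair(trim(value));
  }
  std::string out = redactPair(trim(value.substr(0, semi)));
  out += value.substr(semi); // Attributes, including the leading ';'.
  return out;
}

// 4. Generic redactor: the whole value is replaced.
std::string redactValue(std::string_view /*value*/) {
  return std::string(kRedacted);
}
```

With these definitions, redactCookie("sessionid=secret; another=secret") yields sessionid=[redacted]; another=[redacted], matching the format described in item 2.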

The bootstrap config selects which headers are redacted:

application_log_config:
  redact_headers:
    - cookie
    - set-cookie
    - authorization
    - x-vault-token

Assuming that the user wants to redact all sensitive information, the first three headers are very likely candidates because they are commonly present. The user also needs to identify any application-specific headers used by the upstream applications, such as x-vault-token in the example above. The generic header value redactor (4) is used for headers that have no specific redactor logic associated with them.
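
To make the selection rule concrete, here is a minimal sketch of how a configured redact_headers list might map a header name to one of the predefined redactors. The Redactor enum and the selectRedactor function are hypothetical names. The sketch assumes that configured names are lowercase, and it leaves out the query string redactor because the proposal does not specify how it is enabled.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <string_view>

// Hypothetical names for illustration; not part of Envoy.
enum class Redactor { None, Cookie, SetCookie, Generic };

// Lowercases a header name so the lookup is case-insensitive.
std::string toLower(std::string_view s) {
  std::string out(s);
  std::transform(out.begin(), out.end(), out.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return out;
}

// Picks the redactor for a header, given the configured redact_headers list.
// Assumption: the names in redact_headers are already lowercase.
Redactor selectRedactor(const std::set<std::string>& redact_headers,
                        std::string_view header_name) {
  const std::string name = toLower(header_name);
  if (redact_headers.count(name) == 0) {
    return Redactor::None; // Not configured: printed unchanged.
  }
  if (name == "cookie") {
    return Redactor::Cookie;
  }
  if (name == "set-cookie") {
    return Redactor::SetCookie;
  }
  return Redactor::Generic; // Everything else loses its whole value.
}
```

With the example configuration above, authorization and x-vault-token would both fall through to the generic redactor, while a header such as user-agent would be printed unchanged.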

Alternative: Add new extension point for printing redacted headers into debug log

In this alternative, it is assumed that new redactors will be added over time. A new extension point shall be introduced to format header values (HeaderEntry instances in HeaderMap), before printing them into the debug log.

The HeaderEntryFormatter interface shall look like:

class HeaderEntryFormatter {
public:
  virtual ~HeaderEntryFormatter() = default;

  /**
   * Prints the header entry to the output stream.
   * The formatter can redact sensitive information before output.
   * @param os the output stream to print to.
   * @param header the header entry to print.
   */
  virtual void format(std::ostream& os, const Http::HeaderEntry& header) const PURE;
};
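
For illustration, a formatter implementing this interface might look like the sketch below. To keep the example self-contained, it uses a simplified stand-in HeaderEntry struct in a hypothetical sketch namespace rather than the real Http::HeaderEntry, whose API differs, and it writes = 0 where Envoy uses the PURE macro. The class names RedactValueFormatter and PlainFormatter and the helper formatToString are assumptions, as is the choice to mirror the quoting style of the existing debug output.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

// Simplified stand-in for Http::HeaderEntry; the real Envoy class differs.
struct HeaderEntry {
  std::string key;
  std::string value;
};

// Same shape as the proposed interface, with "= 0" in place of PURE.
class HeaderEntryFormatter {
public:
  virtual ~HeaderEntryFormatter() = default;
  virtual void format(std::ostream& os, const HeaderEntry& header) const = 0;
};

// Hypothetical generic formatter: keeps the name, masks the whole value.
class RedactValueFormatter : public HeaderEntryFormatter {
public:
  void format(std::ostream& os, const HeaderEntry& header) const override {
    os << "'" << header.key << "', '[redacted]'";
  }
};

// Hypothetical pass-through formatter for headers without a configured entry.
class PlainFormatter : public HeaderEntryFormatter {
public:
  void format(std::ostream& os, const HeaderEntry& header) const override {
    os << "'" << header.key << "', '" << header.value << "'";
  }
};

// Convenience helper that captures the formatted output as a string.
std::string formatToString(const HeaderEntryFormatter& formatter, const HeaderEntry& header) {
  std::ostringstream os;
  formatter.format(os, header);
  return os.str();
}

} // namespace sketch
```

Writing to a std::ostream rather than returning a string lets a formatter emit the redacted form directly, without first building a copy of the sensitive value.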

The configuration shall look like:

application_log_config:
  redact_headers:
    - name: envoy.header_entry_formatter.redact_cookies
      header_names:
        - cookie
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.header_entry_formatter.redact_cookies.v3.RedactCookies
    - name: envoy.header_entry_formatter.redact_set_cookie
      header_names:
        - set-cookie
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.header_entry_formatter.redact_set_cookie.v3.RedactSetCookie
    - name: envoy.header_entry_formatter.redact_value
      header_names:
        - authorization
        - x-vault-token
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.header_entry_formatter.redact_value.v3.RedactValue

Each entry in the list defines header_names and the corresponding formatter to apply to those headers. The formatter receives its configuration as typed config; the config messages in this example are empty. If no entry matches a header present in an HTTP request or response, the header is printed as is.
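
The lookup described above, including the fallback for unmatched headers, can be sketched as follows. FormatterRegistry and ValueFormatter are hypothetical names, and the formatter is reduced to a function over the header value so that the example stays self-contained. How Envoy would actually instantiate extensions from typed_config is not shown.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a configured formatter: maps a value to its printed form.
using ValueFormatter = std::function<std::string(const std::string&)>;

// Hypothetical registry built from the redact_headers list.
class FormatterRegistry {
public:
  // Registers one config entry: every name in header_names uses the same formatter.
  void add(const std::vector<std::string>& header_names, ValueFormatter formatter) {
    for (const auto& name : header_names) {
      formatters_[name] = formatter;
    }
  }

  // Returns the value to print. Headers without an entry are returned unchanged.
  std::string formatValue(const std::string& name, const std::string& value) const {
    const auto it = formatters_.find(name);
    return it == formatters_.end() ? value : it->second(value);
  }

private:
  std::map<std::string, ValueFormatter> formatters_;
};
```

Registering authorization and x-vault-token with a formatter that returns [redacted] would mask both headers, while a header such as user-agent would pass through untouched.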

Example debug log entries

Currently, headers are printed via the Http::HeaderMap::operator<< stream insertion operator, which is implemented by Http::HeaderMapImpl::dumpState().

The following output is written to the debug log for an HTTP request:

':authority', '127.0.0.1:8080'
':path', '/foo?secret'
':method', 'GET'
'user-agent', 'HTTPie/2.6.0'
'accept-encoding', 'gzip, deflate'
'accept', '*/*'
'connection', 'keep-alive'
'cookie', 'sessionid=secret; another=secret'
'authorization', 'Basic am9lOnBhc3N3b3Jk'

The output includes sensitive request headers such as cookie and authorization, as well as the query string.
Similarly, sensitive information can be revealed in HTTP response headers when cookies are set:

':status', '200'
'server', 'envoy'
'date', 'Mon, 05 Jun 2023 11:17:49 GMT'
'content-type', 'text/html'
'set-cookie', 'sessionid=secret; Path=/; Domain=example.com'
'set-cookie', 'another=secret'
'x-envoy-upstream-service-time', '0'

Some applications use custom headers to pass sensitive information. The following example is from Vault:

'x-vault-token', 'hvs.Cjlceb2odgPPYOXJ3IjiJwDZ'

Related information:

Here is a list of older issues where sensitive information was redacted from various printouts:

@tsaarni tsaarni added the triage label Jun 6, 2023
@tsaarni
Member Author

tsaarni commented Jun 6, 2023

Hi @yanavlasov, as suggested in #27579 (comment) I'm submitting a design proposal issue. It includes some alternatives but I'm of course willing to adjust according to whatever is decided!

@ravenblackx ravenblackx added the design proposal and area/envoy_log labels and removed the triage label Jun 6, 2023
@ravenblackx ravenblackx assigned zuercher and unassigned zuercher Jun 6, 2023
@ravenblackx
Contributor

Meant to cc @zuercher not assign. :)

@tsaarni
Member Author

tsaarni commented Jun 14, 2023

Given the discussion in #27579 (comment), my interpretation is that implementing filtering around the header logging code, which is what this design proposal is about, is discouraged. If the situation changes, I'm willing to work on this topic.

@github-actions

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 14, 2023
@github-actions

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

@github-actions github-actions bot closed this as not planned Jul 21, 2023