Add Collector security documentation #5209

Draft
wants to merge 31 commits into base: main
31 commits
631930d
Edit security index.md
tiffany76 Sep 17, 2024
2dc2e17
Copy content into config-best-practices.md
tiffany76 Sep 17, 2024
0d88c24
Merge branch 'main' into collector-security
tiffany76 Sep 19, 2024
135fc1d
Fix spelling issues
tiffany76 Sep 19, 2024
6063b77
Make linter fixes
tiffany76 Sep 19, 2024
2de00e2
Update links on index.md
tiffany76 Sep 19, 2024
7cbbcf6
Copy content into hosting-best-practices.md
tiffany76 Sep 19, 2024
b4987c3
Add TODO
tiffany76 Sep 19, 2024
79688ec
Edits to config and hosting best practices
tiffany76 Sep 19, 2024
be0d8ee
Make Prettier fix
tiffany76 Sep 19, 2024
3e7c3c9
Apply suggestions from Juraci
tiffany76 Sep 23, 2024
932a3d5
Merge branch 'main' into collector-security
tiffany76 Sep 23, 2024
a1427f2
Update index.md with PII no-no
tiffany76 Sep 23, 2024
cc131a8
Add headings for child pages to index.md
tiffany76 Sep 23, 2024
cbf67a9
Update config receivers and exporters section
tiffany76 Sep 24, 2024
02d0944
Make link and linter fixes
tiffany76 Sep 24, 2024
3cee030
Merge branch 'main' into collector-security
tiffany76 Sep 24, 2024
500d331
Update DOS safeguard section
tiffany76 Sep 24, 2024
0a05a95
Adjust info architecture of all pages
tiffany76 Sep 24, 2024
b53353a
Merge branch 'main' into collector-security
tiffany76 Sep 26, 2024
18283d5
Edit protocol section
tiffany76 Sep 27, 2024
bb8852f
Change headings and info arch
tiffany76 Sep 27, 2024
0532e02
Edit scrubbing sensitive data section
tiffany76 Sep 27, 2024
92eccee
Merge branch 'main' into collector-security
tiffany76 Oct 6, 2024
36d55c1
Create new top level section and rework content
tiffany76 Oct 6, 2024
c95ba0c
Create 'specific risk' section
tiffany76 Oct 6, 2024
dbfd1b6
Remove forwarding section
tiffany76 Oct 6, 2024
2a582e7
Merge branch 'main' into collector-security
tiffany76 Oct 14, 2024
8605abb
Edit resource utlization section
tiffany76 Oct 14, 2024
44ae838
Minor edits
tiffany76 Oct 15, 2024
2e606e2
Make cSpell happy
tiffany76 Oct 15, 2024
38 changes: 38 additions & 0 deletions content/en/docs/security/_index.md
@@ -2,3 +2,41 @@
title: Security
weight: 970
---

Learn how the OpenTelemetry project discloses vulnerabilities and responds to
incidents. Find out how to ensure your observability data is collected and
transmitted in a secure manner.

## Common Vulnerabilities and Exposures (CVEs)

For CVEs across all repositories, see
[Common Vulnerabilities and Exposures](/docs/security/cve).

## Incident response

Learn how to report a vulnerability or find out how incident responses are
handled in
[Community incident response guidelines](/docs/security/security-response).

## Collector security

When setting up the OpenTelemetry (OTel) Collector, consider implementing
security best practices in both your hosting infrastructure and your OTel
Collector configuration. Running a secure Collector can help you:

- Protect telemetry that might inadvertently contain sensitive information,
  such as personally identifiable information (PII), application-specific data,
  or network traffic patterns.
- Prevent data tampering that makes telemetry unreliable and disrupts incident
responses.
- Comply with data privacy and security regulations.
- Defend against denial of service (DoS) attacks.

See [Hosting best practices](/docs/security/hosting-best-practices) to learn how
to secure your Collector's infrastructure.

See [Configuration best practices](/docs/security/config-best-practices) to
learn how to securely configure your Collector.

For Collector component developers, see
[Security best practices](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md).
234 changes: 234 additions & 0 deletions content/en/docs/security/config-best-practices.md
@@ -0,0 +1,234 @@
---
title: Collector configuration best practices
linkTitle: Collector configuration
weight: 112
cSpell:ignore: exporterhelper
---

When configuring the OpenTelemetry (OTel) Collector, consider these best
practices to better secure your Collector instance.

## Create secure configurations

Follow these guidelines to secure your Collector's configuration and its
pipelines.

### Store your configuration securely

The Collector's configuration might contain sensitive information including:

- Authentication information such as API tokens.
- TLS certificates including private keys.

You should store sensitive information securely, such as on an encrypted
filesystem or in a secret store. You can use environment variables to handle
sensitive and non-sensitive data, since the Collector supports
[environment variable expansion](/docs/collector/configuration/#environment-variables).
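
For example, here is a sketch of an exporter that reads an API token from an
environment variable instead of embedding it in the file; the endpoint and
variable name are placeholders:

```yaml
exporters:
  otlp:
    endpoint: https://backend.example.com:4317 # placeholder backend
    headers:
      # API_TOKEN is resolved from the environment when the Collector starts,
      # so the token never appears in the configuration file.
      Authorization: 'Bearer ${env:API_TOKEN}'
```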

### Use encryption and authentication

Your OTel Collector configuration should include encryption and authentication.

- For communication encryption, see
[Configuring certificates](/docs/collector/configuration/#setting-up-certificates).
- For authentication, use the OTel Collector's authentication mechanism, as
described in [Authentication](/docs/collector/configuration/#authentication).
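
For instance, here is a minimal sketch of an OTLP receiver serving TLS; the
certificate and key paths are placeholders for files you provision yourself:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4317
        tls:
          # Placeholder paths; use certificates issued for your deployment.
          cert_file: /etc/otelcol/certs/server.crt
          key_file: /etc/otelcol/certs/server.key
```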

### Minimize the number of components

We recommend limiting the set of components in your Collector configuration to
only those you need. Minimizing the number of components you use minimizes the
attack surface exposed.

- Use the
  [OpenTelemetry Collector Builder (`ocb`)](/docs/collector/custom-collector) to
  create a Collector distribution that uses only the components you need, as
  shown in the sketch after this list.
- If you find that you have unused receivers and exporters, remove them from
your configuration.
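
As a sketch, an `ocb` build manifest that includes only an OTLP receiver, a
batch processor, and an OTLP exporter might look like this; the module versions
are illustrative:

```yaml
dist:
  name: otelcol-custom
  output_path: ./otelcol-custom

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.110.0
processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.110.0
exporters:
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.110.0
```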

### Configure with care

Some components can increase the security risk of your Collector pipelines.

- Receivers and exporters can be push- or pull-based. In either case, establish
  the connection over a secure channel and, where possible, require
  authentication as well.
- Receivers and exporters might expose buffer, queue, payload, and worker
settings using configuration parameters. If these settings are available, you
should proceed with caution before modifying the default configuration values.
Improperly setting these values might expose the OpenTelemetry Collector to
additional attack vectors.

## Manage specific security risks

Configure your Collector to block these security threats.

### Protect against denial of service attacks

For server-like receivers and extensions, you can protect your Collector from
exposure to the public internet or to wider networks than necessary by binding
these components' endpoints to addresses that limit connections to authorized
users. Try to always use specific interfaces, such as a pod's IP, or `localhost`
instead of `0.0.0.0`. For more information, see
[CWE-1327: Binding to an Unrestricted IP Address](https://cwe.mitre.org/data/definitions/1327.html).

From Collector v0.110.0, the default endpoints for all servers in Collector
components are set to `localhost:4317` for `gRPC` ports or `localhost:4318` for
`http` ports. For earlier versions of the Collector, change the default endpoint
from `0.0.0.0` to `localhost` in all components by enabling the
`component.UseLocalHostAsDefaultHost`
[feature gate](https://github.com/open-telemetry/opentelemetry-collector/tree/main/featuregate).

If `localhost` resolves to a different IP due to your DNS settings, then
explicitly use the loopback IP instead: `127.0.0.1` for IPv4 or `::1` for IPv6.
For example, here's an IPv4 configuration using a `gRPC` port:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4317
```

In IPv6 setups, make sure your system supports both IPv4 and IPv6 loopback
addresses so the network functions properly in dual-stack environments and
applications, where both protocol versions are used.

If you are working in environments that have nonstandard networking setups, such
as Docker or Kubernetes, see the
[example configurations](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks)
in our component developer documentation for ideas on how to bind your component
endpoints.

### Scrub sensitive data

[Processors](/docs/collector/configuration/#processors) are the Collector
components that sit between receivers and exporters. They are responsible for
processing telemetry before it's analyzed. You can use the OpenTelemetry
Collector's `redaction` processor to obfuscate or scrub sensitive data before
exporting it to a backend.

The
[`redaction` processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/redactionprocessor)
deletes span, log, and metric datapoint attributes that don't match a list of
allowed attributes. It also masks attribute values that match a blocked value
list. Attributes that aren't on the allowed list are removed before any value
checks are done.

For example, here is a configuration that masks values containing credit card
numbers:

```yaml
processors:
  redaction:
    allow_all_keys: false
    allowed_keys:
      - description
      - group
      - id
      - name
    ignored_keys:
      - safe_attribute
    blocked_values: # Regular expressions for blocking values of allowed span attributes
      - '4[0-9]{12}(?:[0-9]{3})?' # Visa credit card number
      - '(5[1-5][0-9]{14})' # MasterCard number
    summary: debug
```

See the
[documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/redactionprocessor)
to learn how to add the `redaction` processor to your Collector configuration.

### Safeguard resource utilization

After implementing safeguards for resource utilization in your
[hosting infrastructure](/docs/security/hosting-best-practices/), consider also
adding these safeguards to your OpenTelemetry Collector configuration.

Batching your telemetry and limiting the memory available to your Collector can
prevent out-of-memory errors and usage spikes. You can also handle traffic
spikes by adjusting queue sizes to manage memory usage while avoiding data loss.
For example, use the
[`exporterhelper`](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md)
to manage queue size for your `otlp` exporter:

```yaml
exporters:
  otlp:
    endpoint: <ENDPOINT>
    sending_queue:
      queue_size: 800
```
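
Batching and memory limiting are handled by the `batch` and `memory_limiter`
processors. Here is a minimal sketch, with limits that are illustrative and
should be tuned to your environment:

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80 # hard limit, as a share of available memory
    spike_limit_percentage: 25 # data is refused above (limit - spike limit)
  batch:
```

In the pipeline definition, list `memory_limiter` before other processors so
that data can be refused before it is buffered elsewhere.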

Filtering unwanted telemetry is another way you can protect your Collector's
resources. Not only does filtering protect your Collector instance, but it also
reduces the load on your backend. You can use the
[`filter` processor](/docs/collector/transforming-telemetry/#basic-filtering) to
drop logs, metrics, and spans you don't need. For example, here's a
configuration that drops non-HTTP spans:

```yaml
processors:
  filter:
    error_mode: ignore
    traces:
      span:
        - attributes["http.request.method"] == nil
```

You can also configure your components with appropriate timeout and retry
limits. These limits should allow your Collector to handle failures without
accumulating too much data in memory. See the
[`exporterhelper` documentation](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md)
for more information.
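
Here is a sketch of timeout and retry settings on the `otlp` exporter; the
values shown are illustrative starting points, not recommendations:

```yaml
exporters:
  otlp:
    endpoint: <ENDPOINT>
    timeout: 10s
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s # give up and drop the data after 5 minutes
```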

Finally, consider using compression with your exporters to reduce the send size
of your data and conserve network and CPU resources. By default, the
[`otlp` exporter](https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/otlpexporter)
uses `gzip` compression.
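
If you need to change this behavior, you can set the compression type
explicitly, as in this sketch:

```yaml
exporters:
  otlp:
    endpoint: <ENDPOINT>
    compression: gzip # or `none` to trade network usage for lower CPU usage
```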

## Extensions

While receivers, processors, and exporters handle telemetry directly, extensions
serve different needs.

<!--- TODO: Extensions SHOULD NOT expose sensitive health or telemetry data. How? What can you do? -->

### Health and telemetry

Extensions are available for health check information, Collector metrics and
traces, and generating and collecting profiling data. When enabled with their
default settings, all of these extensions, except the health check extension,
are accessible only locally to the OpenTelemetry Collector. Take care to protect
sensitive information when configuring these extensions for remote access, as
they might expose it accidentally.
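
For example, here is a sketch of the `health_check` extension from
opentelemetry-collector-contrib kept on a loopback endpoint, so that liveness
information isn't exposed beyond the host; the port shown is the conventional
default:

```yaml
extensions:
  health_check:
    endpoint: 127.0.0.1:13133

service:
  extensions: [health_check]
```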

### Collector's internal telemetry

<!--- INSERT RECOMMENDATIONS HERE. For example:

1. Remove zPages.
1. Remove configuration endpoints.
-->

### Observers

An observer is an extension that discovers services and their endpoints. Other
components of the OpenTelemetry Collector, such as receivers, can subscribe to
observers to be notified when endpoints come or go.

Observers might require certain permissions in order to discover services. For
example, the `k8s_observer` requires certain RBAC permissions in Kubernetes,
while the `host_observer` requires the OpenTelemetry Collector to run in
privileged mode.
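
As an illustration, here is a sketch of the contrib `k8s_observer` paired with
the `receiver_creator` receiver. The Kubernetes RBAC rules that allow the
Collector's service account to list pods and nodes must be granted separately:

```yaml
extensions:
  k8s_observer:
    auth_type: serviceAccount
    observe_pods: true
    observe_nodes: true

receivers:
  receiver_creator:
    watch_observers: [k8s_observer]
    # Receiver templates and discovery rules go under `receivers:` here.
```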

<!--- But what about Juraci's comment here: https://github.com/open-telemetry/opentelemetry.io/pull/3652/files?diff=unified&w=0#r1417409370 --->
Member: It's OK -- what we shouldn't be telling people is to run things as root
or blindly disable security protections (like selinux).


### Subprocesses

Extensions can also be used to run subprocesses when the Collector can't
natively run the collection mechanisms (for example, FluentBit). Subprocesses
expose a completely separate attack vector that depends on the subprocess
itself. In general, take care before running any subprocesses alongside the
Collector.

Member: I'm also not sure where this is coming from: the only component that I
know of spawning new processes is the jmx receiver.

Contributor Author: This section comes directly from the security README.
Should I remove it?
2 changes: 1 addition & 1 deletion content/en/docs/security/cve.md
@@ -1,6 +1,6 @@
---
title: Common Vulnerabilities and Exposures
weight: 102
weight: 100
---

This is a list of reported Common Vulnerabilities and Exposures (CVEs) across
60 changes: 60 additions & 0 deletions content/en/docs/security/hosting-best-practices.md
@@ -0,0 +1,60 @@
---
title: Collector hosting best practices
linkTitle: Collector hosting
weight: 115
---

When setting up hosting for the OpenTelemetry (OTel) Collector, consider these
best practices to better secure your hosting infrastructure.

## Storing configuration information securely

<!--- TODO: SHOULD ensure sensitive configuration information is stored securely. How? -->
Member: Perhaps add a reference to the secrets management practices for
Kubernetes?


## Permissions

<!--- TODO: SHOULD not run the OpenTelemetry Collector as root/admin user. Why? (Give the reader motivation.) How do you do that?
- NOTE: MAY require privileged access for some components

The Collector SHOULD NOT require privileged access, except where the data it's obtaining is in a privileged location. For instance, in order to get pod logs by mounting a node volume, the Collector daemonset needs enough privileges to get that data.

The rule of least privilege applies here. --->

## Receivers and exporters

To limit the exposure of server-like components to authorized users only:

- Enable authentication, for instance by using the bearer token or basic
  authentication extensions, as shown in the sketch after this list.
- Restrict the IP addresses that the OTel Collector listens on.
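
As a sketch, the contrib `bearertokenauth` extension can require a token on an
OTLP receiver; the environment variable name here is a placeholder:

```yaml
extensions:
  bearertokenauth:
    token: ${env:OTEL_BEARER_TOKEN}

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4317
        auth:
          authenticator: bearertokenauth

service:
  extensions: [bearertokenauth]
```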

## Processors

[Processors](/docs/collector/configuration/#processors) sit between receivers
and exporters. They are responsible for processing telemetry before it's
analyzed. From a security perspective, processors are useful in a few ways.

### Safeguarding resource utilization

Processors offer safeguards around resource utilization.

The `batch` and `memory_limiter` processors help ensure that the OpenTelemetry
Collector is resource efficient and does not run out of memory when overloaded.
These two processors should be enabled on every defined pipeline.
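
Here is a minimal sketch of a pipeline with both processors enabled;
`memory_limiter` comes first so it can refuse data before other components
buffer it:

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```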

For more information on recommended processors and how to order them in your
configuration, see the
[Collector processor](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor)
documentation.

After implementing resource utilization safeguards in your hosting
infrastructure, make sure your Collector also uses
[safeguards in its configuration](/docs/security/config-best-practices/).

### Another example

<!--- TODO: INSERT ADDITIONAL EXAMPLES HERE. -->

## Extensions

<!--- TODO: Extensions SHOULD NOT expose sensitive health or telemetry data. How? What can you do? -->
2 changes: 1 addition & 1 deletion content/en/docs/security/security-response.md
@@ -1,5 +1,5 @@
---
title: Community Incident Response Guidelines
title: Community incident response guidelines
weight: 102
---
