Move Collector architecture documentation and diagrams from Github to opentelemetry.io #4029

Merged 24 commits from collector_architecture into main on Mar 7, 2024.

Commits:

- a9b4f7c: Create new file with design.md content (tiffany76, Feb 16, 2024)
- 2236830: Copyedit text copied from design.md (tiffany76, Feb 19, 2024)
- 4b8eff0: Add TODOs (tiffany76, Feb 20, 2024)
- 87e96b2: Add first of six Mermaid diagrams (tiffany76, Feb 20, 2024)
- e09a8bf: Apply suggestions from initial review (tiffany76, Feb 21, 2024)
- 0a3184f: Update and add Mermaid diagrams (tiffany76, Feb 21, 2024)
- 6072452: Merge branch 'main' into collector_architecture (theletterf, Feb 23, 2024)
- a84d8b6: Apply suggestions from follow-up review (tiffany76, Feb 26, 2024)
- 1b7117e: Make small copyedits and formatting changes (tiffany76, Feb 26, 2024)
- 3dcfa04: Add Receivers mermaid diagram (tiffany76, Feb 27, 2024)
- 0a016d4: Add Agent mermaid diagram (tiffany76, Feb 27, 2024)
- 677af81: Neutralize formatting of Agent diagram (tiffany76, Feb 27, 2024)
- 282fbf9: Clean up and add style to Agent mermaid diagram (tiffany76, Feb 28, 2024)
- 1cc36cd: Add Service mermaid diagram (tiffany76, Feb 28, 2024)
- 8e12f26: Change mentions of jaeger exporter to otlp (tiffany76, Feb 28, 2024)
- d9858b7: Make Prettier fixes (tiffany76, Feb 28, 2024)
- 64865d0: Merge branch 'main' into collector_architecture (tiffany76, Feb 28, 2024)
- f3abc95: Update content/en/docs/collector/architecture.md (tiffany76, Mar 4, 2024)
- 392906b: Merge branch 'main' into collector_architecture (tiffany76, Mar 4, 2024)
- 2740396: Remove mention of google doc (tiffany76, Mar 4, 2024)
- 3a46360: Merge branch 'main' into collector_architecture (tiffany76, Mar 6, 2024)
- ff7579f: Replace mentions of tags processor (tiffany76, Mar 7, 2024)
- ef248c2: Fix Exporter diagram (tiffany76, Mar 7, 2024)
- d0aeb5a: Merge branch 'main' into collector_architecture (cartermp, Mar 7, 2024)

277 changes: 277 additions & 0 deletions content/en/docs/collector/architecture.md

---
title: Architecture
weight: 28
# prettier-ignore
cSpell:ignore: fanoutconsumer probabilisticsampler spanmetrics zpages
---

The OpenTelemetry Collector is an executable that can receive telemetry data,
optionally process it, and export it onward.

The Collector supports several popular open source protocols for receiving and
sending telemetry data, and it offers a pluggable architecture for adding more
protocols.

Data receiving, processing, and exporting are done using
[Pipelines](#pipelines). The Collector can be configured to have one or more
pipelines. Each pipeline includes the following:

- A set of [Receivers](#receivers) that receive the data.
- A series of optional [Processors](#processors) that get the data from
receivers and process it.
- A set of [Exporters](#exporters) that get the data from processors and send
it outside the Collector.

The same receiver can be included in multiple pipelines, and multiple pipelines
can include the same exporter.

## Pipelines

A pipeline defines a path that data follows in the Collector: from reception, to
processing (or modification), and finally to export.

Pipelines can operate on three telemetry data types: traces, metrics, and logs.
The data type is a property of the pipeline defined by its configuration.
Receivers, processors, and exporters used in a pipeline must support the
particular data type; otherwise, `ErrDataTypeIsNotSupported` is reported when
the configuration is loaded. A pipeline can be depicted in the following way:

```mermaid
---
title: Pipeline
---
flowchart LR
A(Receiver 1) --> D[Processor 1]
B(Receiver 2) --> D
C(Receiver N) --> D
D --> E[Processor 2]
E --> F[Processor N]
F --> G((fan out))
G --> H[[Exporter 1]]
G --> I[[Exporter 2]]
G --> J[[Exporter N]]

classDef default fill:#e3e8fc,stroke:#4f62ad
```

Pipelines can have one or more receivers. Data from all receivers is pushed to
the first processor, which processes the data and then pushes it to the next
processor (or a processor may drop the data if it is a “sampling” processor, for
example), and so on until the last processor in the pipeline pushes the data to
the exporters. Each exporter gets a copy of each data element. The last
processor uses a `fanoutconsumer` to fan out the data to multiple exporters.
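
For example, here is a minimal sketch of such a fan-out (the exporter endpoints
are illustrative): a single `traces` pipeline whose last processor hands the
data to three exporters, each of which receives its own copy of every span.

```yaml
exporters:
  otlp/1:
    endpoint: backend-one.example.com:4317 # illustrative endpoint
  otlp/2:
    endpoint: backend-two.example.com:4317 # illustrative endpoint
  logging: # writes telemetry to the Collector's own log output

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      # the fan out after the last processor copies the data to each exporter
      exporters: [otlp/1, otlp/2, logging]
```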

The pipeline is constructed during Collector startup based on the pipeline
definition in the configuration.

A pipeline configuration typically looks like this:

```yaml
service:
  pipelines: # section that can contain multiple subsections, one per pipeline
    traces: # type of the pipeline
      receivers: [otlp, jaeger, zipkin]
      processors: [memory_limiter, batch]
      exporters: [otlp, zipkin]
```

The above example defines a pipeline for the “traces” type of telemetry data,
with three receivers, two processors, and two exporters.

For details about the configuration file format, see
[Collector configuration](/docs/collector/configuration/).

### Receivers

Receivers typically listen on a network port and receive telemetry data. Usually
one receiver is configured to send received data to one pipeline. However, it is
also possible to configure the same receiver to send the same received data to
multiple pipelines. This can be done by listing the same receiver in the
“receivers” key of several pipelines:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317

service:
  pipelines:
    traces: # a pipeline of “traces” type
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
    traces/2: # another pipeline of “traces” type
      receivers: [otlp]
      processors: [batch]
      exporters: [opencensus]
```

In the above example, the `otlp` receiver will send the same data to pipeline
`traces` and to pipeline `traces/2`. (Note: The configuration uses composite key
names in the form of `type[/name]`.)

When the Collector loads this config, the result will look like this diagram
(some processors and exporters are omitted for brevity):

<!--TODO: Add Receivers image via Mermaid.-->

> Important: When the same receiver is referenced in more than one pipeline, the
> Collector creates only one receiver instance at runtime that sends the data to
> a fan out consumer. The fan out consumer in turn sends the data to the first
> processor of each pipeline. The data propagation from receiver to the fan out
> consumer and then to processors is completed using a synchronous function
> call. This means that if one processor blocks the call, the other pipelines
> attached to this receiver are blocked from receiving the same data, and the
> receiver itself stops processing and forwarding newly received data.

### Exporters

Exporters typically forward the data they get to a destination on a network, but
they can also send the data elsewhere. For example, the `logging` exporter writes
the telemetry data to the console.

The configuration allows for multiple exporters of the same type, even in the
same pipeline. For example, you can have two `otlp` exporters defined, each one
sending to a different OTLP endpoint:

```yaml
exporters:
  otlp/1:
    endpoint: example.com:4317
  otlp/2:
    endpoint: localhost:14317
```

An exporter usually gets the data from one pipeline. However, you can configure
multiple pipelines to send data to the same exporter:

```yaml
exporters:
  otlp:
    endpoint: localhost:14250

service:
  pipelines:
    traces: # a pipeline of “traces” type
      receivers: [zipkin]
      processors: [memory_limiter]
      exporters: [otlp]
    traces/2: # another pipeline of “traces” type
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

In the above example, the `otlp` exporter gets data from pipeline `traces` and
from pipeline `traces/2`. When the Collector loads this config, the result looks
like this diagram (some processors and receivers are omitted for brevity):

<!--TODO: Add Exporters image via Mermaid.-->

### Processors

A pipeline can contain sequentially connected processors. The first processor
gets the data from one or more receivers that are configured for the pipeline,
and the last processor sends the data to one or more exporters that are
configured for the pipeline. All processors between the first and last receive
the data from only one preceding processor and send data to only one succeeding
processor.

Processors can transform the data before forwarding it, such as adding or
removing attributes from spans. They can also drop the data by deciding not to
forward it (for example, the `probabilisticsampler` processor). Or they can
generate new data, as the `spanmetrics` processor does by producing metrics for
spans processed by the pipeline.
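
As a minimal sketch of these roles (the attribute key, its value, and the
sampling percentage are illustrative), a configuration might combine an
`attributes` processor that inserts a span attribute with a
`probabilistic_sampler` processor that drops a share of the spans:

```yaml
processors:
  attributes:
    actions:
      - key: deployment.environment # illustrative attribute
        value: production
        action: insert
  probabilistic_sampler:
    sampling_percentage: 15 # keep roughly 15% of traces, drop the rest

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes, probabilistic_sampler]
      exporters: [otlp]
```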

The same processor name can be referenced in the `processors` key of multiple
pipelines. In this case, the same configuration is used for each of these
processors, but each pipeline always gets its own instance of the processor.
Each of these processors has its own state, and the processors are never shared
between pipelines. For example, if the `batch` processor is used in several
pipelines, each pipeline has its own batch processor, but each batch processor
is configured exactly the same way if they reference the same key in the
configuration. See the following configuration:

```yaml
processors:
  batch:
    send_batch_size: 10000
    timeout: 10s

service:
  pipelines:
    traces: # a pipeline of “traces” type
      receivers: [zipkin]
      processors: [batch]
      exporters: [otlp]
    traces/2: # another pipeline of “traces” type
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

When the Collector loads this config, the result looks like this diagram:

<!--TODO: Add Processors image via Mermaid.-->

Note that each `batch` processor is an independent instance, although they are
configured the same way with a `send_batch_size` of 10000.

> Note: The same name of the processor must not be referenced multiple times in
> the `processors` key of a single pipeline.

## <a name="opentelemetry-agent"></a>Running as an Agent

On a typical VM/container, user applications run in some processes/pods with the
OpenTelemetry Library (Library). Previously, the Library did all the recording,
collecting, sampling, and aggregation of traces, metrics, and logs, and then
either exported the data to persistent storage backends via the Library
exporters, or displayed it on local zpages. This pattern has several drawbacks,
for example:

1. For each OpenTelemetry Library, exporters and zpages must be re-implemented
in native languages.
2. In some programming languages (for example, Ruby or PHP), it is difficult to
do the stats aggregation in process.
3. To enable exporting of OpenTelemetry spans, stats, or metrics, application
users need to manually add library exporters and redeploy their binaries.
This is especially difficult when an incident has occurred, and users want to
use OpenTelemetry to investigate the issue right away.
4. Application users need to take responsibility for configuring and
initializing exporters. These tasks are error-prone (for example, setting up
incorrect credentials or monitored resources), and users may be reluctant to
“pollute” their code with OpenTelemetry.

To resolve the issues above, you can run the OpenTelemetry Collector as an Agent.
The Agent runs as a daemon in the VM/container and can be deployed independently
of the Library. Once the Agent is deployed and running, it should be able to
retrieve traces, metrics, and logs from the Library and export them to other
backends. We may also give the Agent the ability to push configurations (such as
sampling probability) to the Library. For languages that cannot do stats
aggregation in process, the Library can send raw measurements and have the Agent
do the aggregation.
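
As a rough sketch of this pattern (the gateway address is illustrative), an
Agent configuration typically receives OTLP data from the local Library over the
loopback interface and forwards it to a Collector running elsewhere:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317 # applications on the same host send here

processors:
  batch:

exporters:
  otlp:
    endpoint: collector-gateway.example.com:4317 # illustrative gateway address

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```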

<!--TODO: Add Agent image via Mermaid.-->

> For developers and maintainers of other libraries: By adding specific
> receivers, you can configure the Agent to accept traces, metrics, and logs from
> other tracing/monitoring libraries, such as Zipkin, Prometheus, etc. See
> [Receivers](#receivers) for details.
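
For instance, a sketch of a receivers section that also accepts Zipkin spans and
scrapes Prometheus metrics could look like this (the job name and scrape target
are illustrative):

```yaml
receivers:
  zipkin:
    endpoint: 0.0.0.0:9411 # accepts spans posted by Zipkin clients
  prometheus:
    config:
      scrape_configs:
        - job_name: app-metrics # illustrative scrape job
          static_configs:
            - targets: [localhost:8888]
```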

## <a name="opentelemetry-collector"></a>Running as a Gateway

The OpenTelemetry Collector can run as a Gateway instance and receive spans and
metrics exported by one or more Agents or Libraries, or by tasks/agents that emit
in one of the supported protocols. The Collector sends data to the configured
exporter(s). The following figure summarizes the deployment architecture:

<!--TODO: Add Service image via Mermaid.-->

The OpenTelemetry Collector can also be deployed in other configurations, such
as receiving data from other agents or clients in one of the formats supported
by its receivers.
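
A minimal Gateway configuration (the backend endpoint is illustrative) mostly
mirrors the Agent's, except that it listens on all interfaces for data sent by
Agents and Libraries and exports it to the final backend:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317 # Agents and Libraries send OTLP here

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000
  batch:

exporters:
  otlp:
    endpoint: backend.example.com:4317 # illustrative observability backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```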