Add OTel resource attribute promotion proposal
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Showing 1 changed file with 89 additions and 0 deletions.
# OTel resource attribute promotion

* **Owners:**
  * Arve Knudsen [@aknuds1](https://github.com/aknuds1) [arve.knudsen@grafana.com](mailto:arve.knudsen@grafana.com)

* **Implementation Status:** Partially implemented

* **Related Issues and PRs:**
  * [WIP: OTLP Translator prometheusremotewrite: Support resource attribute promotion](https://github.com/prometheus/prometheus/pull/14200)

* **Other docs or links:**

> This proposal collects the requirements and implementation proposals for supporting OTel resource attribute promotion to labels.

## Why

Currently, Prometheus encodes OpenTelemetry (OTel for short) [resource attributes](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md) as labels of the `target_info` metric.
OTel resource attributes model metadata about the environment producing the metrics received by the backend (e.g. Prometheus).
Typically, OTel users want to include some of these attributes (as `target_info` labels) in their Prometheus query results, to correlate them with their own entities (e.g. K8s pods).

Based on user demand, Prometheus should offer a better UX for including OTel resource attributes in query results.
The current solution is to join with `target_info` in queries, picking the labels of interest (those corresponding to OTel resource attributes).
This requires relatively advanced knowledge of PromQL, though, and is a barrier to many users.
Take as an example querying HTTP request rates per K8s cluster and status code, which requires joining with the `target_info` metric to obtain the `k8s.cluster.name` resource attribute (encoded as `k8s_cluster_name`):

```promql
# Join with target_info on job and instance labels, to include k8s_cluster_name.
sum by (k8s_cluster_name, http_status_code) (
    rate(http_server_request_duration_seconds_count[2m])
  * on (job, instance) group_left (k8s_cluster_name)
    target_info
)
```
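
To make the mechanics of this join more concrete, here is a minimal Python sketch that mimics what the query does on in-memory data. It is an illustration only, not how Prometheus evaluates PromQL: the function name `join_with_target_info`, the sample label values (`shop/api`, `api-0`, `prod-eu`) and the rate values are all made up for this example.

```python
from collections import defaultdict


def join_with_target_info(samples, target_info, extra_label):
    """Simplified stand-in for `* on (job, instance) group_left (extra_label) target_info`."""
    # Index the target_info series by the labels the join matches on.
    info_by_target = {(s["job"], s["instance"]): s for s in target_info}
    joined = []
    for labels, value in samples:
        info = info_by_target.get((labels["job"], labels["instance"]))
        if info is None:
            continue  # No matching target_info series: the sample drops out.
        enriched = dict(labels)
        if extra_label in info:
            enriched[extra_label] = info[extra_label]  # Label copied by group_left.
        joined.append((enriched, value * 1.0))  # target_info has the value 1.
    return joined


# Hypothetical per-series request rates (the left-hand side of the join).
rates = [
    ({"job": "shop/api", "instance": "api-0", "http_status_code": "200"}, 12.0),
    ({"job": "shop/api", "instance": "api-1", "http_status_code": "200"}, 8.0),
    ({"job": "shop/api", "instance": "api-1", "http_status_code": "500"}, 0.5),
]
# Hypothetical target_info series carrying the resource attribute as a label.
target_info = [
    {"job": "shop/api", "instance": "api-0", "k8s_cluster_name": "prod-eu"},
    {"job": "shop/api", "instance": "api-1", "k8s_cluster_name": "prod-eu"},
]

# Equivalent of `sum by (k8s_cluster_name, http_status_code) (...)`.
totals = defaultdict(float)
for labels, value in join_with_target_info(rates, target_info, "k8s_cluster_name"):
    totals[(labels["k8s_cluster_name"], labels["http_status_code"])] += value
```

The sketch shows why the user has to know both the join keys (`job`, `instance`) and the name of the label to copy: without either, the cluster name never reaches the aggregation step.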

### Pitfalls of the current solution

As already mentioned, including OTel resource attributes in query results through join queries represents a technical barrier to users.
It also requires the user to know which `target_info` labels to join on (i.e., `job` and `instance`), and which labels represent the various OTel resource attributes.
All in all, the UX for including OTel resource attributes in Prometheus query results is cumbersome.

## Goals

Goals and use cases for the solution as proposed in [How](#how):

* Support, in the OTLP endpoint, automatic promotion of a configurable set of OTel resource attributes to metric labels.

### Audience

Prometheus maintainers.

## How

* Make the OTLP endpoint support a configurable set of OTel resource attributes to promote to metric labels.
* Add a Prometheus configuration parameter for which OTel resource attributes to promote (default: none).

With OTel resource attribute promotion configured to `[k8s.cluster.name]`, we can simplify the previously given PromQL join example as follows:

```promql
sum by (k8s_cluster_name, http_status_code) (
    rate(http_server_request_duration_seconds_count[2m])
)
```
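
The promotion step described in this section can be sketched roughly as follows. This is a hedged illustration in Python, not the actual OTLP translator code: the function names, the simplified name sanitization (every character outside `[a-zA-Z0-9_]` becomes `_`), and the rule that an existing metric label wins on a name clash are assumptions made for the example.

```python
import re


def sanitize_label_name(attribute_name):
    """Simplified sanitization: replace characters outside [a-zA-Z0-9_] with '_'."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", attribute_name)


def promote_resource_attributes(metric_labels, resource_attributes, promote):
    """Return a copy of metric_labels extended with the promoted resource attributes.

    `promote` is the configured list of attribute names; an empty list
    (the proposed default) leaves the labels unchanged.
    """
    labels = dict(metric_labels)
    for name in promote:
        if name not in resource_attributes:
            continue  # The resource does not carry this attribute.
        # Assumption of this sketch: an existing metric label is not overwritten.
        labels.setdefault(sanitize_label_name(name), str(resource_attributes[name]))
    return labels


# Hypothetical resource attributes attached to an incoming OTLP metric.
resource = {
    "k8s.cluster.name": "prod-eu",
    "k8s.pod.name": "api-7d9f",
    "service.name": "api",
}
series = {
    "__name__": "http_server_request_duration_seconds_count",
    "http_status_code": "200",
}

promoted = promote_resource_attributes(series, resource, promote=["k8s.cluster.name"])
```

With `promote=["k8s.cluster.name"]`, the resulting series carries a `k8s_cluster_name` label directly, which is what lets the simplified query aggregate by it without a join.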

## Alternatives

### Simplify joins with info metrics in PromQL

Instead of promoting selected OTel resource attributes to labels at ingest time, another [proposal](https://github.com/prometheus/proposals/pull/37) aims to simplify joining with `target_info` in queries.
The two proposals are not necessarily competing, though, as the proposed features can co-exist.

#### Pros

* Avoids having to add more labels to metrics than strictly required to identify them.
* Avoids series churn when one or more of the promoted OTel resource attributes change.
* Avoids the increased CPU/memory usage that comes with more labels per metric.
* Avoids the user having to decide up front which OTel resource attributes to promote at ingestion time.
* Avoids series churn when the user changes which OTel resource attributes to promote.
* Simply improves the UX for the existing solution of encoding OTel resource attributes as `target_info` labels.
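
The series churn mentioned in the list above follows from the fact that a series is identified by its full label set. A small Python sketch, with made-up label values and a hypothetical promoted `service.version` attribute, illustrates the effect:

```python
def series_identity(labels):
    """Model a series identity as its complete set of label name/value pairs."""
    return frozenset(labels.items())


# Hypothetical metric labels that stay the same across a redeploy.
base = {
    "__name__": "http_server_request_duration_seconds_count",
    "job": "shop/api",
    "instance": "api-0",
}

# With service.version promoted, a new version yields a different label set.
promoted_before = series_identity({**base, "service_version": "1.4.0"})
promoted_after = series_identity({**base, "service_version": "1.4.1"})

# Without promotion, the attribute only lives on target_info and the
# metric's own label set is unchanged by the redeploy.
plain_before = series_identity(base)
plain_after = series_identity(base)

churned_with_promotion = promoted_before != promoted_after
churned_without_promotion = plain_before != plain_after
```

In this model, the promoted variant produces a new series whenever the attribute value changes, while the unpromoted metric keeps its identity and only the `target_info` series changes.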

#### Cons

* Much more complicated to implement.
* Requires the user to call `info` in their queries.

## Action Plan

The tasks required to implement this proposal:

* [ ] https://github.com/prometheus/prometheus/pull/14200