Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt semantic conventions for the span name of messaging systems #690

Merged
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,11 @@ the release.

## Unreleased

New:

Updates:

- Adapt semantic conventions for the span name of messaging systems ([#690](https://github.com/open-telemetry/opentelemetry-specification/pull/690))
- Add semantic convention for NGINX custom HTTP 499 status code.

## v0.6.0 (07-01-2020)
Expand All @@ -21,6 +26,7 @@ New:
- Add peer.service to provide a user-configured name for a remote service ([#652](https://github.com/open-telemetry/opentelemetry-specification/pull/652))

Updates:

- Improve root Span description ([#645](https://github.com/open-telemetry/opentelemetry-specification/pull/645))
- Extend semantic conventions for RPC and allow non-gRPC calls ([#604](https://github.com/open-telemetry/opentelemetry-specification/pull/604))
- Revise and extend semantic conventions for databases ([#575](https://github.com/open-telemetry/opentelemetry-specification/pull/575))
Expand Down
85 changes: 72 additions & 13 deletions specification/trace/semantic_conventions/messaging.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,21 @@
<!-- toc -->

- [Definitions](#definitions)
* [Destinations](#destinations)
* [Message consumption](#message-consumption)
* [Conversations](#conversations)
* [Temporary destinations](#temporary-destinations)
- [Conventions](#conventions)
* [Span name](#span-name)
* [Span kind](#span-kind)
* [Operation names](#operation-names)
- [Messaging attributes](#messaging-attributes)
* [Attributes specific to certain messaging systems](#attributes-specific-to-certain-messaging-systems)
+ [RabbitMQ](#rabbitmq)
- [Examples](#examples)
* [Topic with multiple consumers](#topic-with-multiple-consumers)
* [Batch receiving](#batch-receiving)
* [Batch processing](#batch-processing)

<!-- tocstop -->

Expand All @@ -20,48 +32,93 @@ A *message* usually consists of headers (or properties, or meta information) and
* Physically: some message *broker* (which can be e.g., a single server, or a cluster, or a local process reached via IPC). The broker handles the actual routing, delivery, re-delivery, persistence, etc. In some messaging systems the broker may be identical or co-located with (some) message consumers.
* Logically: some particular message *destination*.

### Destinations

A destination is usually identified by some name unique within the messaging system instance, which might look like an URL or a simple one-word identifier.
Two kinds of destinations are distinguished: *topic*s and *queue*s.
A message that is sent (the send-operation is often called "*publish*" in this context) to a *topic* is broadcasted to all *subscribers* of the topic.
A message submitted to a queue is processed by a message *consumer* (usually exactly once although some message systems support a more performant at-least-once mode for messages with [idempotent][] processing).

[idempotent]: https://en.wikipedia.org/wiki/Idempotence

### Message consumption

The consumption of a message can happen in multiple steps.
First, the lower-level receiving of a message at a consumer, and then the logical processing of the message.
Often, the waiting for a message is not particularly interesting and hidden away in a framework that only invokes some handler function to process a message once one is received
(in the same way that the listening on a TCP port for an incoming HTTP message is not particularly interesting).
However, in a synchronous conversation, the wait time for a message is important.

In some messaging systems, a message can receive a reply message that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*. The grouping usually happens through some sort of "In-Reply-To:" meta information or an explicit conversation ID. Sometimes a conversation can span multiple message destinations (e.g. initiated via a topic, continued on a temporary one-to-one queue).
### Conversations

Some messaging systems support the concept of *temporary destination* (often only temporary queues) that are established just for a particular set of communication partners (often one to one) or conversation. Often such destinations are unnamed or have an auto-generated name.
In some messaging systems, a message can receive a reply message that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*.
The grouping usually happens through some sort of "In-Reply-To:" meta information or an explicit *conversation ID* (sometimes called *correlation ID*).
Sometimes a conversation can span multiple message destinations (e.g. initiated via a topic, continued on a temporary one-to-one queue).

[idempotent]: https://en.wikipedia.org/wiki/Idempotence
### Temporary destinations

Some messaging systems support the concept of *temporary destination* (often only temporary queues) that are established just for a particular set of communication partners (often one to one) or conversation.
Often such destinations are unnamed or have an auto-generated name.

## Conventions

Given these definitions, the remainder of this section describes the semantic conventions that shall be followed for Spans describing interactions with messaging systems.

**Span name:** The span name should usually be set to the message destination name.
The conversation ID should be used instead when it is expected to have lower cardinality.
In particular, the conversation ID must be used if the message destination is unnamed or temporary unless multiple conversations can be combined to a logical destination of lower cardinality.
### Span name

The span name SHOULD be set to the message destination name and the operation being performed in the following format:
jmacd marked this conversation as resolved.
Show resolved Hide resolved

```
<destination name> <operation name>
```

The destination name SHOULD only be used for the span name if it is known to be of low cardinality (cf. [general span name guidelines](../api.md#span)).
This can be assumed if it is statically derived from application code or configuration.
If the destination name is dynamic, such as a [conversation ID](#conversations) or a value obtained from a `Reply-To` header, it SHOULD NOT be used for the span name.
In these cases, an artificial destination name that best expresses the destination, or a generic, static fallback like `"(temporary)"` for [temporary destinations](#temporary-destinations) SHOULD be used instead.

The values allowed for `<operation name>` are defined in the section [Operation names](#operation-names) below.
If the format above is used, the operation name MUST match the `messaging.operation` attribute defined for message consumer spans below.

Examples:

* `shop.orders send`
* `shop.orders receive`
* `shop.orders process`
* `print_jobs send`
* `topic with spaces process`
* `AuthenticationRequest-Conversations process`
* `(temporary) send` (`(temporary)` being a stable identifier for randomly generated, temporary destination names)

**Span kind:** A producer of a message should set the span kind to `PRODUCER` unless it synchronously waits for a response: then it should use `CLIENT`.
### Span kind

A producer of a message should set the span kind to `PRODUCER` unless it synchronously waits for a response: then it should use `CLIENT`.
The processor of the message should set the kind to `CONSUMER`, unless it always sends back a reply that is directed to the producer of the message
(as opposed to e.g., a queue on which the producer happens to listen): then it should use `SERVER`.

### Operation names

The following operations related to messages are defined for these semantic conventions:

| Operation name | Description |
| -------------- | ----------- |
| `send` | A message is sent to a destination by a message producer/client. |
| `receive` | A message is received from a destination by a message consumer/server. |
| `process` | A message that was previously received from a destination is processed by a message consumer/server. |
arminru marked this conversation as resolved.
Show resolved Hide resolved

## Messaging attributes

| Attribute name | Notes and examples | Required? |
| -------------- | ---------------------------------------------------------------------- | --------- |
| `messaging.system` | A string identifying the messaging system such as `kafka`, `rabbitmq` or `activemq`. | Yes |
| `messaging.destination` | The message destination name, e.g. `MyQueue` or `MyTopic`. This might be equal to the span name but is required nevertheless. | Yes |
| `messaging.destination_kind` | The kind of message destination: Either `queue` or `topic`. | Yes, if either of them applies. |
| `messaging.temp_destination` | A boolean that is `true` if the message destination is temporary. | If temporary (assumed to be `false` if missing). |
| `messaging.temp_destination` | A boolean that is `true` if the message destination is [temporary](#temporary-destinations). | If temporary (assumed to be `false` if missing). |
| `messaging.protocol` | The name of the transport protocol such as `AMQP` or `MQTT`. | No |
| `messaging.protocol_version` | The version of the transport protocol such as `0.9.1`. | No |
| `messaging.url` | Connection string such as `tibjmsnaming://localhost:7222` or `https://queue.amazonaws.com/80398EXAMPLE/MyQueue`. | No |
| `messaging.message_id` | A value used by the messaging system as an identifier for the message, represented as a string. | No |
| `messaging.conversation_id` | A value identifying the conversation to which the message belongs, represented as a string. Sometimes called "Correlation ID". | No |
| `messaging.conversation_id` | The [conversation ID](#conversations) identifying the conversation to which the message belongs, represented as a string. Sometimes called "Correlation ID". | No |
| `messaging.message_payload_size_bytes` | The (uncompressed) size of the message payload in bytes. Also use this attribute if it is unknown whether the compressed or uncompressed payload size is reported. | No |
| `messaging.message_payload_compressed_size_bytes` | The compressed size of the message payload in bytes. | No |

Expand All @@ -77,10 +134,12 @@ For message consumers, the following additional attributes may be set:

| Attribute name | Notes and examples | Required? |
| -------------- | ---------------------------------------------------------------------- | --------- |
| `messaging.operation` | A string identifying which part and kind of message consumption this span describes: either `receive` or `process`. (If the operation is `send`, this attribute must not be set: the operation can be inferred from the span kind in that case.) | No |
| `messaging.operation` | A string identifying the kind of message consumption as defined in the [Operation names](#operation-names) section above. Only `"receive"` and `"process"` are used for this attribute. If the operation is `"send"`, this attribute MUST NOT be set, since the operation can be inferred from the span kind in that case. | No |

The _receive_ span is be used to track the time used for receiving the message(s), whereas the _process_ span(s) track the time for processing the message(s).
Note that one or multiple Spans with `messaging.operation` = `process` may often be the children of a Span with `messaging.operation` = `receive`.
The distinction between receiving and processing of messages is not always of particular interest or sometimes hidden away in a framework (see the [Message consumption](#message-consumption) section above) and therefore the attribute can be left out.
arminru marked this conversation as resolved.
Show resolved Hide resolved
For batch receiving and processing (see the [Batch receiving](#batch-receiving) and [Batch processing](#batch-processing) examples below) in particular, the attribute SHOULD be set.
Even though in that case one might think that the processing span's kind should be `INTERNAL`, that kind MUST NOT be used.
Instead span kind should be set to either `CONSUMER` or `SERVER` according to the rules defined above.

Expand Down Expand Up @@ -108,7 +167,7 @@ Process CB: | Span CB1 |

| Field or Attribute | Span Prod1 | Span CA1 | Span CB1 |
|-|-|-|-|
| Name | `"T"` | `"T"` | `"T"` |
| Span name | `"T send"` | `"T process"` | `"T process"` |
| Parent | | Span Prod1 | Span Prod1 |
| Links | | | |
| SpanKind | `PRODUCER` | `CONSUMER` | `CONSUMER` |
Expand Down Expand Up @@ -137,7 +196,7 @@ Process C: | Span Recv1 |

| Field or Attribute | Span Prod1 | Span Prod2 | Span Recv1 | Span Proc1 | Span Proc2 |
|-|-|-|-|-|-|
| Name | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| Span name | `"Q send"` | `"Q send"` | `"Q receive"` | `"Q process"` | `"Q process"` |
| Parent | | | | Span Recv1 | Span Recv1 |
| Links | | | | Span Prod1 | Span Prod2 |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
Expand Down Expand Up @@ -170,7 +229,7 @@ Process C: | Span Recv1 | Span Recv2 |

| Field or Attribute | Span Prod1 | Span Prod2 | Span Recv1 | Span Recv2 | Span Proc1 |
|-|-|-|-|-|-|
| Name | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| Span name | `"Q send"` | `"Q send"` | `"Q receive"` | `"Q receive"` | `"Q process"` |
| Parent | | | Span Prod1 | Span Prod2 | |
| Links | | | | | Span Prod1 + Prod2 |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
Expand Down