Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide guidelines for low-cardinality span names #416

Merged
merged 5 commits into from
Jan 24, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 21 additions & 3 deletions specification/api-tracing.md
Original file line number Diff line number Diff line change
Expand Up @@ -200,7 +200,7 @@ sub-operations.

`Span`s encapsulate:

- The operation name
- The span name
- An immutable [`SpanContext`](#spancontext) that uniquely identifies the
`Span`
- A parent span in the form of a [`Span`](#span), [`SpanContext`](#spancontext),
Expand All @@ -212,6 +212,24 @@ sub-operations.
- A list of timestamped [`Event`s](#add-events)
- A [`Status`](#set-status).

The _span name_ is a human-readable string which concisely identifies the work
represented by the Span, for example, an RPC method name, a function name,
or the name of a subtask or stage within a larger computation. The span name
should be the most general string that identifies a (statistically) interesting
_class of Spans_, rather than individual Span instances. That is, "get_user" is
a reasonable name, while "get_user/314159", where "314159" is a user ID, is not
a good name due to its high cardinality.

For example, here are potential span names for an endpoint that gets a
hypothetical account information:

| Span Name | Guidance |
| ----------------- | ------------ |
| `get` | Too general |
| `get_account/42` | Too specific |
| `get_account` | Good, and account_id=42 would make a nice Span attribute |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you do in the case that you have the url, but don't know which part of it is an id or dynamic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then you can't use it. It's covered in the 2nd file.

| `get_account/{accountId}` | Also good (using the "HTTP route") |

The `Span`'s start and end timestamps reflect the elapsed real time of the
operation. A `Span`'s start time SHOULD be set to the current time on [span
creation](#span-creation). After the `Span` is created, it SHOULD be possible to
Expand All @@ -238,7 +256,7 @@ as a separate operation.

The API MUST accept the following parameters:

- The operation name. This is a required parameter.
- The span name. This is a required parameter.
- The parent Span or parent Span context, and whether the new `Span` should be a
root `Span`. API MAY also have an option for implicit parent context
extraction from the current context as a default behavior.
Expand Down Expand Up @@ -428,7 +446,7 @@ started with the explicit timestamp from the past at the moment where the final

Required parameters:

- The new **operation name**, which supersedes whatever was passed in when the
- The new **span name**, which supersedes whatever was passed in when the
`Span` was started

#### End
Expand Down
17 changes: 13 additions & 4 deletions specification/data-http.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,10 +21,19 @@ and various HTTP versions like 1.1, 2 and SPDY.

## Name

Given an [RFC 3986](https://tools.ietf.org/html/rfc3986) compliant URI of the form `scheme:[//host[:port]]path[?query][#fragment]`,
the span name of the span SHOULD be set to the URI path value,
unless another value that represents the identity of the request and has a lower cardinality can be identified
(e.g. the route for server spans; see below).
HTTP spans MUST follow the overall [guidelines for span names](./api-tracing.md#span).
Many REST APIs encode parameters into URI path, e.g. `/api/users/123` where `123`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make a more specific recommendation. E.g. using one of the formats "HTTP {METHOD-NAME}" or "{METHOD-NAME} {HOST-NAME}" sounds reasonable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method is already suggested in the following sentences, I can change to use the exact pattern.

Host name is not acceptable. If clients are directly integrated with service discovery, then host name may be very high cardinality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but I would make it more clear that this not merely a suggestion but actually what should be used (in absence of a better option like endpoint name, route, ...). And I would definitely prescribe an exact (fallback) format, because that would allow grouping HTTP requests from different languages together on the back end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added exact recommendation in L34, but I don't know how strongly prescriptive we want to be.

is a user id, which creates high cardinality value space not suitable for span
names. In case of HTTP servers, these endpoints are often mapped by the server
frameworks to more concise _HTTP routes_, e.g. `/api/users/{user_id}`, which are
recommended as the low cardinality span names. However, the same approach usually
does not work for HTTP client spans, especially when instrumentation is provided
by a lower-level middleware that is not aware of the specifics of how the URIs
are formed. Therefore, HTTP client spans SHOULD be using conservative, low
cardinality names formed from the available parameters of an HTTP request,
such as `"HTTP {METHOD_NAME}"`. Instrumentation MUST NOT default to using URI
path as span name, but MAY provide hooks to allow custom logic to override the
default span name.

## Status

Expand Down