Better structure for span identification #531

pauldraper · 2020-03-28T16:57:00Z

Consider these canonical examples:

Span Name	Guidance
get_account	Good, and account_id=42 would make a nice Span attribute
get_account/{accountId}	Also good (using the "HTTP route")

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-tracing.md#span

These are, quite frankly, terrible identifications for a span. The headline information doesn't give me a clue whether I'm looking at an HTTP request/response, an RPC call, a database procedure/query, a cloud function, a cache lookup, an internal computation, etc.

The trace itself isn't the best particular example, but consider Datadog's tracing interface:

There is both:

- A span "type" that is instrumentation-determined (Datadog vocab: "name" or "operation").
  - http.request
  - mongodb.query
  - lambda.invocation
  - grpc.call
  - java.function
- A span "name" that is application-determined (Datadog vocab: "resource").
  - get_account
  - /users/{id}
  - SELECT * FROM pokedex
  - com.example.Thing.run.

Whether this is done as syntax in the span name (TYPE:NAME) or as an attribute (type: TYPE, component: TYPE), there should be some standard method of assigning a classification.

Otherwise, I wind up with spans auto-named "get_account," all of wildly different flavors (HTTP, RPC, message queue, DB), and I'm left trying to tell them apart. Naturally, that is possible with enough inspection of the attributes, but there are a lot of attributes to look through (a high-level view of a trace usually doesn't show them because there are so many).
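To make the collision concrete, here is a minimal Python sketch. The span records and the `type` key are hypothetical illustrations and are not fields defined by the OpenTelemetry data model.

```python
from collections import Counter

# Hypothetical span records: the "type" key is illustrative only and is
# not a field defined by the OpenTelemetry specification.
spans = [
    {"type": "http.request", "name": "get_account"},
    {"type": "grpc.call", "name": "get_account"},
    {"type": "mongodb.query", "name": "get_account"},
    {"type": "http.request", "name": "/users/{id}"},
]

# Grouping by name alone merges three unrelated operations into one bucket.
by_name = Counter(span["name"] for span in spans)

# Grouping by (type, name) keeps them apart.
by_type_and_name = Counter((span["type"], span["name"]) for span in spans)

print(by_name["get_account"])                             # 3
print(by_type_and_name[("http.request", "get_account")])  # 1
```

Any report or list keyed only on the span name would show the first number, which is the ambiguity described above.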

(Note that I am not talking about the tracer name, which refers to the instrumentation. I am talking about either the instrumented technology or the type of operation.)

I believe this overlaps with #271, though there is little recorded discussion, so I'm not entirely sure what happened.


arminru · 2020-03-30T14:33:02Z

Backends should already be able to deduce the type of action using the attributes added to spans according to the semantic conventions defined in this spec. The easiest way would be an if-else cascade checking for the presence of mandatory attributes in a certain order (db.type, messaging.system, http.method, rpc.service, ...). Therefore, the problem you mentioned is one of the backend rather than of the data model itself.
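A minimal sketch of such a cascade in Python, assuming span attributes arrive as a plain mapping. The function name and the short type labels ("db", "messaging", and so on) are hypothetical; only the attribute keys and their order come from the list above.

```python
from typing import Mapping, Optional

# Marker attributes in the order listed above. The labels on the right are
# hypothetical names chosen for this sketch, not values defined by the spec.
MARKER_ATTRIBUTES = (
    ("db.type", "db"),
    ("messaging.system", "messaging"),
    ("http.method", "http"),
    ("rpc.service", "rpc"),
)


def infer_span_type(attributes: Mapping[str, object]) -> Optional[str]:
    """Return the label of the first marker attribute present, else None."""
    for marker, label in MARKER_ATTRIBUTES:
        if marker in attributes:
            return label
    return None


print(infer_span_type({"http.method": "GET", "http.route": "/users/{id}"}))  # http
print(infer_span_type({"account_id": 42}))  # None
```

The order matters: a span carrying both `db.type` and `http.method` is reported as "db" here, because the first match wins.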

PS: Does the color coding in your screenshot correspond to the types ("names") that you list below or how can you tell which one it is?

dyladan · 2020-03-30T14:39:21Z

@arminru dd trace color scheme is either by service or host depending on settings.

pauldraper · 2020-05-07T03:34:02Z

> Backends should already be able to deduce the type of action using the attributes added to spans according to the semantic conventions defined in this spec. The easiest way would be an if-else cascade checking for the presence of mandatory attributes in a certain order

And exactly what backend would you recommend for this?

I really don't get the hesitation to give spans a proper discriminated type.

arminru · 2020-05-07T08:55:06Z

@pauldraper

> And exactly what backend would you recommend for this?

Well I work on a backend where we do it this way, so my recommendation would most probably be strongly biased at least. 😄

> I really don't get the hesitation to give spans a proper discriminated type.

The type you'd like to have added was actually there in the past: it was named component but was removed in #271. Unfortunately, the rationale for the decision was not properly documented in the issue or in the meeting notes. Based on the description provided by @yurishkuro, who opened the issue, I'd say it was removed for the motivation he stated: component is redundant, since the type/kind of a span can be inferred by looking at (required) span attributes, as I also mentioned above (#531 (comment)).

Apart from that, at the time the issue was opened, component was not well-specified. For database spans, for example, component was not defined as a fixed string "db" or "database" but rather an unbounded, free text value as initially criticized in #245 (title was reworded after component was removed):
> component: Database driver name or database name (when known) "JDBI", "jdbc", "odbc", "postgreSQL".
This definition would not have been of any help for the purposes you described but could've been fixed as well, of course, rather than removing component entirely.

pauldraper · 2020-05-07T15:54:05Z

> the fact that component is redundant

Not really. Without it, you need to add information (an algorithm for deducing type) that you wouldn't otherwise.

> This definition would not have been of any help for the purposes you described but could've been fixed

It certainly would help. I don't necessarily need it to be standardized. I just need unique operation names.

The canonical examples of good span names are get_account and get_account/{accountId}.

I have no earthly idea which of the various flavors of "get_account" I have in my stack: database, HTTP, in-process function, cloud function, AMP message? I don't necessarily need a perfectly uniform component classification scheme, but I do need to tell the HTTP request get_account apart from the database query get_account when they show up in a report, list, etc. And tacking on 20 attributes of every possible kind to achieve that uniqueness is unwieldy.

Now, perhaps the specification just has really, really bad examples of span names. Maybe the good span names would be HTTP:get_account, JDBI:get_account, etc. I don't care whether it's an attribute or span name prefix; I just want to tell my operations apart, and currently the spec seems to do a very bad job of that.
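As a rough illustration of the prefix option, the sketch below joins and splits TYPE:NAME span names in Python. Both helper names are hypothetical, and the colon separator is only the syntax floated in this thread, not anything the specification defines.

```python
from typing import Tuple


def qualified_span_name(span_type: str, name: str) -> str:
    """Join a type and an application-level name as TYPE:NAME."""
    return f"{span_type}:{name}"


def split_span_name(qualified: str) -> Tuple[str, str]:
    """Split TYPE:NAME at the first colon.

    Names without a colon are returned with an empty type.
    """
    span_type, separator, name = qualified.partition(":")
    if not separator:
        return "", qualified
    return span_type, name


print(qualified_span_name("HTTP", "get_account"))  # HTTP:get_account
print(split_span_name("JDBI:get_account"))         # ('JDBI', 'get_account')
print(split_span_name("get_account"))              # ('', 'get_account')
```

One limitation of this sketch: an unprefixed name that itself contains a colon would be mis-split, which is an argument for carrying the type in an attribute instead.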

Programming a backend in order to do that basic thing...that seems unnecessarily complex and poorly supported.

Oberon00 · 2020-05-07T18:59:51Z

Since you complain about the span name: I think it currently has an unclear purpose, see related issue #557.

tedsuo · 2020-07-23T16:51:32Z

Hi @pauldraper, I've taken a shot at resolving some of the issues raised in this thread and others here (#730), by adding display hints. Please take a look.

tigrannajaryan · 2020-09-08T18:35:43Z

I suggest removing the release:required-for-ga label.

The "component" approach was already discussed and rejected in the past. The type of the Span can be deduced by the presence of required attributes. It may not be convenient but it is possible. It is also more powerful, since it allows recording multiple types simultaneously, while a single "type" or "component" does not (what is the type of a Span representing an HTTP call to a database? Is it "http" or "db"?).
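The multiple-types point can be sketched in Python as follows. The marker keys are attributes named elsewhere in this thread; the function name and the short labels are hypothetical choices for the example.

```python
from typing import FrozenSet, Mapping

# Marker attribute -> hypothetical type label (labels are illustrative only).
MARKERS = {
    "db.system": "db",
    "http.method": "http",
    "messaging.system": "messaging",
    "rpc.service": "rpc",
}


def span_types(attributes: Mapping[str, object]) -> FrozenSet[str]:
    """Return every type whose marker attribute is present on the span."""
    return frozenset(
        label for marker, label in MARKERS.items() if marker in attributes
    )


# An HTTP call to a database carries both markers, so both types are reported.
http_db_span = {"http.method": "POST", "db.system": "elasticsearch"}
print(sorted(span_types(http_db_span)))  # ['db', 'http']
```

A single "type" field would force a choice between the two labels, whereas the set keeps both.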

It is likely too late for 1.0 to introduce a new way of specifying the Span type that is better than what we have. There are likely better ways, but I don't think we have time to introduce, discuss and agree on an approach quickly enough to make it part of the 1.0 release.

carlosalberto · 2020-09-09T14:33:10Z

+1 on making this release:after-ga.

andrewhsu · 2020-09-09T15:17:45Z

From the issue triage mtg today: I'm changing the label to release:after-ga since, judging from the comments, this can be punted.

pauldraper · 2021-05-04T10:47:48Z

1. Does anyone use Datadog? Or am I the only user of the largest commercial monitoring platform?

Because I don't see how OTel is going to work with Datadog unless it can intelligently produce an operation and resource ("type" and "name").

2. Does anyone think these trace names are actually good?

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#span

Like what the heck are they....a file, a gRPC operation, an HTTP request, a DB query, something else?

Not at all obvious.

Oberon00 · 2021-05-04T11:14:43Z

> Like what the heck are they....a file, a gRPC operation, an HTTP request, a DB query, something else?
>
> Not at all obvious.

The details are specified in the semantic conventions: https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions

Oberon00 · 2021-05-04T11:18:05Z

> Because I don't see how OTel is going to work with Datadog unless it can intelligently produce an operation and resource ("type" and "name").

All semantic conventions should have a "marker attribute" or at least a set thereof. E.g. a database operation can be identified by having a db.system span attribute, an HTTP span always has http.method, etc. (but see #653)
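As an illustration only, an exporter could use those marker attributes to derive an operation/resource pair along these lines. The function name, the operation strings, and the "unknown" fallback are assumptions for this sketch; this is not how any actual Datadog exporter is implemented.

```python
from typing import Mapping, Tuple


def operation_and_resource(
    span_name: str, attributes: Mapping[str, object]
) -> Tuple[str, str]:
    """Derive a hypothetical (operation, resource) pair from marker attributes.

    The span name is passed through unchanged as the resource.
    """
    if "db.system" in attributes:
        # For example db.system="mongodb" gives the operation "mongodb.query".
        return f"{attributes['db.system']}.query", span_name
    if "http.method" in attributes:
        return "http.request", span_name
    # No known marker attribute: the operation cannot be inferred.
    return "unknown", span_name


print(operation_and_resource("get_account", {"db.system": "mongodb"}))
# ('mongodb.query', 'get_account')
print(operation_and_resource("/users/{id}", {"http.method": "GET"}))
# ('http.request', '/users/{id}')
```

Spans without any recognized marker fall through to "unknown", which is the case the original question is worried about.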

Sturgelose · 2022-05-18T17:45:02Z

Providing some feedback as a current Datadog user (the platform my company uses to manage traces) and Jaeger user (testing locally), trying to find a way in my company to standardize not only spans that have defined use cases (HTTP/gRPC request, DB, SQS, Lambda, etc.) but also private custom conventions inside my company.

I agree and understand that db.system or http.method can be used to identify a span's "type of action", "component", "type", or whatever we want to call it. However, as Paul comments, just checking whether this tag exists is not enough.

We might add new tags to the spec or deprecate some. Technology evolves, and we will surely need to add or remove metadata in spans. However, it is not feasible to operate on them if we do not know which spec we are targeting.

This is where it makes sense to have a component, type, or any kind of identifier giving not only the type of the span but also the version of the schema we are mapping it to! This will simplify parsers, make it easier for users to identify which kind of data they have available, and make it easier to upgrade queries to support new standardized semantic conventions or potential additions in the future.

Internally in my company I'm working to define span schemas for different similar types of spans that map to business logic. These are totally independent from the semantic conventions defined in OTel, but we still have similar challenges. We are trying to adopt and implement the different tags whenever possible or relevant for that business-logic span. However, we are iterating on them and versioning them, so we end up with payments-v2 or identity-v4 (as an example), and we know the expected structure and tags that we will have in each span.
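A minimal sketch of how such versioned identifiers could be consumed, in Python. The identifier shape `<family>-v<number>` is taken from the two examples above; the registry contents and tag names are invented for illustration.

```python
import re
from typing import Iterable, Set, Tuple

_VERSIONED = re.compile(r"^(?P<family>[a-z][a-z0-9_]*)-v(?P<version>\d+)$")

# Hypothetical registry: these tag names are made up for this example.
EXPECTED_TAGS = {
    ("payments", 2): {"payment.id", "payment.currency"},
    ("identity", 4): {"identity.user_id"},
}


def parse_span_schema(identifier: str) -> Tuple[str, int]:
    """Split an identifier such as 'payments-v2' into ('payments', 2)."""
    match = _VERSIONED.match(identifier)
    if match is None:
        raise ValueError(f"not a versioned span schema: {identifier!r}")
    return match.group("family"), int(match.group("version"))


def missing_tags(identifier: str, tags: Iterable[str]) -> Set[str]:
    """Return the expected tags that are absent from a span."""
    return EXPECTED_TAGS[parse_span_schema(identifier)] - set(tags)


print(parse_span_schema("identity-v4"))               # ('identity', 4)
print(missing_tags("payments-v2", ["payment.id"]))    # {'payment.currency'}
```

With the version in the identifier, a processor can look up the expected structure directly instead of guessing it from whichever tags happen to be present.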

Otherwise, there is no magic way to understand how spans will change in the future, and of course it makes it really hard for processors to identify them or understand which kind of span we are looking at. The only option, as Paul says, is to write a crazy algorithm that tries to identify the type of span, and that will surely have issues whenever the schema changes.

Maybe it is not a blocker for a GA release, but it is definitely a must to define how OTel will version changes in the semantic conventions (which maybe should be, or already are, defined as schemas?) to make sure that after 1-2 years of convention changes we know which kind of http/db/queue, etc. metadata we are talking about.

tigrannajaryan · 2022-05-18T18:07:14Z

> This is where it makes sense to have a component, type, or any kind of identifier giving not only the type of the span but also the version of the schema we are mapping it to! This will simplify parsers, make it easier for users to identify which kind of data they have available, and make it easier to upgrade queries to support new standardized semantic conventions or potential additions in the future.

@Sturgelose It is already possible to include the version of the spec the span conforms to. SchemaURL can be included in the emitted telemetry. See schemas: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/schemas/overview.md

Schemas are also how the evolution of the conventions is supposed to be handled. See https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#semantic-conventions-stability
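For example, a consumer could read the convention version from a schema URL roughly like this. The sketch assumes the URL ends in a three-part version segment, as in `https://opentelemetry.io/schemas/1.9.0`; the helper name is hypothetical.

```python
from typing import Tuple
from urllib.parse import urlparse


def schema_version(schema_url: str) -> Tuple[int, int, int]:
    """Parse the trailing MAJOR.MINOR.PATCH segment of a schema URL."""
    path = urlparse(schema_url).path
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    major, minor, patch = (int(part) for part in last_segment.split("."))
    return major, minor, patch


old = schema_version("https://opentelemetry.io/schemas/1.9.0")
new = schema_version("https://opentelemetry.io/schemas/1.10.0")
print(old)        # (1, 9, 0)
print(old < new)  # True
```

Comparing the parsed tuples rather than the raw strings avoids ordering "1.10.0" before "1.9.0".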

Does this address your concerns?

bogdandrutu added the spec:trace Related to the specification/trace directory label Jun 12, 2020

carlosalberto added the release:required-for-ga Must be resolved before GA release, or nice to have before GA label Jul 2, 2020

pauldraper mentioned this issue Jul 16, 2020

feat: [do not merge] OpenTelemetry Datadog Exporter open-telemetry/opentelemetry-js#1316

Closed

andrewhsu added the priority:p1 Highest priority level label Jul 17, 2020

andrewhsu assigned tedsuo Jul 21, 2020

tedsuo mentioned this issue Jul 23, 2020

And semantic conventions for Display Hints #730

Closed

pauldraper mentioned this issue Aug 6, 2020

Making tracing SDK metrics aware #381

Open

bogdandrutu added priority:p3 Lowest priority level and removed priority:p1 Highest priority level labels Aug 10, 2020

andrewhsu added release:after-ga Not required before GA release, and not going to work on before GA and removed priority:p3 Lowest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Sep 9, 2020
