Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP span name: Recommend handler over route name #548

Closed
wants to merge 2 commits into from

Conversation

Oberon00
Copy link
Member

@Oberon00 Oberon00 commented Apr 6, 2020

Recommend handler name over route name, improve formatting (split into paragraphs).

Depends-on: #557

Copy link
Member

@arminru arminru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable!

@carlosalberto
Copy link
Contributor

Overall LGTM

Thus, the term "endpoint" should be avoided.

Because multiple routes can map to the same handler, but not vice versa,
the handler name is recommended as the low cardinality span name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this. When an Ops person is looking at a trace, seeing the route (service API related concept) is likely going to be more informative than the handler method name (an implementation detail).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, if sampling is done based on the span name, then this new name is an improvement. I think that is more a problem of the span name concept which seems to play a double-role as a human-readable name and a grouping key.

Still, a reasonable argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, the route is also in a semantic attribute http.route, and it should be possible for a backend to replace the name with that (in the UI). This could even be configured differently for ops teams and dev teams.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same argument can be made against this to have the handler name as convention.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I closed this PR and opened #557 to track the span name double bind issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu The handler name will always have a lower or the same cardinality as the route, so it is strictly better as grouping key if it's known.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro In #730 (comment) you argue against the display.name convention by saying that backends will compute a better display name anyway based on span attributes. Here you argue against using a particular span name because it is bad as display name. It seems to me like these comments contradict each other. Did you change your opinion or am I misunderstanding you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't say the backends necessarily would do that, but they might do it if they wanted to. My argument against display.name was that in most cases it is redundant, especially in the examples provided in #730. But there may be cases when it is useful. However, it has another big issue when it comes to displaying aggregations, which I just posted in #730 (comment).

I think my position remains the same - span name is both a grouping key and the display name. The additional display.name attribute is possible, but not well thought-through yet, it raises more questions for me now than it answers.

Recommend handler name over route name, improve formatting (split into paragraphs).
@Oberon00
Copy link
Member Author

Oberon00 commented Apr 9, 2020

Closing this, as @yurishkuro made a good argument as to why the current suggestion of the route name is at least not worse than the handler name and there seems to be no traction in another direction.

@Oberon00
Copy link
Member Author

The current discussion in #557 seems to lean towards low cardinality being more important and we should provide a new semantic convention for the display name if we need one. Thus I'm reopening this for discussion.

@Oberon00 Oberon00 reopened this May 26, 2020
@bogdandrutu
Copy link
Member

@Oberon00 should this PR be removed?

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 3, 2020

If that's what you mean, the re-open was intentional, see the comment above. I would rather merge this PR than remove it 😃

@yurishkuro
Copy link
Member

yurishkuro commented Jun 4, 2020

I'm still -1 on handler function name. Imagine you have two versions of REST APIs, /api/v1/xxx and /api/v2/xxx. Some routes in these APIs may not have changed between versions and may be implemented by the same handler - an information I consider very unimportant when looking at the trace, where the fact that /v1/doThis or /v2/doThis was called is much more interesting to the user of the trace (who may have never worked on this service). Using routes means we're referring to public API endpoints. Handler names are implementation detail.

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 4, 2020

@yurishkuro With #557 we would use the span name only as a default grouping. The route is still available as http.route. And we could add a display_name thing later. As #557 says, using the span name both as a display name and a grouping key is bad, so this PR ignores the usability of the span name as a display name and improves its value as a grouping key.

This is what the spec currently already says about the span name:

The span name should be the most general string that identifies a (statistically) interesting class of Spans, rather than individual Span instances.

(emphasis added by me).

@carlosalberto carlosalberto added area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory labels Jun 12, 2020
@bogdandrutu
Copy link
Member

Assigned @yurishkuro to make a decision on this PR, move forward or close.

@Oberon00
Copy link
Member Author

This PR needs a decision on #557 first.
Only if we decide the span name is (mostly) a "grouping key" as opposed to a display name (which is the direction the discussion on #557 has been going lately), would it make sense to merge this PR.

@iNikem
Copy link
Contributor

iNikem commented Aug 14, 2020

Don't we blur the line between data collection and backend too much?

@Oberon00
Copy link
Member Author

How so?

@iNikem
Copy link
Contributor

iNikem commented Aug 14, 2020

Seems like we try to dictate what backend should do. Yes, we define semantics of the data we collect, but actually saying "this should be used for displaying and this for grouping" is going much farther, no?

@Oberon00
Copy link
Member Author

I see it that way: If we want backends to be able to use this as grouping key, we have to generate something that is suitable as grouping key. For example a full URL would be fine as display name but usually not so much as grouping key. Of course, we won't stop backends from using the grouping key also or only as display name (for which it is usually still usable, though maybe not optimal).

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 22, 2020
@Oberon00
Copy link
Member Author

Oberon00 commented Aug 22, 2020

Since #557 is now resolved via #810 to specify that human-readability is less important than being the "most general" string, I think this PR should now be merged as a direct consequence. Also, it has two approvals, no vetoes, and no changes since April.

@carlosalberto
Copy link
Contributor

Ping @yurishkuro

@yurishkuro
Copy link
Member

For the record, my disagreement in #548 (comment) had 3 upvotes. The discussion in #557 about grouping key vs. display name does not affect my argument, because the HTTP route is ultimately the API exposed by the server, it's the notion that users use to think about the service, express metrics and SLAs, reason about inter-service dependencies, etc. Those are not purely display concerns. The handler name is an implementation detail that is almost irrelevant to all those concerns.

@github-actions github-actions bot removed the Stale label Aug 23, 2020
@Oberon00
Copy link
Member Author

Oberon00 commented Aug 24, 2020

@yurishkuro

The discussion in #557 about grouping key vs. display name does not affect my argument

Is the handler name (if available!) a more general name than the route? Yes, because you can map more than one route to the same handler. Is it specific enough to still be meaningful? I do think so. Is the handler name human readable? Yes (I agree that most will find the route more useful to read but as per #810, the generality is more important).

The purpose of #810 was precisely to have a guideline, criteria with which to avoid such a discussion.

the HTTP route is ultimately the API exposed by the server, it's the notion that users use to think about the service, express metrics and SLAs, reason about inter-service dependencies, etc. Those are not purely display concerns. The handler name is an implementation detail that is almost irrelevant to all those concerns.

Mostly agree on your points here, but we already have the http.route attribute for that. The span name has only one purpose: Group the largest meaningful set of spans together. If you need to track an SLA on a route, you can surely customize your sampler and backend to sample/group per http.route instead of the name. But usually if multiple routes map to the same handler, it does make sense to group them together, at least initially.

EDIT: I'm not sure about the inter-service dependencies. Maybe different calling services will use different inbound routes to access the service, but for the same handler we should always have the same outbound calls. So I think from an inter-service dependency view, grouping by handler does actually simplify the "graph" without real loss of fidelity.

@yurishkuro
Copy link
Member

But usually if multiple routes map to the same handler, it does make sense to group them together, at least initially.

It may be sensible in some scenarios, but it is not unequivocal. I don't think the spec should be insisting on it. Some HTTP frameworks may not even have the notion of a handler name, it's fairly common to use anonymous lambda functions as handlers.

It is clear that we do not have consensus on this topic, so I suggest rewording the PR that using a handler name is an acceptable alternative, but stop short of saying that it is preferred over HTTP route.

@Oberon00 Oberon00 requested review from a team August 25, 2020 07:16
@Oberon00
Copy link
Member Author

We already have "MAY provide hooks to allow custom logic to override the default span name" here. We should specify a clear order of preference for the default.
We also already say "HTTP spans MUST follow the overall guidelines for span names.", so even if I don't spell the preference out here, it would follow from the general guidelines saying "The span name SHOULD be the most general string that identifies a (statistically) interesting class of Spans".

@github-actions
Copy link

github-actions bot commented Sep 2, 2020

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 2, 2020
@github-actions
Copy link

github-actions bot commented Sep 9, 2020

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Sep 9, 2020
@Oberon00
Copy link
Member Author

Note that I still think this is a good idea and I will re-open this PR after GA.

@Oberon00 Oberon00 removed the release:required-for-ga Must be resolved before GA release, or nice to have before GA label Nov 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants