Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spandata #78

Merged
merged 10 commits into from
Jun 10, 2019
Merged

Spandata #78

merged 10 commits into from
Jun 10, 2019

Conversation

SergeyKanzhelev
Copy link
Member

#35

@SergeyKanzhelev SergeyKanzhelev requested a review from reyang as a code owner June 7, 2019 16:10
specification/tracing-api.md Outdated Show resolved Hide resolved
@jmacd
Copy link
Contributor

jmacd commented Jun 7, 2019

@SergeyKanzhelev Please comment on #71

I have not seen any justification for exposing SpanData in the API. It's an SDK concept as far as I can tell.

@SergeyKanzhelev
Copy link
Member Author

@jmacd I marked the issue with the "revision" milestone. There is a scenarios of out-of-band reporting of spans that can be solved in two ways - with SpanData and with Span. For this proposal we are trying to document we are going with the SpanData approach.

It doesn't mean we ignoring the feedback. We just want to have some process and transparency. We only wanted those vaguely-documented, mostly as a java code decision to be in the first stage of the project. And we have couple month before September to change those decisions easily. But transparently. After September when we will ship first version - changes would be harder to make.

@jmacd
Copy link
Contributor

jmacd commented Jun 7, 2019

@SergeyKanzhelev Wouldn't it be better to leave recordSpanData and SpanData out of the API specification in this milestone? We can discuss it and add it later if there's real agreement that it's needed.

Please comment on this issue: #71

@SergeyKanzhelev
Copy link
Member Author

@jmacd both OT and OC had a feature to record out-of-band spans. We decided to implement it as a SpanData API. If we will need to change the decisions - I'd like to make it transparently in open discussion.

@jmacd
Copy link
Contributor

jmacd commented Jun 8, 2019

OT supported out-of-band spans using the same API as for in-band spans, by allowing the user to specify timestamps. The current OpenTelemetry API proposal supports out-of-band spans by creating a second API with a new, unnecessary data structure. All it would take in OpenTelemetry are builder options to set timestamps and resources, then SpanData can be removed from the instrumentation API.

The OT Java shim has an unimplemented method currently,

https://github.com/open-telemetry/opentelemetry-java/blob/b0a49827aa3a503290d4c3b0bbed33e547672298/opentracing-shim/src/main/java/io/opentelemetry/opentracingshim/SpanBuilderShim.java#L176

and to address that unimplemented method, the entire SpanBuilderShim would have to be rewritten to use recordSpanData--which will be an efficiency hit.

As mentioned in the original issue #71 (comment), I am not actually concerned with out-of-band reporting, but with setting the correct timestamps in middleware that provide accurate start and end times, such as for gRPC. This decision means that gRPC middleware can't be instrumented using ordinary OpenTelemetry APIs, instead they will have to allocate several extra objects and arrays [which an unbuffered implementation of the APIs would not otherwise require] in order to call recordSpanData.

@SergeyKanzhelev
Copy link
Member Author

@jmacd I understand you proposing an alternative solution for the scenario we all want to support. We did discuss this alternative, but decided to go with the SpanData approach. Again, the goal of original proposed API to set a baseline. And this PR is just documenting on what we decided originally.

We are setting a process in place to propose and approve changes.

@jmacd
Copy link
Contributor

jmacd commented Jun 10, 2019

I was under the impression that we had demonstrated backwards-compatibility with OpenTracing in Java before we began documenting what we have. It will be nice to have a transparent process in place so that we can discuss.

@SergeyKanzhelev
Copy link
Member Author

CC: @carlosalberto to comment on OT bridge and SpanData relationships.

We are having transparent process =). First iteration took too long and it was limited to a set of people to define a baseline. Join our tomorrow specs meeting if you can

specification/tracing-api.md Show resolved Hide resolved
Returns the `Attributes` collection associated with this `SpanData`. The order
of attributes in collection is not significant. The typical use of attributes
collection is enumeration so the fast access to the label value by it's key is
not a requirement. This collection MUST be immutable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true that typical use is enumeration? This rules out use cases which need to augment / modify specific attributes efficiently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

augmentation is not supported as returned collection is immutable. What scenario do you see when you need an attribute value with the specific name from the SpanData?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I was thinking about a scenario where you may want to implement telemetry data forwarder, which may also support augmenting the data, but you are right. If we require SpanData to be immutable then this use case is simply not possible. And I agree that we should require SpanData to be immutable since it likely will allow implementations to be more efficient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there is a use case that does not require SpanData modification but requires fast inspection of data. For example a telemetry data forwarder that can also sample the data based on some attribute value.
However, I understand that this may not be the typical use case and we may not want to support it in the API that is aimed at instrumentation. Forwarders can have specialized implementations and do not need to use this API at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tigrannajaryan good point. I was also thinking something like metric extraction from the SpanData may require dictionary.

Let me file an issue. It is worth to be discussed and documented explicitly to address future questions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#92

specification/tracing-api.md Outdated Show resolved Hide resolved
@jmacd
Copy link
Contributor

jmacd commented Jun 10, 2019

@SergeyKanzhelev I would not like to pin this on @carlosalberto. Before I jumped into these discussions I completed a thorough review of what is there in the OpenCensus libraries. It should not be a surprise that OpenTracing libraries support setting timestamps on Events and Spans.

I understand that SpanData was introduced to accommodate out-of-band span reporting, but it's not a necessary abstraction once users have direct control over timestamps and resources in the span builder API. Moreover, certain OpenTracing use-cases (not related to out-of-band reporting) are not possible through the OTel SpanBuilder at present. Those cases will be forced to use SpanData -- but along with that they're forced to allocate additional memory.

So for the sake of being minimal and avoiding unnecessary allocations, so there's actually a 1:1 transformation from the OTel span builder to the OpenTracing span builder, we should eliminate SpanData.

@SergeyKanzhelev
Copy link
Member Author

@jmacd we will discuss and re-iterate on the process and expectations tomorrow. And will post notes. Our goal for initial API proposal was to come up with converged API and provide OT and OC bridges. Both was done. Now we are documenting decisions in a form of specs and moving into discussing and collecting feedback.

@jmacd you keep making arguments for your option. I keep replying that this is a good discussion that must happen. We have a forum for it. This PR is NOT this forum.

I'm sorry, but I'm going to ignore further comments in this PR regarding the necessity for SpanData. There is a separate issue for this discussion.

Copy link
Member

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, otherwise LGTM.


### GetTimedEvents

Return the collection of `Events` with the timestamps associated with this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Timestamps here are associated with TimedEvents instead of SpanData.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments related to SpanData being an interface, otherwise LGTM.

@@ -180,22 +180,38 @@ Order of attributes is not significant.

## SpanData

TODO: SpanData operations
https://github.com/open-telemetry/opentelemetry-specification/issues/35
`SpanData` MUST be an abstract class or interface. As `SpanData` is used for the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused, do you mean SpanData must be a final class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quite opposite:

SpanData MUST be an abstract class or interface. As SpanData is used for the...

`SpanData` will be supplied by consumer of telemetry (i.e. telemetry vendor).
Typical use case for alternative implementations is to implement lazy
calculation of properties and minimize the number of allocation required at
instrumentation point. So it will be done by instrumentation code. For instance,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have this in Java for the moment, do we need to support this usecase? Currently lazy initialization can be done at members level, for example Events can be lazy events by implementing the Event interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean we don't have it? SpanData can have an alternative implementation. And getters can be lazily calculated. I think that's the whole point of making SpanData abstract.

@SergeyKanzhelev SergeyKanzhelev merged commit c236833 into open-telemetry:master Jun 10, 2019
@SergeyKanzhelev SergeyKanzhelev deleted the spandata branch June 10, 2019 22:11
SergeyKanzhelev added a commit to SergeyKanzhelev/opentelemetry-specification that referenced this pull request Feb 18, 2020
* first iteration

* SpanData finilized

* remove github issue mention

* addressed feedback from Reiley

* or -> of
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
* first iteration

* SpanData finilized

* remove github issue mention

* addressed feedback from Reiley

* or -> of
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants