Add Performance and Blocking specification #130

Merged
merged 13 commits into from
Jul 31, 2019

@saiya saiya commented Jun 17, 2019

Performance and Blocking specification is specified in a separate document and
is linked from Language Library Design principles document.

I wrote this to document minimal principles as a starting point. If we need more detailed guidelines, they can be added in a separate PR, I think. Recommendations are also welcome :-)

Implements issue: #94

Related issues in OpenCensus: census-instrumentation/opencensus-java#1837, census-instrumentation/opencensus-specs#262

Contributor Author

saiya commented Jun 17, 2019

I signed it > CLA

- **Library should not block end-user application.**
- **Library should not consume infinite memory resource.**

Tracer should not degrade the end-user application. So that it should not block the end-user application nor consume too much memory resource.
Member

One suggestion, during the application exit/shutdown, it might make sense to block for a user-configurable grace period. This gives the library enough time to persist or flush the telemetry data.

Member

Same for Flush. If we decide we need a Flush method in the SDK, it must block for at least a configurable period.

Contributor Author

That makes complete sense. I added a "Shutdown and explicit flushing could block" section to describe shutdown/flush behavior: 53b0519#diff-44dc82a7e6286380ed89736215beda74R41

Here are the key principles:

- **Library should not block end-user application.**
- **Library should not consume infinite memory resource.**
Member

The definition of infinite is vague, do we want to make this user configurable? (don't have to be rocket science here though)

Member

Suggestion: "unbounded" may be clearer here?

Contributor

This could be configurable, but it sounds like this has more to do with the implementation than with the actual API.

Contributor Author

I replaced "unlimited" with "unbounded".

IMHO, it may not be easy to configure memory resources directly (it depends on the implementation), but resource usage should be controlled indirectly via the size of a queue or something similar. So I think @songy23's suggestion ("unbounded") is a good word for this.

Member

@reyang reyang left a comment

LGTM. I've left two minor suggestions.

Member

@songy23 songy23 left a comment

LGTM overall


If there is such trade-off in language library, it should provide the following options to end-user:

- **Prevent blocking**: Dropping some information under overwhelming load and show warning log to inform when information loss starts and when recovered
- Should be a default option
Member

This should also apply to metrics recording (as an example see census-instrumentation/opencensus-python#684).

(However we may not want to make it the default option for logging APIs.)

Contributor Author

I applied the same principle to metrics.

For logging, I applied the "Prevent information loss" policy by default and added an "End-user application should aware of the size of logs" section: 53b0519#diff-44dc82a7e6286380ed89736215beda74R33

What do you think?

- **Library should not block end-user application.**
- **Library should not consume infinite memory resource.**

Tracer should not degrade the end-user application. So that it should not block the end-user application nor consume too much memory resource.
Member

Can you please expand it to all APIs, not simply Tracer? Metrics and distributed context MUST behave the same way.

Contributor Author

I expanded it to all APIs, not only Tracing.

I think dropping logs by default is not good behavior, so I applied the "Prevent information loss" (but may block) policy to Logging. I then added an "End-user application should aware of the size of logs" section: 53b0519#diff-44dc82a7e6286380ed89736215beda74R33

Member

@SergeyKanzhelev SergeyKanzhelev left a comment

My biggest ask is to make it clear in text that this doc is applicable to entire API surface, not only to tracers


Here are the key principles:

- **Library should not block end-user application.**
Member

So we do not allow sync upload of telemetry as an alternative implementation? Or you assume the default implementation here?

Contributor

I think async should always be the case (with the exception of something like a mock component). I haven't seen any Telemetry system doing sync upload of data - the only cases would be a flush (or a close) call.

Member

I don't think this is sufficiently specific. What counts as blocking work here? Obviously we shouldn't block on network or disk IO, but what about doing expensive copies, computing aggregate metrics, etc.?

Disallowing sync upload will also make it difficult to use OT for auditing, i.e. when the user would rather have the call fail than have it succeed without emitting telemetry data.

Contributor Author

> So we do not allow sync upload of telemetry as an alternative implementation? Or you assume the default implementation here?

> Disallowing sync upload will also make it difficult to use OT for auditing, i.e. when the user would rather have the call fail than have it succeed without emitting telemetry data.

I added the phrase "by default" to the policy:

> Library should not block end-user application by default.

I think it is okay to implement a synchronous (blocking) option, so I left room for one.

Contributor Author

> What counts as blocking work here? Obviously we shouldn't block on network or disk IO, but what about doing expensive copies, computing aggregate metrics, etc.?

Sharp question.

Actually, we cannot implement a completely zero-overhead library. Any tracing/metrics/logging library "blocks" the application for some nanoseconds, microseconds, or milliseconds. The acceptable duration of that "blocking" depends on end-user needs, the application domain, and the runtime environment.

I think we can agree that "we shouldn't block on network or disk IO". But it is difficult to define a general rule for how much CPU usage and computation overhead is allowed.

This is why I intentionally wrote about I/O but not about computation (CPU) overhead.

If there is a need to define it explicitly, that can be done in a separate PR/issue, I feel.


- Should provide option to change threshold of the dropping
- Better to provide metric that represents effective sampling ratio
- **Prevent information loss**: Preserve all information but possible to consume unlimited resource
- Should be supported as an alternative option
Member

s/Should/Might/

Contributor

Similarly, it feels this is more related to the actual implementation (although I agree we should recommend something like this, but more as a guideline).

Contributor Author

I agree. Replaced "should" with "might".


## Key principles

Here are the key principles:
Member

Consider putting "Instrumentation cannot be a failure modality".

Contributor Author

Hmm... I agree with the policy itself: "Instrumentation cannot be a failure modality". But this file/PR focuses on performance and blocking.

Could you open a separate GitHub issue for it?

The topic relates to error handling, recovery, retry, logging, handling information loss, etc. It does not seem so simple.

Member

@c24t c24t left a comment

Some grammar nits, and I think it would be useful to do some work to write all these docs in the same style.

The explanation that libraries should drop messages before using more than some configurable amount of memory is good, and the behavior that the doc describes looks great. It is light on specifics though.


saiya added 2 commits June 18, 2019 14:57
- Write about Metrics & Logging to cover entire API
- Write about shut down / flush operations
- Leave room for blocking implementation options (should not block "as default behavior")
- Grammar & syntax fix
- Not limit for tracing, metrics.
Here are the key principles:

- **Library should not block end-user application by default.**
- **Library should not consume unbounded memory resource.**
Member

Are there libraries for which this is not true?

Contributor Author

For blocking, opencensus-java blocks the end-user application when a queue gets full: census-instrumentation/opencensus-java#1837 (this is why I became concerned about these matters).

An easy way to avoid blocking is to use an unbounded queue, but the cost is memory consumption. I don't know of a monitoring library that uses an unbounded queue, but I think it is worth clarifying.

Also, unbounded memory usage is related to the log volume issue described in the "End-user application should aware of the size of logs" section: https://github.com/open-telemetry/opentelemetry-specification/pull/130/files#diff-44dc82a7e6286380ed89736215beda74R33 .
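
For illustration only, here is a minimal Python sketch of the bounded, non-blocking alternative to both a blocking queue and an unbounded one. The class name `DroppingBuffer` and its methods are hypothetical and are not part of this PR or of any OpenTelemetry or OpenCensus API. It drops data when the buffer is full, logs a warning when loss starts and when it recovers, and exposes a ratio that could back an "effective sampling ratio" metric.

```python
import logging
import queue
import threading

logger = logging.getLogger("telemetry_sketch")  # hypothetical logger name


class DroppingBuffer:
    """Bounded buffer that never blocks the caller; it drops items when full."""

    def __init__(self, max_size):
        self._queue = queue.Queue(maxsize=max_size)  # bounded, so memory is bounded
        self._lock = threading.Lock()
        self.accepted = 0
        self.dropped = 0
        self._dropping = False

    def offer(self, item):
        """Try to enqueue without blocking. Returns False if the item was dropped."""
        with self._lock:
            try:
                self._queue.put_nowait(item)
            except queue.Full:
                self.dropped += 1
                if not self._dropping:
                    self._dropping = True
                    logger.warning("telemetry buffer full; dropping data")
                return False
            self.accepted += 1
            if self._dropping:
                self._dropping = False
                logger.warning("telemetry buffer recovered; %d items dropped so far",
                               self.dropped)
            return True

    def drain(self):
        """Remove and return everything currently buffered (what an exporter would do)."""
        items = []
        while True:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                return items

    def effective_sampling_ratio(self):
        """Fraction of offered items that were actually accepted."""
        total = self.accepted + self.dropped
        return self.accepted / total if total else 1.0
```

Replacing `put_nowait` with a blocking `put` would give the "prevent information loss" behavior instead, at the cost of stalling the caller while the buffer is full.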

- **Library should not block end-user application by default.**
- **Library should not consume unbounded memory resource.**

API should not degrade the end-user application. So that it should not block the end-user application nor consume too much memory resource.
Member

"should not degrade" is not really a feasible requirement. It creates unrealistic expectations that collecting telemetry might be free. There's always perf overhead. What implementation should do is provide levers to control that overhead at the expense of collecting less data. Note that "less data" and "dropping data due to buffer overflow" are different remediation mechanisms, with different QOS guarantees.

Contributor Author

@saiya saiya Jun 19, 2019

"should not degrade" is not really a feasible requirement. It creates unrealistic expectations that collecting telemetry might be free. There's always perf overhead.

It is true that there is no zero-overhead perf library. Very similar conversation: #130 (comment)

Considering this and the above feedback, I changed the sentence in question as follows:

> Although there is inevitable overhead to achieve monitoring, API should not degrade the end-user application as possible.

Does this make sense?

> What implementation should do is provide levers to control that overhead at the expense of collecting less data.

Another important point is the library design itself. For example, tracing should not block end-user applications even if network I/O takes a long time. Such operations should be done asynchronously. Levers cannot fix such a design failure. This is why I wrote these principles.

As for levers, I believe each API provides its own (e.g. Tracing can be tuned by Sampling), so I did not mention levers in this document. Should they be mentioned here as well?
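
As a rough illustration of the asynchronous design described above, the following Python sketch hands each item to a background thread so that a slow export call never runs on the application thread. `BackgroundExporter`, `record`, `wait_until_idle`, and the `export_fn` callback are hypothetical names chosen for this example, not an API defined by this PR.

```python
import queue
import threading


class BackgroundExporter:
    """Runs a (possibly slow) export function on a worker thread."""

    def __init__(self, export_fn, max_queue_size=1024):
        self._export_fn = export_fn  # assumed: a callable that performs the network I/O
        self._queue = queue.Queue(maxsize=max_queue_size)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, item):
        """Called on the application thread; returns immediately."""
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            return False  # dropped instead of blocking the caller

    def wait_until_idle(self):
        """Block until every queued item has been handed to export_fn."""
        self._queue.join()

    def _run(self):
        while True:
            item = self._queue.get()
            try:
                self._export_fn(item)
            except Exception:
                pass  # an export failure must not kill the worker thread
            finally:
                self._queue.task_done()
```

With this shape, the time spent in `record` does not depend on how long the export itself takes; only `wait_until_idle` blocks.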


### Shutdown and explicit flushing could block

The language library could block the end-user application when it shut down. On shutdown, it has to flush data to prevent information loss.
Member

It still should not block indefinitely. If we want to codify blocking flushing, we should require supporting user-supplied timeout.

Contributor Author

Sure. I added the sentence "The language library should support user-configurable timeout if it blocks on shut down."

Here are the key principles:

- **Library should not block end-user application by default.**
- **Library should not consume unbounded memory resource.**
Member

why do we single out memory only? CPU and latency impact are often more important.

Contributor Author

It is true that computation overhead (CPU usage, latency) is also a possible cause of unwelcome blocking as discussed in #130 (comment) .

Could you read the above comment?

If we need to dive deep into CPU and latency impact, it is better to create a separate PR/issue, I feel.

@SergeyKanzhelev SergeyKanzhelev self-requested a review June 18, 2019 17:14
- Mentioned about inevitable overhead
- Shutdown may block, but it should support configurable timeout also
Member

@iredelmeier iredelmeier left a comment

It feels like the scopes of the two core parts of this (blocking and memory usage) are, while definitely linked, separate issues.

Blocking is pretty easily addressed - i.e., with the suggestion (from the PR) that libraries should use something like async I/O or background threads.

Memory usage, on the other hand, is a lot more complicated: there are many solutions, each with different trade-offs in terms of usability, ease of implementation, etc. Many of the previously used solutions - e.g., the limits in OpenCensus - are mitigations but not outright solutions. For instance, individual logs may be large enough to cause problems, even if there aren't many logs per span.

Similarly, there are a lot of big questions that are touched on here, but that we haven't answered yet. How do we want to convey dropped data? How do we ensure that disk I/O doesn't become the problem - especially if, as I think is suggested in the change here (sorry if I'm misreading!!), the idea is to log data that would otherwise be lost?

There are also some promising mitigations that haven't been explored as much yet, e.g.:

  • Streaming requests
  • Providing verbosity levels for logs and/or spans or metrics themselves
  • More fine-tuning of request frequency

Another big piece that hasn't really been addressed yet in OpenTracing, OpenCensus, or OpenTelemetry is the testing side of things - i.e., testing that a) instrumentation doesn't add too much overhead, and that b) given certain failure modes or degraded circumstances, the instrumented application is still able to function as expected.

(I believe that CPU and latency are in a similar camp. I do agree with @yurishkuro that they should be part of the discussion, but I think the discussion of them shouldn't be "concluded" for the same reason as the memory discussion.)

@iredelmeier
Member

(cc @jmacd - probably of interest to you!)

@saiya
Contributor Author

saiya commented Jun 20, 2019

Hi @SergeyKanzhelev, as discussed in the thread of #130 (comment), I expanded this document to cover not only Tracer but all APIs. Are there any blockers remaining?

Member

@SergeyKanzhelev SergeyKanzhelev left a comment

I agree with some comments that this document may need to be expanded to cover more use cases and aspects of performance in the future. It is a good starting point, though.

@saiya
Contributor Author

saiya commented Jun 24, 2019

Dear Reviewers, are there any blockers to merge this PR?

Member

@songy23 songy23 left a comment

I think it's fine to merge this PR. Remaining comments are a bit beyond the original scope and can be addressed in follow-up PRs.

@saiya
Contributor Author

saiya commented Jun 28, 2019

@songy23 @SergeyKanzhelev @reyang Thank you for the approval!

Who can merge this PR? Or can I do something to help merge this branch?

@SergeyKanzhelev
Member

@open-telemetry/specs-approvers any objections to merge this doc? Looks like a good starting point to discuss perf and highlights basic principles to think about when implementing SDK

@saiya
Contributor Author

saiya commented Jul 4, 2019

@open-telemetry/spec-approvers (just a reminder) Are we ready to merge this?

I think merging this will help anyone who wants to open a PR proposing more in-depth guidelines.

- Better to provide metric that represents effective sampling ratio
- Language library might provide this option for Logging
- **Prevent information loss**: Preserve all information but possible to consume many resources
- Should be a default option for Logging
Member

Why is logging different? It should be an option for every signal, and probably the default for all.

Contributor Author

Thank you for the approval!

> Why logging is different?

For metrics/tracing, most existing tracing libraries support sampling and asynchronous processing. They drop some data at a fixed (or dynamic) rate and also sometimes drop data due to an internal buffer overflow. Such instrumentation is generally expected to sample information, not to degrade the end user's application in order to capture complete data.

For logging, as far as I know, most existing logging libraries do not drop logs but may block the application. I feel that dropping logs by default would be surprising behavior for most users.

Related materials: Logging v. instrumentation, Question: Can zipkin be used to replace logging?

Related conversation in this PR: #130 (comment)

Member

My point was mostly about why not make everything blocking as default. We should support non blocking for sure but I feel that blocking can be default for all the implementations.

Member

@saiya just a data point: https://github.com/uber-go/zap by default samples (throttles) logs

Contributor Author

@saiya saiya Jul 5, 2019

> We should support non blocking for sure but I feel that blocking can be default for all the implementations.

In my opinion, the blocking strategy is a little surprising for users of instrumentation (tracing, metrics). But the non-blocking strategy can be an opt-in option, as you mentioned. Either is fine with me.

Should we apply the blocking strategy to all components (tracing, metrics, and logging)?
If so, I will update this PR.

Contributor

@z1c0 z1c0 Jul 5, 2019

I think the key distinction to make here is dropping (sampling) vs throttling when it comes to logs:

  • Dropping logs arbitrarily will definitely come as a surprise for most users and may lose valuable information.
  • Throttling however can be really useful. If an application starts flooding the log for whatever reason, intelligent throttling along the lines of "log message "...." appeared X times in the last minute" does not lose information and can prevent severe performance degradations.
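
To make the throttling idea concrete, here is a small Python sketch, given only as an assumption of how such a throttle might look. `LogThrottle`, its `emit` callback, and the injectable `clock` are hypothetical names. Within each time window it passes the first occurrence of a message through, counts the repeats, and reports the count when the message next appears after the window has elapsed.

```python
import time


class LogThrottle:
    """Suppresses repeats of the same message within a window and reports the count."""

    def __init__(self, emit, window_seconds=60.0, clock=time.monotonic):
        self._emit = emit            # assumed: a callable that writes one log line
        self._window = window_seconds
        self._clock = clock          # injectable so the behavior can be tested
        # message -> (window_start, suppressed_count); a real implementation
        # would also bound the size of this table.
        self._state = {}

    def log(self, message):
        now = self._clock()
        entry = self._state.get(message)
        if entry is None:
            self._state[message] = (now, 0)
            self._emit(message)
            return
        start, suppressed = entry
        if now - start < self._window:
            # Still inside the window: count the repeat instead of emitting it.
            self._state[message] = (start, suppressed + 1)
            return
        # Window elapsed: summarize what was suppressed, then start a new window.
        if suppressed:
            self._emit(f'log message "{message}" repeated {suppressed} more times '
                       f'in the last {self._window:g} seconds')
        self._state[message] = (now, 0)
        self._emit(message)
```

The summary line is only written when the same message shows up again after the window; flushing summaries on a timer would be a natural extension but is left out to keep the sketch short.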

Contributor Author

> Dropping logs arbitrarily will definitely come as a surprise for most users and may lose valuable information.

Yes. This is what I meant and why I distinguished logs from other instruments (tracing, metrics).

> Throttling however can be really useful. If an application starts flooding the log for whatever reason, intelligent throttling along the lines of "log message "...." appeared X times in the last minute" does not lose information and can prevent severe performance degradations.

Such intelligent throttling seems awesome to me.

In my understanding, OpenTelemetry's logging specification is not yet defined. Thus such techniques should be discussed in the logging API definition, IMHO.

Member

I am surprised that you care so much about the default value, and that you treat metrics differently from logs. In big companies (at least some that I know) metrics are more important than logs - for example, monitoring/alerting is done using metrics, not logs. Also, we do use blocking for all the signals and we don't run into issues.

So I would suggest treating all the signals the same: go with either blocking or non-blocking as the default, but for all the signals.

Contributor Author

I understand now that prescribing a default strategy (blocking or information loss) is a difficult topic. At the beginning, I assumed most users prefer non-blocking (information loss) for metrics/tracing. But now I understand that there are various needs.

So I deleted sentences such as "Should be a default option other than Logging" and "Should be a default option for Logging" in 34380b3.

In this PR, I want to clarify the principle (avoid degrading the user application as far as possible) and describe the desired options for dealing with the tradeoff. If we need to prescribe a default, it can be done in a separate PR, I feel.


Although there are inevitable overhead to achieve monitoring, API should not degrade the end-user application as possible. So that it should not block the end-user application nor consume too much memory resource.

Especially, most telemetry exporters need to call API of servers to export telemetry data. API call operations should be performed in asynchronous I/O or background thread to prevent blocking end-user applications.
Member

This is not important here, also covered by #186. So I think it is fine to just say that we should support not blocking which is the previous sentence.

Contributor Author

Thank you. I read #186 and deleted this sentence as you suggested.



The language library could block the end-user application when it shut down. On shutdown, it has to flush data to prevent information loss. The language library should support user-configurable timeout if it blocks on shut down.

If the language library supports an explicit flush operation, it could block also.
Member

Probably you want to also support a timeout for this operation and a negative or zero timeout means blocking.

Contributor Author

Yes. I also looked at #186 (it describes a timeout for the shutdown operation).

I added the sentence "But should support a configurable timeout." A more detailed interface (e.g. using a negative integer to specify infinite blocking) can be described in #186's document if needed.
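
As a sketch of what a flush with a configurable timeout could look like, the Python example below follows the convention suggested in this thread that a zero or negative timeout means "block until done". The names `FlushableExporter`, `record`, and `flush` are hypothetical, and the internal queue is left unbounded purely for brevity; the actual interface is left to #186.

```python
import queue
import threading


class FlushableExporter:
    """Background exporter whose flush() blocks for at most a caller-supplied time."""

    def __init__(self, export_fn):
        self._export_fn = export_fn   # assumed: a callable that sends one item
        self._queue = queue.Queue()   # unbounded here only to keep the sketch short
        self._pending = 0
        self._cond = threading.Condition()
        threading.Thread(target=self._run, daemon=True).start()

    def record(self, item):
        with self._cond:
            self._pending += 1
        self._queue.put(item)

    def _run(self):
        while True:
            item = self._queue.get()
            try:
                self._export_fn(item)
            except Exception:
                pass  # keep the worker alive even if one export fails
            finally:
                with self._cond:
                    self._pending -= 1
                    self._cond.notify_all()

    def flush(self, timeout_seconds):
        """Wait until all recorded items are exported.

        Returns True if everything was exported, False if the timeout expired first.
        A zero or negative timeout means wait indefinitely (assumed convention).
        """
        wait_timeout = timeout_seconds if timeout_seconds > 0 else None
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0,
                                       timeout=wait_timeout)
```

A shutdown hook could call `flush` with the user-configured grace period and proceed with process exit whether or not it returns True.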

saiya added 4 commits July 25, 2019 16:06
- Remove duplication with open-telemetry#186
- Mention about configurable timeout of flush operation
- Not specify default strategy (blocking or information loss)
@saiya
Contributor Author

saiya commented Jul 25, 2019

As discussed, the default option (prefer blocking or information loss) is not obvious; there are various needs. Thus I deleted sentences such as "Should be a default option other than Logging" and "Should be a default option for Logging" in 34380b3, #130 (comment). I intend to focus on the principle (avoid degrading the user application as far as possible) and on the desired options for dealing with the tradeoff.

I think the issues are cleared now. Are there any remaining blockers to merging this?

@saiya
Contributor Author

saiya commented Jul 30, 2019

@open-telemetry/spec-approvers (just a reminder) Are we ready to merge this?

As written in my previous comment, all open points seem to be resolved.

@carlosalberto carlosalberto merged commit 63f7804 into open-telemetry:master Jul 31, 2019
@saiya
Contributor Author

saiya commented Aug 1, 2019

👍 Thank you for taking care of this PR, everyone!

SergeyKanzhelev pushed a commit to SergeyKanzhelev/opentelemetry-specification that referenced this pull request Feb 18, 2020
* Add Performance and Blocking specification

Performance and Blocking specification is specified in a separate document and
is linked from Language Library Design principles document.

Implements issue: open-telemetry#94

* PR fix (open-telemetry#94).

- Write about Metrics & Logging to cover entire API
- Write about shut down / flush operations
- Leave room for blocking implementation options (should not block "as default behavior")
- Grammar & syntax fix

* PR fix (open-telemetry#94).

- Not limit for tracing, metrics.

* PR fix (open-telemetry#94).

- Mentioned about inevitable overhead
- Shutdown may block, but it should support configurable timeout also

* PR fix (open-telemetry#94)

- s/traces/telemetry data/
- Syntax fix

Co-Authored-By: Yang Song <songy23@users.noreply.github.com>

* PR fix (open-telemetry#130)

- Remove duplication with open-telemetry#186
- Mention about configurable timeout of flush operation

* PR fix (open-telemetry#130)

- Not specify default strategy (blocking or information loss)
rockb1017 pushed a commit to rockb1017/opentelemetry-specification that referenced this pull request Nov 18, 2021
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 21, 2024
…rics 1.0 GA) (open-telemetry#130)

* Define what's in scope for OpenTelemetry Logs 1.0 GA

Clearly defined scope is important to align all logs contributors and to make
sure we know what target we are working towards. Note that some of the listed
items are already fully or partially implemented but are still listed for
completeness.

* Address PR comments

Co-authored-by: Bogdan Drutu <bogdandrutu@gmail.com>