OtlpGrpcSpanExporter's shutdown can hang forever if the Channel is already shut down #3306

Closed
andimiller opened this issue Jun 10, 2021 · 1 comment · Fixed by #3307

andimiller commented Jun 10, 2021

Describe the bug
OtlpGrpcSpanExporter's shutdown can hang forever if the underlying Channel is already shut down

Steps to reproduce

  1. Create an OtlpGrpcSpanExporter
  2. Create a BatchSpanProcessor and feed in the exporter
  3. Create an SdkTracerProvider and feed in the processor
  4. Create an OpenTelemetrySdk and set the tracer provider
  5. Report some spans (optional)
  6. Call shutdown() on the SdkTracerProvider and wait for the callback
  7. Call shutdown() on the BatchSpanProcessor and wait for the callback
  8. Call shutdown() on the OtlpGrpcSpanExporter and wait for the callback
  9. Observe that the callback never completes

What did you expect to see?
I expected the callback from OtlpGrpcSpanExporter's shutdown to complete

What did you see instead?
The program hung indefinitely

What version and what artifacts are you using?

libraryDependencies ++= Seq(
  "io.opentelemetry" % "opentelemetry-exporter-otlp" % "1.2.0",
  "io.grpc"          % "grpc-okhttp"                 % "1.38.0",
  // "io.grpc"       % "grpc-netty"                  % "1.38.0" // I also tried this
  "io.opentelemetry" % "opentelemetry-api"           % "1.2.0",
  "io.opentelemetry" % "opentelemetry-sdk"           % "1.2.0",
)

Environment
Compiler: Scala 2.13.4 and AdoptOpenJDK 11.0.3
OS: NixOS 20.09
Runtime: same JDK as above
Runtime OS: same as above

Additional context
I will follow up with a Java example that demonstrates this.

@andimiller andimiller added the Bug Something isn't working label Jun 10, 2021

andimiller commented Jun 10, 2021

https://github.com/andimiller/opentelemetry-shutdown-bug Here's a repo that reproduces the bug in Java. It uses sbt because I can't remember how to use Maven.

@jkwatson jkwatson self-assigned this Jun 10, 2021
jkwatson pushed a commit to jkwatson/opentelemetry-java that referenced this issue Jun 10, 2021
anuraaga pushed a commit that referenced this issue Jun 10, 2021
…3307)

* Make sure the 2nd call to shutdown() on the GRPC exporters succeeds.
Resolves #3306

* re-order the check for the channel already being shut down
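
The idea behind the two commit notes above can be sketched without any OpenTelemetry or gRPC dependency. This is only a model, not the actual patch: `FakeChannel`, `naiveShutdown`, and `guardedShutdown` are hypothetical names, and the assumption is that the hang comes from waiting on a "channel shut down" notification that never fires once the channel is already shut down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class ShutdownGuardSketch {

    /** Hypothetical stand-in for a gRPC channel; this is not the real io.grpc API. */
    static final class FakeChannel {
        private boolean shutDown = false;
        private final List<Runnable> listeners = new ArrayList<>();

        /** Registers a callback that runs on the next transition to "shut down". */
        synchronized void onNextShutdown(Runnable callback) {
            listeners.add(callback);
        }

        synchronized boolean isShutDown() {
            return shutDown;
        }

        void shutdown() {
            List<Runnable> toRun;
            synchronized (this) {
                if (shutDown) {
                    return; // no state change, so no listener is notified
                }
                shutDown = true;
                toRun = new ArrayList<>(listeners);
                listeners.clear();
            }
            toRun.forEach(Runnable::run);
        }
    }

    /** Waits for a shutdown notification; never completes on an already-shut-down channel. */
    static CompletableFuture<Void> naiveShutdown(FakeChannel channel) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        channel.onNextShutdown(() -> done.complete(null));
        channel.shutdown();
        return done;
    }

    /** Checks the channel state first, so a repeated call completes immediately. */
    static CompletableFuture<Void> guardedShutdown(FakeChannel channel) {
        if (channel.isShutDown()) {
            return CompletableFuture.completedFuture(null);
        }
        return naiveShutdown(channel);
    }

    public static void main(String[] args) throws Exception {
        FakeChannel a = new FakeChannel();
        naiveShutdown(a).get(1, TimeUnit.SECONDS); // first call completes
        System.out.println("naive second call done: " + naiveShutdown(a).isDone());

        FakeChannel b = new FakeChannel();
        guardedShutdown(b).get(1, TimeUnit.SECONDS);
        System.out.println("guarded second call done: " + guardedShutdown(b).isDone());
    }
}
```

In this model the second `naiveShutdown` call returns a future that stays incomplete, which matches the symptom reported in the issue, while `guardedShutdown` returns a completed future on the second call.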
andimiller added a commit to andimiller/natchez that referenced this issue Jun 11, 2021
This adds a `natchez-opentelemetry` module which allows span reporting via the `opentelemetry-java` project.

The `Utils` object contains a helper to turn the `OpenTelemetry` `CompletableResultCode` class into an `F[Unit]` given `Async[F]`, this is useful for implementing `Resource`s for clean shutdown logic.

`Shutdownable` is a little abstraction to unify all the interfaces that have a `shutdown(): CompletableResultCode` method in `OpenTelemetry`, since they have no common interface.
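
The module itself is Scala and targets `F[Unit]`, but the same adapter shape can be sketched in plain Java with `CompletableFuture` as the target type. Everything below is illustrative: `ResultCode` is a hypothetical stand-in loosely modeled on a callback-style result such as `CompletableResultCode`, and this `Shutdownable` interface and the `shutdownAsync` helper are assumptions, not the code from the commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ShutdownAdapterSketch {

    /** Hypothetical callback-style result; a stand-in, not the OpenTelemetry class. */
    static final class ResultCode {
        private Boolean success = null; // null while the result is pending
        private final List<Runnable> callbacks = new ArrayList<>();

        synchronized boolean isSuccess() {
            return Boolean.TRUE.equals(success);
        }

        void whenComplete(Runnable callback) {
            synchronized (this) {
                if (success == null) {
                    callbacks.add(callback);
                    return;
                }
            }
            callback.run(); // already complete, so run right away
        }

        void complete(boolean ok) {
            List<Runnable> toRun;
            synchronized (this) {
                if (success != null) {
                    return; // only the first completion counts
                }
                success = ok;
                toRun = new ArrayList<>(callbacks);
                callbacks.clear();
            }
            toRun.forEach(Runnable::run);
        }
    }

    /** Unifies anything that exposes a shutdown() returning a ResultCode. */
    @FunctionalInterface
    interface Shutdownable {
        ResultCode shutdown();
    }

    /** Bridges the callback-style result into a future; the name feeds the error message. */
    static CompletableFuture<Void> shutdownAsync(String name, Shutdownable target) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        ResultCode code = target.shutdown();
        code.whenComplete(() -> {
            if (code.isSuccess()) {
                future.complete(null);
            } else {
                future.completeExceptionally(
                    new IllegalStateException(name + " failed to shut down cleanly"));
            }
        });
        return future;
    }

    public static void main(String[] args) {
        ResultCode pending = new ResultCode();
        CompletableFuture<Void> f = shutdownAsync("exporter", () -> pending);
        System.out.println("done before completion: " + f.isDone());
        pending.complete(true);
        System.out.println("done after completion: " + f.isDone());
    }
}
```

Because the future only completes when the callback fires, a shutdown result that never completes (as in the bug above) would leave any cleanup built on this adapter waiting as well.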

`OpenTelemetrySpan` and `OpenTelemetryEntryPoint` are heavily based on the `natchez-jaeger` versions, with tweaks to make them work with `OpenTelemetry`.

The `OpenTelemetry` object which end users should interact with has these methods:
* `lift` can be used to lift any `F[T]`, where `T` is an `OpenTelemetry` class with a `shutdown` method, into a `Resource[F, T]`; it asks for a name so it can provide a nice error message.
* `entryPoint` is the main way to make an `EntryPoint`. It has a boolean flag that lets the user globally register the `OpenTelemetry` if that's helpful; the flag defaults to false.
* `globalEntryPoint` will use the globally registered `OpenTelemetry` to create an `EntryPoint`

Note that this is currently using `OpenTelemetry` libraries at `1.4.0-SNAPSHOT` because I found a bug while developing this that broke the shutdown logic.
The issue is here open-telemetry/opentelemetry-java#3306 and it was closed by this PR open-telemetry/opentelemetry-java#3307 so it should make it into the next release.