Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide an SPI that an agent distribution or intiializer can use to c… #561

Merged
merged 5 commits into from
Jun 24, 2020

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Jun 23, 2020

…ustomize the tracer after it's initialized.

Currently, I use the normal OpenTelemetry SPI for configuring an SDK tracer, namely to set the IdsGenerator and Resource

https://github.com/aws-samples/aws-xray-sdk-with-opentelemetry-sample/blob/both-otel-and-xray/awsagentprovider/src/main/java/com/softwareaws/xray/opentelemetry/exporters/AwsTraceProvider.java#L52

Unfortunately, nothing else can be customized here - in particular, methods like OpenTelemetry.setPropagators cannot be called here because the SDK is still being initialized. I have a hack to delay it to after the SDK is initialized

https://github.com/aws-samples/aws-xray-sdk-with-opentelemetry-sample/blob/both-otel-and-xray/awsagentprovider/src/main/java/com/softwareaws/xray/opentelemetry/exporters/AwsTraceProvider.java#L62

But for agent distributions, it is important to provide a programmatic way of customizing all aspects of the tracer, which we can do by having an SPI that is executed after the SDK has been initialized.

Notes

  • I added the SPI interface to agent-bootstrap instead of agent-tooling since I need it to be available to compile against (not renamed to classdata)
  • I ended up adding a compileOnly dep on the SDK in agent-bootstrap. This seems generally harmless but if we want to avoid it, I can remove the parameter from the method, it's only slightly convenient compared to a user calling OpenTelemetrySdk.getTracerProvider() themselves

I have built locally and replaced my hack with an SPI implementation and verifies it works well.

Fixes? #387

…ustomize the tracer after it's initialized.
Anuraag Agrawal added 2 commits June 23, 2020 17:16
@iNikem
Copy link
Contributor

iNikem commented Jun 23, 2020

You need/want this second SPI because currently some methods are only available on OpenTelemetry after SDK got configured, and not on TracerSdkProvider.Builder, right? Have you talk to otel-java about adding those "missing" methods to TracerSdkProvider.Builder? How does otel-java suggest configuring those concerns now?

@anuraaga
Copy link
Contributor Author

A couple of issues are that

  • OpenTelemetry API can't be used until the SDK provider is initialized. We can't add things like span processors or samplers to the provider builder reliably since they often have metrics or similar and need the API initialized first.

  • Propagators are an API concept and not SDK so a bit weird to add to the SDK builder.

With normal otel-java, a user can just call the static methods like OpenTelemetry.setPropagators in their app. But since we expect vendor packaged agents I think we need to provide an entry point like this one for those agents to do the customization outside the user's app.

@iNikem
Copy link
Contributor

iNikem commented Jun 23, 2020

Again, how this problem can be solved for those users who don't use auto instrumentation? If I want to manually instrument my app, without javaagent, and I want to provide my own TracerProviderFactory via SPI, how can I configure/set propagators then? Is it wrong for me to assume that an user may want to use AwsTraceProvider or similar without javaagent? Should they all spawn a separate thread and sleep for some time?

@anuraaga
Copy link
Contributor Author

A user can call this manually in main or wherever they prefer in their app

OpenTelemetry.setPropagators(DefaultContextPropagators.builder()
                                                                 .addHttpTextFormat(new AwsXrayPropagator())
                                                                      .build());

Or OpenTelemetrySdk.getTraceProvider().updateTraceConfig to set a sampler. But for the agent, don't we need an agent-specific SPI to give a way for distributions to call these methods automatically, in the agent ClassLoader?

@iNikem
Copy link
Contributor

iNikem commented Jun 23, 2020

So I will provide my own TracerProviderFactory, then just call OpenTelemetry.setPropagators in my main method. That call will call getInstance and will lazily create new instance of OpenTelemetry. During that TracerProviderFactory will be loaded and new tracerProvider will be created. And immediately followed by setting propagators.

I am probably missing something obvious here, but there is no delay above between creating new TraceProvider and setting propagators. Why such delay is needed in AwsTraceProvider?

@anuraaga
Copy link
Contributor Author

The delay itself is artificial and just to account for timing races. The key point is just to make sure setPropagators is called after getInstance returns, meaning the TracerProviderFactory has finished. Calling setPropagators inline without switching to another thread would result in getInstance being called again, a nested initialization. But currently the TracerProviderFactory is the only SPI I can implement, so to make sure the propagators are set up after it finishes I have it trigger a new thread. Very big workaround ;)

@iNikem
Copy link
Contributor

iNikem commented Jun 23, 2020

A! Now I get it :)

Ok, I don't have any immediate objection to this new SPI, but :) The whole story is incomplete. What you are trying to achieve is a convenient way for a vendor to package his own distribution of instrumentation agent providing pre-configured Otel with potential custom components. This is huge topic. Even parts which are already kinda in place, like configuring exporters, propagators, resources etc, are totally undocumented. But even before documentation, can we at least give some thoughts to high level description: what are vendors options and what they have to do to make such distribution? What can be achieved by providing external jars? What has to go inside agent jar? What interfaces/SPI are there for what tasks?

I am somewhat uncomfortable moving forward with this topic without having at least vague understanding of the big picture. I don't see even an issue with a stated goal! :)

@trask
Copy link
Member

trask commented Jun 23, 2020

The whole story is incomplete

Yes it is. I suggest we merge (EDIT: after check build is fixed of course) and continue incrementing on it.

I don't see even an issue with a stated goal!

Good point, created #566

@anuraaga
Copy link
Contributor Author

Thanks! #568 will save my day

@trask trask merged commit 6f5f673 into open-telemetry:master Jun 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants