Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic unload/load of instrumentations #1577

Closed
pavolloffay opened this issue Nov 6, 2020 · 15 comments
Closed

Dynamic unload/load of instrumentations #1577

pavolloffay opened this issue Nov 6, 2020 · 15 comments
Labels
enhancement New feature or request

Comments

@pavolloffay
Copy link
Member

Is your feature request related to a problem? Please describe.

Add/replace instrumentations at runtime.

Describe the solution you'd like

Have an SPI to hook into AgentInstaller and get the ResettableClassFileTransformer but also be able to change the AgentBuilder.

Describe alternatives you've considered

None.

Additional context

Related to #566 and slightly with related with #1534 if we provide SPI to access ByteBuddy.

@pavolloffay pavolloffay added the enhancement New feature or request label Nov 6, 2020
@iNikem
Copy link
Contributor

iNikem commented Nov 9, 2020

@pavolloffay Can you describe your use-case which requires unloading of the instrumentation? Or even loading it during runtime?

@pavolloffay
Copy link
Member Author

For instance provide patched instrumentations without requiring JVM to restart. Or provide new instrumentations without requiring JVM to restart.

@iNikem
Copy link
Contributor

iNikem commented Nov 10, 2020

Do you have an actual business need for that? I hardly can imagine a situation when I would want to dynamically load new instrumentation into running JVM... I would recommend against it :)

@pavolloffay
Copy link
Member Author

yes, in some enterprises it's hard to upgrade the agent once it's deployed. A capability to dynamically change instrumentations solves that.

@pavolloffay
Copy link
Member Author

IIRC Instana agent is able to update instrumentations at runtime. They are also using ByteBuddy internally.

@iNikem
Copy link
Contributor

iNikem commented Nov 11, 2020

I would like to note one thing. In our various discussions about different design or functionality decisions we often rely on our understanding of potential operational concerns. E.g. @trask bases some of this decisions on his opinion that it is easier for admins to update javaagent than for developers to change and release new code version. I usually think the opposite: code changes are cheaper and a good way to solve various problems.

So with this issue @pavolloffay brings yet another opinion on the table: javaagent updates are very hard and we need more dynamic behaviour in the agent :)

I feel that we have to made our expectations about agent operational model more explicit and maybe even document them as the basis for decisions we make.

@pavolloffay
Copy link
Member Author

From the internal discussion with senior people (ex. AppD) I hear that updating an agent in bigger enterprises takes a lot of time. It usually has to go through multiple teams and approvals, whereas submitting a patch on the wire does not have to. It's not an ideal state but has been the experience from the field.

Ultimately dynamic reload might be trickier to implement properly, but the core OTEL agent should not block vendors from implementing this capability in their distributions. Generally speaking it is a better development model to "test" feature in the vendor's environment and then move it to the spec/upstream once we know it works well (e.g. Eclipse Microprofile feature workflow).

@iNikem
Copy link
Contributor

iNikem commented Nov 11, 2020

First, every vendor's point of view may be coloured by the profile of their clients :) This is true both for AppD (your case) and Splunk (my case). And therefore we may have different opinions based on different experiences.

Second, I absolutely agree with you on "it is a better development model to "test" feature in the vendor's environment". I am somewhat worried that we are adding yet another and yet another extensions points to the agent, with quite obscure use-cases. And that we end up having so many of them, that it will be very confusing for everybody.

All in all, I suggest bringing this topic to the SIG meeting this week. And to discuss this issue together with #1619 and #1613.

Also pinging @anuraaga to hear his opinion before that meeting.

@pavolloffay
Copy link
Member Author

I am somewhat worried that we are adding yet another and yet another extensions points to the agent, with quite obscure use-cases.

Let me know what is obscure and I will better explain it to you :)

All in all, I suggest bringing this topic to the SIG meeting this week. And to discuss this issue together with #1619 and #1613.

Sure I am joining tomorrow's call as usual nd we can talk about it in more depht.

@iNikem
Copy link
Contributor

iNikem commented Nov 11, 2020

What I mean is that by reading new proposed SPIs it is not immediately obvious when and how one would want to use them. Compare that to PropertySource or SpanExporterFactory. They are (to my eye) much more self-explanatory.

@pavolloffay
Copy link
Member Author

Thanks that is better feedback 👍 . Maybe it's better to have such a comment on the PR. None of those PRs solve the subject of this issue. I will work on better javadocs for the proposed SPIs. One thing to realize is that these SPIs are for vendors and not everybody will need to implement them.

@anuraaga
Copy link
Contributor

anuraaga commented Nov 12, 2020

Sorry think my feedback is later than the meeting - but if it's possible to cleanly rearchitect things to support load / unload, I don't see a reason not to. It seems quite hard to me (e.g., once user code is instrumented, how do you back out?) but I'm not an agent expert still ;) That being said, I have to ask, does this mean there is an expectation that these apps never, ever shutdown / restart? I didn't know that's possible :P

@anuraaga
Copy link
Contributor

That being said, on the topic of SPIs, it feels like we are adding SPIs to many, many code path in the agent. Is there another option to make the agent more modular for code reuse, going inside-out instead of outside-in?

@pavolloffay
Copy link
Member Author

how do you back out?)

Call reset on the returned value

it feels like we are adding SPIs to many, many code path in the agent.

The only SPI that Agent has at the moment is to configure OTEL SD, the property source and bootstrap packages(I will talk about this in the meeting). All the OTEL SPIs could be substituted with #1619. Internally they can get the config and install SDK components based on the configuration (OTEL_EXPORTER). The code will be very similar to what OTEL SDK examples show https://github.com/open-telemetry/opentelemetry-java/blob/master/examples/otlp/src/main/java/io/opentelemetry/example/otlp/OtlpExporterExample.java#L37.

@trask
Copy link
Member

trask commented Apr 6, 2021

Closing, similar to dynamic attach (#1932), this is not something we plan to officially test/support.

@trask trask closed this as completed Apr 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants