Dynamic unload/load of instrumentations #1577

pavolloffay · 2020-11-06T16:00:54Z

Is your feature request related to a problem? Please describe.

Add/replace instrumentations at runtime.

Describe the solution you'd like

Have an SPI to hook into AgentInstaller and get the ResettableClassFileTransformer but also be able to change the AgentBuilder.

Describe alternatives you've considered

None.

Additional context

Related to #566 and slightly related to #1534 if we provide an SPI to access ByteBuddy.

iNikem · 2020-11-09T13:54:43Z

@pavolloffay Can you describe your use-case which requires unloading of the instrumentation? Or even loading it during runtime?

pavolloffay · 2020-11-10T08:00:11Z

For instance, to provide patched instrumentations without requiring a JVM restart. Or to provide new instrumentations without requiring a JVM restart.

iNikem · 2020-11-10T08:11:50Z

Do you have an actual business need for that? I can hardly imagine a situation where I would want to dynamically load new instrumentation into a running JVM... I would recommend against it :)

pavolloffay · 2020-11-10T15:26:42Z

Yes, in some enterprises it's hard to upgrade the agent once it's deployed. A capability to dynamically change instrumentations solves that.

pavolloffay · 2020-11-11T10:58:02Z

IIRC Instana agent is able to update instrumentations at runtime. They are also using ByteBuddy internally.

iNikem · 2020-11-11T11:07:26Z

I would like to note one thing. In our various discussions about different design or functionality decisions we often rely on our understanding of potential operational concerns. E.g. @trask bases some of his decisions on his opinion that it is easier for admins to update the javaagent than for developers to change and release a new code version. I usually think the opposite: code changes are cheaper and a good way to solve various problems.

So with this issue @pavolloffay brings yet another opinion to the table: javaagent updates are very hard and we need more dynamic behaviour in the agent :)

I feel that we have to make our expectations about the agent's operational model more explicit and maybe even document them as the basis for the decisions we make.

pavolloffay · 2020-11-11T11:16:46Z

From the internal discussion with senior people (ex. AppD) I hear that updating an agent in bigger enterprises takes a lot of time. It usually has to go through multiple teams and approvals, whereas submitting a patch on the wire does not have to. It's not an ideal state but has been the experience from the field.

Ultimately dynamic reload might be trickier to implement properly, but the core OTEL agent should not block vendors from implementing this capability in their distributions. Generally speaking it is a better development model to "test" feature in the vendor's environment and then move it to the spec/upstream once we know it works well (e.g. Eclipse Microprofile feature workflow).

iNikem · 2020-11-11T13:18:02Z

First, every vendor's point of view may be coloured by the profile of their clients :) This is true both for AppD (your case) and Splunk (my case). And therefore we may have different opinions based on different experiences.

Second, I absolutely agree with you on "it is a better development model to "test" feature in the vendor's environment". I am somewhat worried that we are adding yet another and yet another extension point to the agent, with quite obscure use-cases. And that we end up having so many of them that it will be very confusing for everybody.

All in all, I suggest bringing this topic to the SIG meeting this week. And to discuss this issue together with #1619 and #1613.

Also pinging @anuraaga to hear his opinion before that meeting.

pavolloffay · 2020-11-11T13:32:10Z

> I am somewhat worried that we are adding yet another and yet another extension point to the agent, with quite obscure use-cases.

Let me know what is obscure and I will better explain it to you :)

> All in all, I suggest bringing this topic to the SIG meeting this week. And to discuss this issue together with #1619 and #1613.

Sure, I am joining tomorrow's call as usual and we can talk about it in more depth.

iNikem · 2020-11-11T14:04:00Z

What I mean is that by reading new proposed SPIs it is not immediately obvious when and how one would want to use them. Compare that to PropertySource or SpanExporterFactory. They are (to my eye) much more self-explanatory.

pavolloffay · 2020-11-11T14:13:15Z

Thanks, that is better feedback 👍. Maybe it's better to have such a comment on the PR. None of those PRs solve the subject of this issue. I will work on better javadocs for the proposed SPIs. One thing to realize is that these SPIs are for vendors and not everybody will need to implement them.

anuraaga · 2020-11-12T07:37:29Z

Sorry, I think my feedback is later than the meeting - but if it's possible to cleanly rearchitect things to support load / unload, I don't see a reason not to. It seems quite hard to me (e.g., once user code is instrumented, how do you back out?) but I'm still not an agent expert ;) That being said, I have to ask, does this mean there is an expectation that these apps never, ever shut down / restart? I didn't know that's possible :P

anuraaga · 2020-11-12T07:41:01Z

That being said, on the topic of SPIs, it feels like we are adding SPIs to many, many code paths in the agent. Is there another option to make the agent more modular for code reuse, going inside-out instead of outside-in?

pavolloffay · 2020-11-12T10:42:36Z

> how do you back out?

Call reset on the returned value:

opentelemetry-java-instrumentation/javaagent-tooling/src/main/java/io/opentelemetry/javaagent/tooling/AgentInstaller.java

Line 163 in 1290bca

return agentBuilder.installOn(inst);
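
To make the "call reset" idea concrete, here is a minimal sketch that uses only the JDK's `java.lang.instrument` API rather than ByteBuddy. It is not how the agent or ByteBuddy implements this. The class `ReloadableInstrumentations` and its `load`/`unload` methods are hypothetical names invented for illustration. The assumption is that unloading means removing the transformer and retransforming the affected classes so they return to their bytecode without that transformer, and that the JVM supports retransformation.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry for illustration; not part of the OpenTelemetry agent. */
final class ReloadableInstrumentations {
  private final Instrumentation inst;
  private final Map<String, ClassFileTransformer> installed = new ConcurrentHashMap<>();

  ReloadableInstrumentations(Instrumentation inst) {
    this.inst = inst;
  }

  /** Installs the transformer under a name, replacing any earlier one with that name. */
  void load(String name, ClassFileTransformer transformer, Class<?>... targets) {
    ClassFileTransformer previous = installed.put(name, transformer);
    if (previous != null) {
      inst.removeTransformer(previous);
    }
    // canRetransform = true so already-loaded classes can be re-processed.
    inst.addTransformer(transformer, true);
    retransform(targets);
  }

  /** Removes the named transformer and retransforms the targets without it. */
  boolean unload(String name, Class<?>... targets) {
    ClassFileTransformer previous = installed.remove(name);
    if (previous == null) {
      return false; // nothing was installed under that name
    }
    boolean removed = inst.removeTransformer(previous);
    retransform(targets);
    return removed;
  }

  private void retransform(Class<?>... targets) {
    if (targets.length == 0) {
      return;
    }
    try {
      inst.retransformClasses(targets);
    } catch (UnmodifiableClassException e) {
      throw new IllegalStateException("Class cannot be retransformed", e);
    }
  }
}
```

The caller has to know which already-loaded classes were affected and pass them as `targets`. This sketch also does not address state that instrumented code may have left behind (for example open spans or injected helper classes), which is part of why backing out is hard in practice.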

> it feels like we are adding SPIs to many, many code paths in the agent.

The only SPIs that the agent has at the moment are for configuring the OTEL SDK, the property source, and bootstrap packages (I will talk about this in the meeting). All the OTEL SPIs could be substituted with #1619. Internally they can get the config and install SDK components based on the configuration (OTEL_EXPORTER). The code will be very similar to what the OTEL SDK examples show: https://github.com/open-telemetry/opentelemetry-java/blob/master/examples/otlp/src/main/java/io/opentelemetry/example/otlp/OtlpExporterExample.java#L37.

trask · 2021-04-06T00:20:09Z

Closing. Similar to dynamic attach (#1932), this is not something we plan to officially test/support.

pavolloffay added the enhancement New feature or request label Nov 6, 2020

iNikem added the release:after-ga label Nov 9, 2020

pavolloffay mentioned this issue Nov 10, 2020

Add Bytebuddy agent builder customizer SPI #1613

Merged

trask closed this as completed Apr 6, 2021
