Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify intent of JFR module #673

Closed
jack-berg opened this issue Jan 4, 2023 · 5 comments · Fixed by #709
Closed

Clarify intent of JFR module #673

jack-berg opened this issue Jan 4, 2023 · 5 comments · Fixed by #709

Comments

@jack-berg
Copy link
Member

There's been some recent activity in the JFR module to implement the JVM runtime semantic conventions by observing JFR events. It's got my wondering what exactly we're trying to achieve. The recent changes attempt to recreate the same data which is already available through JMX, but with more effort and lower quality.

I like the idea of using JFR events to expose data about the JVM which is not otherwise available, but I'm increasingly skeptical about the value of using JFR to duplicate the metrics we easily get from JMX.

@roberttoyonaga and @kittylyst - I'm especially interested in your perspective on this. Are there situations where a JFR source of these metrics would be superior to JMX?

(Also, apologies for not raising this concern sooner. I appreciate the effort that @roberttoyonaga has recently dedicated to moving this forward.)

Relavant issues: #653, #652, #651, #650

@kittylyst
Copy link
Contributor

Can you be more specific about where you see metrics that are more easily got from JMX?

In general, apart from averages - where we have already lost resolution and the raw JFR data should be superior - I'm not 100% sure what you mean by this.

Part of the issue here is, I think, that the spec was developed in reference purely to the JMX implementation.

I am quite unsure that there is any appetite from JDK folks to do much (if any) work on the JMX subsystem to add anything new - instead, the impression that I've had from conversations was that anything new should go into JFR.

So, from my perspective, it makes sense to see where the gaps are in the data JFR makes available and if there are any significant ones, then look at the possibility of creating new JFR events and seeking to have them merged into OpenJDK.

@roberttoyonaga
Copy link
Contributor

I think another reason is that JFR and JMX gather different information. For example, JFR provides lots of data on monitors (contention, wait, notify, inflation) while JMX does not. I think, with respect to JVM info, JFR provides more overall data. So in the future, if we choose to expand the semantic conventions, JFR would give use more room to do so.

@roberttoyonaga
Copy link
Contributor

I also agree that a big reason that the JFR streaming implementation struggles to meet some of the specifications in the semantic conventions is because they were made with JMX in mind. I'll write up a list shortcomings with respect to the current semantic conventions, and maybe we can agree on places to modify them slightly.

@jack-berg
Copy link
Member Author

Can you be more specific about where you see metrics that are more easily got from JMX?

I'm referring to semantic convention metrics, which are already captured via JMX here. For example, capturing memory pool metrics from JMX is simpler and more complete than capturing the same information via JFR.

Part of the issue here is, I think, that the spec was developed in reference purely to the JMX implementation.

I think another reason is that JFR and JMX gather different information. For example, JFR provides lots of data on monitors (contention, wait, notify, inflation) while JMX does not

Yup! I'd like to see us take advantage of the relative strengths of each data source. Can we let JMX be the source for the metrics specified in the existing semantic conventions, and use JFR for completely new metrics to be added to the semantic conventions as well?

@jack-berg
Copy link
Member Author

We chatted in the 1/5 Java SIG and there was consensus that its OK to continue duplicating the metrics which already have JMX sources, but that we should work to provide options to group the various JFR handlers by feature / function and allow the the groups to be toggled on and off by consumers.

jack-berg added a commit that referenced this issue Jan 27, 2023
Resolves #673.

Groups `RecordedEventHandlers` into "features", which can be toggled on
and off, and which are enabled by default if not already implemented in
the JMX based
[runtime-metrics](https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/runtime-metrics/library).

This was discussed in the 1/5 SIG, as mentioned in this
[comment](#673 (comment)).

A side effect of this is that the tests can enable the precise set of
features that they test, further increasing isolation.

Also extends `JfrTelemetry#close()` to close all the observable
instruments so that they stop reported data. Does this by having
`RecordedEventHandler` extend `Closeable`, and having each
implementation track its observable instruments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants