[GR-39497] Add image stats reporter to build output #4481
Conversation
Example usage (of the file-based one, included in the patch):
Or if somebody wants to do something else with the image stats, implement a feature similar to this:
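For illustration, a minimal sketch of what such a Feature could look like. This assumes the ImageStatsReporter/ImageStatsReporterRegistry extension points this patch proposes; the exact signatures here are assumptions, not the actual API:

    import java.util.Map;

    import org.graalvm.nativeimage.hosted.Feature;

    // Sketch only: ImageStatsReporter and ImageStatsReporterRegistry are the
    // extension points proposed in this patch; ImageStatsReporter is assumed
    // here to be a functional interface receiving the stats as a Map.
    public class CustomStatsFeature implements Feature {

        @Override
        public void afterRegistration(AfterRegistrationAccess access) {
            // Register a custom reporter instead of (or next to) the default
            // file-based one.
            ImageStatsReporterRegistry.register(stats -> {
                // Do something else with the image stats, e.g. print them
                // or POST them to a metrics service.
                for (Map.Entry<String, Object> entry : stats.entrySet()) {
                    System.out.println(entry.getKey() + " = " + entry.getValue());
                }
            });
        }
    }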
@fniephaus @zakkak Would something like that be useful? I'm looking into gathering some data for various apps and a framework like that would help me a lot. Thoughts?
Yes, something like that would be useful. In fact, we had some discussions around this (e.g., #3955 (comment)) when introducing the new build output. Before we start discussing implementation details, could you elaborate some more on why you want to collect build metrics? What are your use cases?

Providing a machine-readable file is a first step, but we might also want to provide some useful "consumers". One use case could be the detection of a significant change of a certain variable (e.g., reachable classes or file size of the executable) and then trigger something (e.g., fail a CI build with non-zero exit code). Of course, "significant" may depend on what variable you choose. Another use case is monitoring/tracking. For that, we might want to look into common file formats and maybe even provide support for a specific setup.
Right now, it's general data gathering. For example, we'd like to collect some evidence that for app
OK, seems a valid point. For us that's definitely a secondary goal. Having said that, it should be fairly easy to achieve based on this patch by a slight refactoring of how
I'm not sure what you are trying to say with this. Do you think monitoring/tracking is orthogonal to this? For me the format should be
Ok, that's the monitoring/tracking use case I was talking about.
I think we should build stuff like this around Native Image, not into it. An (optional?) file in a well-defined format should be more than enough. If you want to write consumers in Java, you can still do that (and even compile them to native executables 😅).
JSON output (plus a JSON schema) seems reasonable. I guess all I'm trying to say is that I think it'd be good to provide some basic/useful consumers (e.g., scripts for analysis), or at least some examples of how the output data could be used (e.g., a GitHub action setup that fails when image size increased by X percent).
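As a concrete illustration of that last use case, a consumer could be as small as the sketch below. It assumes the image_details.total_bytes field discussed later in this thread and a stored baseline JSON file; the regex-based extraction is a stand-in for a real JSON parser:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch of a CI gate: exit non-zero if the image grew by more than 5%
    // compared to a stored baseline. Uses a crude regex instead of a proper
    // JSON parser purely to stay dependency-free in this example.
    public class ImageSizeGate {
        private static long totalBytes(Path jsonFile) throws Exception {
            String json = Files.readString(jsonFile);
            Matcher m = Pattern.compile("\"total_bytes\"\\s*:\\s*(\\d+)").matcher(json);
            if (!m.find()) {
                throw new IllegalStateException("total_bytes not found in " + jsonFile);
            }
            return Long.parseLong(m.group(1));
        }

        public static void main(String[] args) throws Exception {
            long baseline = totalBytes(Path.of(args[0])); // e.g. baseline.json
            long current = totalBytes(Path.of(args[1]));  // e.g. build-output.json
            double increasePercent = 100.0 * (current - baseline) / baseline;
            System.out.printf("Image size: %d -> %d bytes (%+.2f%%)%n", baseline, current, increasePercent);
            if (increasePercent > 5.0) {
                System.exit(1); // fail the CI build
            }
        }
    }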
I'm not sure I agree. Why not do both in one shot? Clearly, something would have to be part of native-image so that the metrics can get exported to the file. The proposed patch basically does the optional file part you mention, but also allows for a different kind of consumer, whatever it is. A nice thing about a design like this is that an application specific
Sure. We are hoping to have a bit more flexibility though. For example, if the native-image generation happens in a container, the files would need to get extracted from the container first. A better fit for it would be to print metrics to standard out, or call a web service and post the JSON to it, or do something else entirely. @fniephaus Please provide some guidance as to what the next steps should be. Thanks!
I'm OOO for the rest of the week. Will look into this next week.
Typical examples include benchmarking or tracking regressions. For example, you might want to see how the metrics compare across different combinations of user code and/or GraalVM versions, etc. It can also help track regressions: when a certain combination results in higher build times or bigger binaries, you'd want to find out what the metrics look like before/after and see if you can spot where it has gotten worse.
Sorry for the delay, just looked into the PR. With a change stat of

@jerboaa what monitoring system(s) were you targeting with this? Something like OpenTelemetry?
This post-processing becomes a bit cumbersome when trying to extract this data from CI (think GHA). A more flexible setup which could upload statistics to a server and/or fail the build if some statistics change in a significant way would be nice to have.

Thanks for having taken a look at this. This PR includes a couple of preparatory steps (therefore it might be larger than what you'd expect):
Since the stats are now collected in a class separate from
Right now we are developing a custom collector (web application) with some rudimentary UI. Something like OpenTelemetry could be supported as well, I guess. The key point of this patch is to separate the data collecting from the UI logic and to add some flexibility in terms of how those stats can get retrieved.
That's a very common problem and CI providers have solutions for this (GHA, for example, supports artifacts). Adding this to the builder, on the other hand, adds another potential source of errors: just imagine your monitoring server is suddenly down and now your Native Image build fails because of that. It shouldn't.
I didn't check in detail and only wanted to mention that we need to make sure the numbers are identical.
Rolling a custom script that parses whatever JSON output we provide in the end and uploads relevant metrics to whatever service seems most flexible to me and is relatively easy to implement in the builder. Of course, we could provide some prepared scripts for well-established platforms such as OpenTelemetry.
That's fair enough. Mind you, the way I have it working locally here is that it would just not upload it.
So how can we make progress on this? Would any other form of a patch like this be acceptable for integration? If yes, what would the requirements be for it to get accepted? Would it be acceptable for me to proceed with this PR with the following modifications: keep the separation of the data-collecting part and only support writing a JSON file (in addition to the console output)? I'd remove the extra infrastructure like
If we consider the generation of the JSON file a responsibility of the

Anyway, maybe we should schedule a call on this topic and discuss implementation details and use cases.
OK. I'll reach out to you next week.
@fniephaus and I have agreed to first work on a "supportable" JSON schema. Once that's been agreed on, an implementation will follow. First draft of a JSON schema is here:
@jerboaa Does the schema include the possibility of adding time spent in each of the phases? E.g. analysis 10.3 secs, building universe 1 sec, parsing ..., inlining ...
@galderz No, not yet. I'm waiting on @fniephaus' feedback to get this moving again. It's conceivable that we could add it to https://github.com/jerboaa/build-stats-json/blob/main/json_formatted.txt#L13 (
To get better granularity of changes in the total time. E.g. if you're doing a benchmark and the total time has gone up, you'd want to get an idea of which phase(s) have caused the time to go up. Or maybe you're testing the effects that certain flags (e.g. parse once) have on the time spent in different phases. Aside from time, you'd maybe want to get a clearer idea of what each phase contributes in terms of memory usage, but I'm not sure we can measure that precisely enough. Pre-22.0, the output would show total memory after each phase, but I don't think this was very precise.
I understand why precise timing details would be useful, but I have two reservations: 1) this seems to be only interesting for power users that understand what each stage does and want to benchmark, and, more importantly, 2) the goal is that the JSON is considered public API, but each build stage is an implementation detail and subject to change at any time. We could add this info, but it will be annoying when things do change.
Total memory wasn't really useful, which is why it now shows used memory. However, even used memory doesn't really tell you much, especially when the GC happens to clean up a lot right before a build stage ends. I don't want to block any of these things and I hope you understand what I'm trying to say. Anyway, I'd still like to understand your use case a bit more, @galderz. Do you really think fine-grained timing info would actually be helpful, or just nice to have throughout the development process of a native app? Into what type of system would you feed such a JSON?
One way of implementing this could be to add such things under a special JSON field and make clear that anything under this field is subject to change. Alternatively, we could generate a companion JSON file including the non-API parts when a special flag is used. WDYT?
Right. Of course, eventually people will rely on the private API and then it's just like public API. Also, it will be harder to argue against other things that could also be exposed as "private API", and then I'm also concerned about the file size of the JSON (it might matter if people start storing hundreds of these, and JSON files can get quite large... see the dashboard outputs). What we could also do is provide the timing info in a generic way that is easier to change in the future. Something like:

    "stage_timings": [
      {"name": "initializing", "millis": 1234},
      {"name": "analysis", "millis": 2345},
      ...
    ]
Irrespective of the end user use case, both pre- and post-22.0 outputs have been listing phases and the time spent in each of them. To not include that in the JSON output would be odd IMO. You could define the JSON in such a way that the phase names are not fixed, so you're not limiting yourself to specific phases.
Yeah, happy to leave memory out for per-phase info.
For the past year I've been extracting information out of the pre- and post-22.0 outputs into CSV files, and then importing them into Google spreadsheets to create diagrams. E.g.:

I'm hoping to switch that code to read the JSON instead, in order to extract information more easily.
I don't think it's necessarily odd, because the build output was never intended to be considered a public API; it's for human consumption only and, as I mentioned, it can change at any point in time.
That's what I was proposing in #4481 (comment). However, we of course already have infrastructure for measuring timing events. Take a look at this: 6f0d40b#diff-4076a2705a4cfe89f104024bd4a4ce19096184ea7431452b7f58a78feff8e50a. Note that we don't really consider these options public API either, but it seems like we should keep timing events out of the JSON for now.
Very cool! Thanks for sharing that!
I iterated over the

{
  "general_info": {
    "name": "helloworld",
    "graalvm_version": "GraalVM 22.2.0-dev Java 17 CE",
    "java_version": "17.0.4+5-jvmci-22.2-b03",
    "c_compiler": "gcc (linux, x86_64, 9.3.0)",
    "garbage_collector": "Serial GC",
    "user_specific_features": 4
  },
  "analysis_results": {
    "classes": {
      "total": 3850,
      "reachable": 2839,
      "reflection": 28,
      "jni": 58
    },
    "fields": {
      "total": 6665,
      "reachable": 3400,
      "reflection": 0,
      "jni": 58
    },
    "methods": {
      "total": 29038,
      "reachable": 12916,
      "reflection": 332,
      "jni": 52
    }
  },
  "image_details": {
    "total_bytes": 13057934,
    "code_area": {
      "bytes": 4610048,
      "compilation_units": 67007
    },
    "image_heap": {
      "bytes": 7307264,
      "resources": {
        "count": 134,
        "bytes": 10200
      }
    },
    "debug_info": {
      "bytes": 1140622
    }
  },
  "resource_usage": {
    "cpu_load": 8.38,
    "gc_count": 17,
    "gc_time_sec": 0.9245,
    "peak_rss_bytes": 3506065408,
    "total_time_sec": 28.874970166
  },
  "artifacts": [
    {"path": "/home/janedoe/helloworld/helloworld", "type": "executable"},
    {"path": "/home/janedoe/helloworld/sources", "type": "debug_info"},
    {"path": "/home/janedoe/helloworld/some.so", "type": "shared_lib"}
  ]
}

Comments:
Please let me know what you think! Maybe @olpaw could comment on the two proposals as well?
This information is not very valuable. Either list the FQNs of the feature classes or remove that information entirely.
Instead of total_time_sec, times for each stage would be preferable, to e.g. track whether individual stages suddenly become more expensive or whether a build time increase reflects somewhat evenly across all stages (i.e. due to CI infrastructure changes).
I would keep produced artifacts separate from this output. But we might also want to switch to JSON for the output format there.
+1. We can always add it later when there is a use case for it.
Actually that should be dropped entirely. If timing info is needed we should all use
Thanks, @fniephaus for the feedback!
[...]
I've updated the schema. Essentially, I feel like we should keep the CPU cores value, as that directly relates to the CPU load metric (a load of

Therefore, I propose this updated example schema (more comments below):

{
"general_info": {
"name": "helloworld",
"graalvm_version": "GraalVM 22.2.0-dev Java 17 CE",
"java_version": "17.0.4+5-jvmci-22.2-b03",
"c_compiler": "gcc (linux, x86_64, 9.3.0)",
"garbage_collector": "Serial GC"
},
"analysis_results": {
"classes": {
"total": 3850,
"reachable": 2839,
"reflection": 28,
"jni": 58
},
"fields": {
"total": 6665,
"reachable": 3400,
"reflection": 0,
"jni": 58
},
"methods": {
"total": 29038,
"reachable": 12916,
"reflection": 332,
"jni": 52
}
},
"image_details": {
"total_bytes": 13057934,
"code_area": {
"bytes": 4610048,
"compilation_units": 67007
},
"image_heap": {
"bytes": 7307264,
"resources": {
"count": 134,
"bytes": 10200
}
},
"debug_info": {
"bytes": 1140622
}
},
"resource_usage": {
"cpu": {
"load": 8.38,
"cores_total": 10
},
"gc_count": 17,
"gc_time_sec": 0.9245,
"memory": {
"system_total": 33254146048,
"peak_rss_bytes": 3506065408
}
}
}

Comments:
Please let me know what you think!
The proposal in #4481 (comment) looks very good! And your additions regarding CPU/memory make sense, thanks for revising! I think we can start implementing. :) Two minor nits: to be consistent, should we rename
OK. I'll update the schema and then start implementing.
Makes sense. I'll add those too. Thanks for the feedback! |
Force-pushed from c9e8029 to 5fb696e.
@fniephaus I'm moving this out of draft. Please review! It's a simple
Generates this JSON, which validates against the schema:

{
"resource_usage": {
"memory": {
"system_total": 33260355584,
"peak_rss_bytes": 3127443456
},
"garbage_collection": {
"count": 20,
"total_secs": 1.097
},
"cpu": {
"load": 6.307451297470753,
"total_cores": 8
}
},
"image_details": {
"debug_info": {
"bytes": 7694974
},
"code_area": {
"bytes": 4181808,
"compilation_units": 7040
},
"total_bytes": 20157545,
"image_heap": {
"bytes": 7233536,
"resources": {
"bytes": 142884,
"count": 5
}
}
},
"general_info": {
"c_compiler": "gcc (redhat, x86_64, 11.3.1)",
"name": "hello",
"java_version": "11.0.16-beta+7-202206201130",
"garbage_collector": "Serial GC",
"graalvm_version": "GraalVM 22.3.0-dev Java 11 Mandrel Distribution"
},
"analysis_results": {
"methods": {
"total": 27123,
"reflection": 267,
"jni": 52,
"reachable": 12255
},
"classes": {
"total": 3725,
"reflection": 27,
"jni": 58,
"reachable": 2709
},
"fields": {
"total": 6388,
"reflection": 0,
"jni": 58,
"reachable": 3430
}
}
}

Let me know what you think!
Cool! It's at the top of my list for next week. What happens if the build fails? Do I get an incomplete JSON or no JSON currently? What should the behaviour be in that case?
Great, thanks!
Right now, I think it doesn't produce complete output (like the text output on stdout), but it should easily be adjustable to not produce any output in that case. What is the preference for failed builds? Aside: I'll fix the checkstyle issues in the meantime.
Force-pushed from 5fb696e to 54cc169.
Looking good and lightweight so far. Left some minor remarks. I think one main question we still need to answer is: what about optional values? When users write a consumer for these JSON files, they will have to check for optional values somehow, which may be annoying. What do you think?
(Inline review comments, since resolved, on substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/ProgressReporter.java and docs/reference-manual/native-image/assets/build-output-schema-v1.json.)
This uses a simple backing Map (HashMap) for collecting the data. It's entirely driven by ProgressReporter and is entirely disabled unless requested by the user by specifying `-H:BuildOutputJSONFile=<file>` on the command line. It also adds a JSON schema to the Native Image manual as an asset.
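A rough sketch of the flag-gated flow this commit message describes (illustrative only, not the actual patch): stats accumulate in a HashMap, and nothing is written unless the user requested a JSON file:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.HashMap;
    import java.util.Map;

    // Sketch: ProgressReporter would call put(...) as it prints the regular
    // build output; flush(...) is a no-op unless a JSON file was requested
    // via -H:BuildOutputJSONFile.
    public class BuildStats {
        private final Map<String, Object> stats = new HashMap<>();

        public void put(String key, Object value) {
            stats.put(key, value);
        }

        public void flush(Path jsonFileOrNull) throws IOException {
            if (jsonFileOrNull == null) {
                return; // JSON output not requested; console output is unaffected
            }
            StringBuilder sb = new StringBuilder("{");
            String sep = "";
            for (Map.Entry<String, Object> e : stats.entrySet()) {
                Object v = e.getValue();
                // Note: values are not escaped; a real implementation would
                // use a proper JSON writer.
                sb.append(sep).append('"').append(e.getKey()).append("\": ")
                  .append(v instanceof Number ? v.toString() : "\"" + v + "\"");
                sep = ", ";
            }
            sb.append('}');
            Files.writeString(jsonFileOrNull, sb.toString());
        }
    }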
Force-pushed from 54cc169 to cdf505d.
Thanks for the review! I've updated the patch.

The only true optional value is

Other than that, the remaining "optional" values are

More thoughts?
Apart from changing the
Thanks for incorporating the feedback! I'm happy to take care of the rest and get it merged. :)
Not sure what you mean. Could you clarify? Say you'd have an application (it most likely would need a JDK which doesn't use any JNI) which has
Would you rather have that

Ideally, those JNI members would be boxed (or use an Object to collect members), using the unset state
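Roughly what the boxed approach could look like on the producer side (a sketch, assuming the stats live in a map before serialization): null marks "unset", so the key can simply be omitted rather than emitting a misleading 0:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch: a boxed Long distinguishes "never recorded" (null) from a real 0.
    public class AnalysisCounts {
        private Long jniMethods; // null == unset, not zero

        public void recordJniMethods(long count) {
            this.jniMethods = count;
        }

        public Map<String, Object> toJsonMap() {
            Map<String, Object> map = new LinkedHashMap<>();
            if (jniMethods != null) {
                map.put("jni", jniMethods); // only emitted once a value was recorded
            }
            return map;
        }
    }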
OK, thank you!
I just realized that there is
Merged to
Thank you!
Hmmm, so where do those timing events go? To the standard output? It'd a pain having to parse both a JSON and standard output to get all the info :\ |
Btw, the reason I'm so adamant about timings per phase is because I see people making assumptions like "most of the native image build time is spent in method compilation" (e.g. Carlo in this presentation), and that's certainly not the case for all applications. E.g. Quarkus applications tend to spend more time on analysis than on method compilation. The overall time is not enough to understand these kinds of nuances.
@jerboaa pointed out that
This adds a basic build stats data collecting framework
which can be used to serialize build metrics to disk (default)
or by some other means, by implementing ImageStatsReporter
and registering the reporter via ImageStatsReporterRegistry
in a Feature in the afterRegistration phase.

Extra reporting is enabled via -H:+BuildOutputStats, which
by default produces a file with image stats in JSON format.