[GR-39497] Add image stats reporter to build output #4481
Conversation
Example usage (of the file-based one, included in the patch):
Or if somebody wants to do something else with the image stats, implement a feature similar to this:
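For illustration, a minimal sketch of what such a Feature could look like. This assumes the ImageStatsReporter/ImageStatsReporterRegistry extension points this patch proposes; the exact signatures here are assumptions, not the actual API:

    import java.util.Map;

    import org.graalvm.nativeimage.hosted.Feature;

    // Sketch only: ImageStatsReporter and ImageStatsReporterRegistry are the
    // extension points proposed in this patch; ImageStatsReporter is assumed
    // here to be a functional interface receiving the stats as a Map.
    public class CustomStatsFeature implements Feature {

        @Override
        public void afterRegistration(AfterRegistrationAccess access) {
            // Register a custom reporter instead of (or next to) the default
            // file-based one.
            ImageStatsReporterRegistry.register(stats -> {
                // Do something else with the image stats, e.g. print them
                // or POST them to a metrics service.
                for (Map.Entry<String, Object> entry : stats.entrySet()) {
                    System.out.println(entry.getKey() + " = " + entry.getValue());
                }
            });
        }
    }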
@fniephaus @zakkak Would something like that be useful? I'm looking into gathering some data for various apps and a framework like that would help me a lot. Thoughts?
Yes, something like that would be useful. In fact, we had some discussions around this (e.g., #3955 (comment)) when introducing the new build output. Before we start discussing implementation details, could you elaborate some more on why you want to collect build metrics? What are your use cases?

Providing a machine-readable file is a first step, but we might also want to provide some useful "consumers". One use case could be the detection of a significant change of a certain variable (e.g., reachable classes or file size of the executable) and then trigger something (e.g., fail a CI build with non-zero exit code). Of course, "significant" may depend on what variable you choose. Another use case is monitoring/tracking. For that, we might want to look into common file formats and maybe even provide support for a specific setup.
Right now, it's general data gathering. For example, we'd like to collect some evidence that for app
OK, seems a valid point. For us that's definitely a secondary goal. Having said that, it should be fairly easy to achieve based on this patch by a slight refactoring of how
I'm not sure what you are trying to say with this. Do you think monitoring/tracking is orthogonal to this? For me the format should be
Ok, that's the monitoring/tracking use case I was talking about.
I think we should build stuff like this around Native Image, not into it. An (optional?) file in a well-defined format should be more than enough. If you want to write consumers in Java, you can still do that (and even compile them to native executables 😅).
JSON output (plus a JSON schema) seems reasonable. I guess all I'm trying to say is that I think it'd be good to provide some basic/useful consumers (e.g., scripts for analysis), or at least some examples of how the output data could be used (e.g., a GitHub action setup that fails when image size increased by X percent).
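As a concrete illustration of that last use case, a consumer could be as small as the sketch below. It assumes the image_details.total_bytes field discussed later in this thread and a stored baseline JSON file; the regex-based extraction is a stand-in for a real JSON parser:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch of a CI gate: exit non-zero if the image grew by more than 5%
    // compared to a stored baseline. Uses a crude regex instead of a proper
    // JSON parser purely to stay dependency-free in this example.
    public class ImageSizeGate {
        private static long totalBytes(Path jsonFile) throws Exception {
            String json = Files.readString(jsonFile);
            Matcher m = Pattern.compile("\"total_bytes\"\\s*:\\s*(\\d+)").matcher(json);
            if (!m.find()) {
                throw new IllegalStateException("total_bytes not found in " + jsonFile);
            }
            return Long.parseLong(m.group(1));
        }

        public static void main(String[] args) throws Exception {
            long baseline = totalBytes(Path.of(args[0])); // e.g. baseline.json
            long current = totalBytes(Path.of(args[1]));  // e.g. build-output.json
            double increasePercent = 100.0 * (current - baseline) / baseline;
            System.out.printf("Image size: %d -> %d bytes (%+.2f%%)%n", baseline, current, increasePercent);
            if (increasePercent > 5.0) {
                System.exit(1); // fail the CI build
            }
        }
    }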
I'm not sure I agree. Why not do both in one shot? Clearly, something would have to be part of native-image so that the metrics can get exported to the file. The proposed patch basically does the optional file part you mention, but also allows for a different kind of consumer, whatever it is. A nice thing about a design like this is that an application specific
Sure. We are hoping to have a bit more flexibility though. For example, if the native-image generation happens in a container, the files would need to get extracted from the container first. A better fit for it would be to print metrics to standard out, or call a web service and post the JSON to it, or do something else entirely. @fniephaus Please provide some guidance as to what the next steps should be. Thanks!
I'm OOO for the rest of the week. Will look into this next week.
Typical examples include benchmarking or tracking regressions. For example, you might want to see how the metrics compare across different combinations of user code and/or GraalVM versions, etc. It can also help track regressions: when a certain combination results in higher build times or bigger binaries, you'd want to find out what the metrics look like before/after and see if you can spot where it has gotten worse.
Sorry for the delay, just looked into the PR. With a change stat of

@jerboaa what monitoring system(s) were you targeting with this? Something like OpenTelemetry?
This post-processing becomes a bit cumbersome when trying to extract this data from CI (think GHA). A more flexible setup which could upload statistics to a server and/or fail the build if some statistics change in a significant way would be nice to have.

Thanks for having taken a look at this. This PR includes a couple of preparatory steps (therefore it might be larger than what you'd expect):
Since the stats are now collected in a class separate from
Right now we are developing a custom collector (web application) with some rudimentary UI. Something like OpenTelemetry could be supported as well, I guess. The key point of this patch is to separate the data collecting from the UI logic and to add some flexibility in terms of how those stats can get retrieved.
That's a very common problem and CI providers have solutions for this (GHA, for example, supports artifacts). Adding this to the builder, on the other hand, adds another potential source of errors: just imagine your monitoring server is suddenly down and now your Native Image build fails because of that. It shouldn't.
I didn't check in detail and only wanted to mention that we need to make sure the numbers are identical.
Rolling a custom script that parses whatever JSON output we provide in the end and uploads relevant metrics to whatever service seems most flexible to me and is relatively easy to implement in the builder. Of course, we could provide some prepared scripts for well-established platforms such as OpenTelemetry.
That's fair enough. Mind you, the way I have it working locally here is that it would just not upload it.
So how can we make progress on this? Would any other form of a patch like this be acceptable for integration? If yes, what would the requirements be for it to get accepted? Would it be acceptable for me to proceed with this PR with the following modifications: keep the separation of the data-collecting part and only support writing a JSON file (in addition to the console output)? I'd remove the extra infrastructure like
If we consider the generation of the JSON file a responsibility of the

Anyway, maybe we should schedule a call on this topic and discuss implementation details and use cases.
OK. I'll reach out to you next week.
@fniephaus and I have agreed to first work on a "supportable" JSON schema. Once that's been agreed on, an implementation will follow. First draft of a JSON schema is here:
@jerboaa Does the schema include the possibility of adding time spent in each of the phases? E.g. analysis 10.3 secs, building universe 1 sec, parsing ..., inlining ...
@galderz No, not yet. I'm waiting on @fniephaus' feedback to get this moving again. It's conceivable that we could add it to https://github.com/jerboaa/build-stats-json/blob/main/json_formatted.txt#L13 (
To get better granularity of changes in the total time. E.g. if you're doing a benchmark and the total time has gone up, you'd want to get an idea of which phase(s) have caused the time to go up. Or maybe you're testing the effects that certain flags (e.g. parse once) have on the time spent in different phases. Aside from time, you'd maybe want to get a clearer idea of what each phase contributes in terms of memory usage, but I'm not sure we can measure that precisely enough. Pre-22.0, the output would show total memory after each phase, but I don't think this was very precise.
I understand why precise timing details would be useful, but I have two reservations: 1) this seems to be only interesting for power users that understand what each stage does and want to benchmark, and, more importantly, 2) the goal is that the JSON is considered public API, but each build stage is an implementation detail and subject to change at any time. We could add this info, but it will be annoying when things do change.
Total memory wasn't really useful, which is why it now shows used memory. However, even used memory doesn't really tell you much, especially when the GC happens to clean up a lot right before a build stage ends. I don't want to block any of these things and I hope you understand what I'm trying to say. Anyway, I'd still like to understand your use case a bit more, @galderz. Do you really think fine-grained timing info would actually be helpful, or just nice to have throughout the development process of a native app? Into what type of system would you feed such a JSON?
One way of implementing this could be to add such things under a special JSON field and make clear that anything under this field is subject to change. Alternatively, we could generate a companion JSON file including the non-API parts when a special flag is used. WDYT?
Right. Of course, eventually people will rely on the private API and then it's just like public API. Also, it will be harder to argue against other things that could also be exposed as "private API", and then I'm also concerned about the file size of the JSON (it might matter if people start storing hundreds of these, and JSON files can get quite large... see the dashboard outputs). What we could also do is provide the timing info in a generic way that is easier to change in the future. Something like:

    "stage_timings": [
      {"name": "initializing", "millis": 1234},
      {"name": "analysis", "millis": 2345},
      ...
    ]
Irrespective of the end user use case, both pre- and post-22.0 outputs have been listing phases and the time spent in each of them. To not include that in the JSON output would be odd IMO. You could define the JSON in such a way that the phase names are not fixed, so you're not limiting yourself to specific phases.
Yeah, happy to leave memory out for per-phase info.
For the past year I've been extracting information out of the pre- and post-22.0 outputs into CSV files, and then importing them into Google spreadsheets to create diagrams. E.g.:

I'm hoping to switch that code to read the JSON instead, in order to extract information more easily.
I don't think it's necessarily odd, because the build output was never intended to be considered a public API; it's for human consumption only and, as I mentioned, it can change at any point in time.
That's what I was proposing in #4481 (comment). However, we of course already have infrastructure for measuring timing events. Take a look at this: 6f0d40b#diff-4076a2705a4cfe89f104024bd4a4ce19096184ea7431452b7f58a78feff8e50a. Note that we don't really consider these options public API either, but it seems like we should keep timing events out of the JSON for now.
Very cool! Thanks for sharing that!
I iterated over the

{
  "general_info": {
    "name": "helloworld",
    "graalvm_version": "GraalVM 22.2.0-dev Java 17 CE",
    "java_version": "17.0.4+5-jvmci-22.2-b03",
    "c_compiler": "gcc (linux, x86_64, 9.3.0)",
    "garbage_collector": "Serial GC",
    "user_specific_features": 4
  },
  "analysis_results": {
    "classes": {
      "total": 3850,
      "reachable": 2839,
      "reflection": 28,
      "jni": 58
    },
    "fields": {
      "total": 6665,
      "reachable": 3400,
      "reflection": 0,
      "jni": 58
    },
    "methods": {
      "total": 29038,
      "reachable": 12916,
      "reflection": 332,
      "jni": 52
    }
  },
  "image_details": {
    "total_bytes": 13057934,
    "code_area": {
      "bytes": 4610048,
      "compilation_units": 67007
    },
    "image_heap": {
      "bytes": 7307264,
      "resources": {
        "count": 134,
        "bytes": 10200
      }
    },
    "debug_info": {
      "bytes": 1140622
    }
  },
  "resource_usage": {
    "cpu_load": 8.38,
    "gc_count": 17,
    "gc_time_sec": 0.9245,
    "peak_rss_bytes": 3506065408,
    "total_time_sec": 28.874970166
  },
  "artifacts": [
    {"path": "/home/janedoe/helloworld/helloworld", "type": "executable"},
    {"path": "/home/janedoe/helloworld/sources", "type": "debug_info"},
    {"path": "/home/janedoe/helloworld/some.so", "type": "shared_lib"}
  ]
}

Comments:
Please let me know what you think! Maybe @olpaw could comment on the two proposals as well?
This information is not very valuable. Either list the FQNs of the feature classes or remove that information entirely.
Instead of total_time_sec, times for each stage would be preferable, to e.g. track whether individual stages suddenly become more expensive or whether a build time increase reflects somewhat evenly across all stages (i.e. due to CI infrastructure changes).
I would keep produced artifacts separate from this output. But we might also want to switch to JSON for the output format there.
+1. We can always add it later when there is a use case for it.
Actually that should be dropped entirely. If timing info is needed we should all use
Thanks, @fniephaus for the feedback!
[...]
I've updated the schema. Essentially, I feel like we should keep the CPU cores value, as that directly relates to the CPU load metric (a load of

Therefore, I propose this updated example schema (more comments below):

{
"general_info": {
"name": "helloworld",
"graalvm_version": "GraalVM 22.2.0-dev Java 17 CE",
"java_version": "17.0.4+5-jvmci-22.2-b03",
"c_compiler": "gcc (linux, x86_64, 9.3.0)",
"garbage_collector": "Serial GC"
},
"analysis_results": {
"classes": {
"total": 3850,
"reachable": 2839,
"reflection": 28,
"jni": 58
},
"fields": {
"total": 6665,
"reachable": 3400,
"reflection": 0,
"jni": 58
},
"methods": {
"total": 29038,
"reachable": 12916,
"reflection": 332,
"jni": 52
}
},
"image_details": {
"total_bytes": 13057934,
"code_area": {
"bytes": 4610048,
"compilation_units": 67007
},
"image_heap": {
"bytes": 7307264,
"resources": {
"count": 134,
"bytes": 10200
}
},
"debug_info": {
"bytes": 1140622
}
},
"resource_usage": {
"cpu": {
"load": 8.38,
"cores_total": 10
},
"gc_count": 17,
"gc_time_sec": 0.9245,
"memory": {
"system_total": 33254146048,
"peak_rss_bytes": 3506065408
}
}
}

Comments:
Please let me know what you think!
The proposal in #4481 (comment) looks very good! And your additions regarding CPU/memory make sense, thanks for revising! I think we can start implementing. :) Two minor nits: to be consistent, should we rename
OK. I'll update the schema and then start implementing.
Makes sense. I'll add those too. Thanks for the feedback! |
Force-pushed from c9e8029 to 5fb696e.
@fniephaus I'm moving this out of draft. Please review! It's a simple
Generates this JSON, which validates against the schema:

{
"resource_usage": {
"memory": {
"system_total": 33260355584,
"peak_rss_bytes": 3127443456
},
"garbage_collection": {
"count": 20,
"total_secs": 1.097
},
"cpu": {
"load": 6.307451297470753,
"total_cores": 8
}
},
"image_details": {
"debug_info": {
"bytes": 7694974
},
"code_area": {
"bytes": 4181808,
"compilation_units": 7040
},
"total_bytes": 20157545,
"image_heap": {
"bytes": 7233536,
"resources": {
"bytes": 142884,
"count": 5
}
}
},
"general_info": {
"c_compiler": "gcc (redhat, x86_64, 11.3.1)",
"name": "hello",
"java_version": "11.0.16-beta+7-202206201130",
"garbage_collector": "Serial GC",
"graalvm_version": "GraalVM 22.3.0-dev Java 11 Mandrel Distribution"
},
"analysis_results": {
"methods": {
"total": 27123,
"reflection": 267,
"jni": 52,
"reachable": 12255
},
"classes": {
"total": 3725,
"reflection": 27,
"jni": 58,
"reachable": 2709
},
"fields": {
"total": 6388,
"reflection": 0,
"jni": 58,
"reachable": 3430
}
}
}

Let me know what you think!
Cool! It's at the top of my list for next week. What happens if the build fails? Do I get an incomplete JSON or no JSON currently? What should the behaviour be in that case?
Great, thanks!
Right now, I think it doesn't produce complete output (like the text output on stdout), but it should easily be adjustable to not produce any output in that case. What is the preference for failed builds? Aside: I'll fix the checkstyle issues in the meantime.
Force-pushed from 5fb696e to 54cc169.
Looking good and lightweight so far. Left some minor remarks. I think one main question we still need to answer is: what about optional values? When users write a consumer for these JSON files, they will have to check for optional values somehow, which may be annoying. What do you think?
(Inline review comments, since resolved, on substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/ProgressReporter.java and docs/reference-manual/native-image/assets/build-output-schema-v1.json.)
This uses a simple backing Map (HashMap) for collecting the data. It's entirely driven by ProgressReporter and is entirely disabled unless requested by the user by specifying `-H:BuildOutputJSONFile=<file>` on the command line. It also adds a JSON schema to the Native Image manual as an asset.
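A rough sketch of the flag-gated flow this commit message describes (illustrative only, not the actual patch): stats accumulate in a HashMap, and nothing is written unless the user requested a JSON file:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.HashMap;
    import java.util.Map;

    // Sketch: ProgressReporter would call put(...) as it prints the regular
    // build output; flush(...) is a no-op unless a JSON file was requested
    // via -H:BuildOutputJSONFile.
    public class BuildStats {
        private final Map<String, Object> stats = new HashMap<>();

        public void put(String key, Object value) {
            stats.put(key, value);
        }

        public void flush(Path jsonFileOrNull) throws IOException {
            if (jsonFileOrNull == null) {
                return; // JSON output not requested; console output is unaffected
            }
            StringBuilder sb = new StringBuilder("{");
            String sep = "";
            for (Map.Entry<String, Object> e : stats.entrySet()) {
                Object v = e.getValue();
                // Note: values are not escaped; a real implementation would
                // use a proper JSON writer.
                sb.append(sep).append('"').append(e.getKey()).append("\": ")
                  .append(v instanceof Number ? v.toString() : "\"" + v + "\"");
                sep = ", ";
            }
            sb.append('}');
            Files.writeString(jsonFileOrNull, sb.toString());
        }
    }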
Force-pushed from 54cc169 to cdf505d.
Thanks for the review! I've updated the patch.

The only true optional value is

Other than that, the remaining "optional" values are

More thoughts?
Apart from changing the
Thanks for incorporating the feedback! I'm happy to take care of the rest and get it merged. :)
Not sure what you mean. Could you clarify? Say you'd have an application (it most likely would need a JDK which doesn't use any JNI) which has
Would you rather have that

Ideally, those JNI members would be boxed (or use an Object to collect members), using the unset state
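Roughly what the boxed approach could look like on the producer side (a sketch, assuming the stats live in a map before serialization): null marks "unset", so the key can simply be omitted rather than emitting a misleading 0:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch: a boxed Long distinguishes "never recorded" (null) from a real 0.
    public class AnalysisCounts {
        private Long jniMethods; // null == unset, not zero

        public void recordJniMethods(long count) {
            this.jniMethods = count;
        }

        public Map<String, Object> toJsonMap() {
            Map<String, Object> map = new LinkedHashMap<>();
            if (jniMethods != null) {
                map.put("jni", jniMethods); // only emitted once a value was recorded
            }
            return map;
        }
    }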
OK, thank you!
I just realized that there is
Merged to
Thank you!
Hmmm, so where do those timing events go? To the standard output? It'd a pain having to parse both a JSON and standard output to get all the info :\ |
Btw, the reason I'm so adamant about timings per phase is because I see people making assumptions like "most of the native image build time is spent in method compilation" (e.g. Carlo in this presentation), and that's certainly not the case for all applications. E.g. Quarkus applications tend to spend more time on analysis than on method compilation. The overall time is not enough to understand these kinds of nuances.
@jerboaa pointed out that
This adds a basic build stats data collecting framework
which can be used to serialize build metrics to disk (default)
or by some other means, by implementing ImageStatsReporter
and registering the reporter via ImageStatsReporterRegistry
in a Feature in the afterRegistration phase.

Extra reporting is enabled via -H:+BuildOutputStats, which
by default produces a file with image stats in JSON format.