native image build should report RSS size of build process #3879

Closed
jerboaa opened this issue Oct 8, 2021 · 7 comments
@jerboaa
Collaborator

jerboaa commented Oct 8, 2021

Feature request

Add resident set size memory information to the native image build process for a more accurate picture of consumed memory.

Is your feature request related to a problem? Please describe.
We frequently analyze the native image build process for memory consumption. The build output shows the current heap capacity (via Runtime.getRuntime().totalMemory()), but that's not the full picture, as it only shows (some) heap information. Other things consume memory too: native GC structures, class metadata, JIT compiler code caches, etc. This is particularly relevant on cloud systems where the build might run with a memory limit and the native-image build process seemingly gets killed too early by the Linux OOM killer, while the memory reporting of the build process suggests that there is still some room left. That can leave users confused.
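
To illustrate the gap between heap figures and the rest of the JVM's footprint, here is a minimal Java sketch. It is not part of the original report, and the class and method names are made up for illustration. It prints the heap capacity next to the non-heap memory the JVM reports through the standard MemoryMXBean. That non-heap figure only covers JVM-managed pools such as class metadata and code caches, so it is still a lower bound on what counts towards RSS.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper, for illustration only.
class HeapVersusNonHeap {
    /** Current heap capacity in bytes, i.e. Runtime.totalMemory(). */
    static long heapCapacityBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Bytes committed by the JVM for non-heap pools (class metadata, code caches, ...). */
    static long nonHeapCommittedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getCommitted();
    }

    public static void main(String[] args) {
        double gb = 1024.0 * 1024.0 * 1024.0;
        System.out.printf("heap capacity:      %.2f GB%n", heapCapacityBytes() / gb);
        System.out.printf("non-heap committed: %.2f GB%n", nonHeapCommittedBytes() / gb);
    }
}
```

Neither number includes native GC structures or thread stacks, which is why an OS-level metric such as RSS is needed for the whole picture.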

Describe the solution you'd like.
Include the resident set size (RSS) of the process's virtual memory in addition to heap consumption for the native image build process. Example:

[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]    classlist:  15,819.63 ms,  1.19 GB 0.71 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]        (cap):     691.90 ms,  1.19 GB 0.73 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]        setup:   2,464.19 ms,  1.19 GB 0.74 GB (VmRSS)
The bundle named: messages, has not been found. If the bundle is part of a module, verify the bundle name is a fully qualified class name. Otherwise verify the bundle path is accessible in the classpath.
17:07:37,727 INFO  [org.hib.Version] HHH000412: Hibernate ORM core version 5.5.7.Final
17:07:37,812 INFO  [org.hib.ann.com.Version] HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
17:07:38,044 INFO  [org.hib.dia.Dialect] HHH000400: Using dialect: io.quarkus.hibernate.orm.runtime.dialect.QuarkusPostgreSQL10Dialect
17:09:02,211 INFO  [org.jbo.threads] JBoss Threads version 3.4.2.Final
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]     (clinit):   1,169.83 ms,  3.02 GB 3.49 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]   (typeflow): 118,958.38 ms,  3.02 GB 3.49 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]    (objects):  91,161.64 ms,  3.02 GB 3.49 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]   (features):   1,574.82 ms,  3.02 GB 3.49 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]     analysis: 215,031.84 ms,  3.02 GB 3.49 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]     universe:   7,198.35 ms,  3.02 GB 3.53 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]      (parse):  23,000.72 ms,  2.84 GB 3.76 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]     (inline):  21,371.34 ms,  2.83 GB 3.70 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]    (compile): 214,593.69 ms,  3.04 GB 3.74 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]      compile: 261,936.51 ms,  3.04 GB 3.74 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]        image:   9,250.12 ms,  2.98 GB 3.78 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]        write:   1,459.14 ms,  2.98 GB 3.77 GB (VmRSS)
[spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner:47]      [total]: 513,565.72 ms,  2.98 GB 3.77 GB (VmRSS)
# Printing build artifacts to: /project/spring-to-quarkus-todo-0.0.1-SNAPSHOT-runner.build_artifacts.txt

Describe who do you think will benefit the most.
GraalVM users, GraalVM contributors, developers of libraries and frameworks which depend on GraalVM.

Describe alternatives you've considered.
I've done some experiments to hook this onto the existing native-image output. For Linux this can be done via:

native-image <native-image-args> | while IFS= read -r line; do
  printf '%s' "$line"
  # Extract the PID from a leading "[name:PID]" prefix, if the line has one
  pid=$(printf '%s\n' "$line" | sed -n 's/^\[[^]]*:\([0-9][0-9]*\)\].*/\1/p')
  if [ -n "$pid" ] && [ -r "/proc/$pid/status" ]; then
    # VmRSS is reported in kB; convert to GB
    awk '/^VmRSS/ { printf "%5.2f GB (VmRSS)\n", $2 / 1024 / 1024 }' "/proc/$pid/status"
  else
    echo
  fi
done

but this isn't easy to put into all possible build pipelines. A GraalVM patch seems the better approach.
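
For comparison, reading the same value from inside the JVM could look roughly like the sketch below. This is only an illustration under assumptions, not the patch that was eventually merged: the class and method names are hypothetical, and it relies on the Linux-specific /proc/self/status file, returning an empty result elsewhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper, for illustration only.
class ResidentSetSize {
    /** Parses the VmRSS entry (reported in kB) from the lines of a /proc/<pid>/status file. */
    static OptionalLong parseVmRssBytes(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Typical shape: "VmRSS:\t  744 kB"
                String[] fields = line.substring("VmRSS:".length()).trim().split("\\s+");
                try {
                    return OptionalLong.of(Long.parseLong(fields[0]) * 1024L);
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    /** RSS of the current process, or empty where /proc is not available (e.g. non-Linux). */
    static OptionalLong currentRssBytes() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return OptionalLong.empty();
        }
        try {
            return parseVmRssBytes(Files.readAllLines(status));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        OptionalLong rss = currentRssBytes();
        if (rss.isPresent()) {
            System.out.printf("%5.2f GB (VmRSS)%n", rss.getAsLong() / (1024.0 * 1024.0 * 1024.0));
        } else {
            System.out.println("RSS not available on this platform");
        }
    }
}
```

Keeping the parsing separate from the file access makes the Linux-only part easy to swap out should Windows or macOS equivalents be added later.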

Express whether you'd like to help contributing this feature
Feel free to assign this feature request to me. I'll get an initial version (for Linux) implemented. If relevant parties can show me the corresponding Windows/Mac APIs, I'd be happy to incorporate them.

@jerboaa jerboaa added the feature label Oct 8, 2021
jerboaa added a commit to jerboaa/graal that referenced this issue Oct 12, 2021
Caveat is that the RSS values might be lower than the reported
heap usage due to lazy commit. This can be worked around with the
-XX:+AlwaysPreTouch JVM option.

Closes oracle#3879
@fniephaus
Member

Instead of RSS, a portable way would be to report:

Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()

Would that work for your use cases?

@jerboaa
Collaborator Author

jerboaa commented Oct 13, 2021

Instead of RSS, a portable way would be to report:

Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()

Would that work for your use cases?

Thanks for the suggestion. I don't think this would work as both metrics totalMemory()[*] and freeMemory()[**] pertain to the Java heap. So the metric you are suggesting boils down to reporting heap(usage). While useful, it's not the full picture.

[*] implemented as heap(capacity)
[**] implemented as heap(capacity) - heap(usage)
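
Spelled out as a minimal Java sketch (hypothetical class and method names, using only the standard Runtime API), the two footnotes amount to:

```java
// Hypothetical helper, for illustration only.
class HeapUsage {
    /** heap(capacity): what totalMemory() returns. */
    static long heapCapacityBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** heap(usage): capacity minus the free part of the heap. */
    static long heapUsedBytes() {
        Runtime rt = Runtime.getRuntime();
        // freeMemory() is heap(capacity) - heap(usage), so the difference is heap(usage).
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        double gb = 1024.0 * 1024.0 * 1024.0;
        System.out.printf("heap capacity: %.2f GB%n", heapCapacityBytes() / gb);
        System.out.printf("heap usage:    %.2f GB%n", heapUsedBytes() / gb);
    }
}
```

Both values describe the Java heap only; nothing allocated outside the heap shows up in either of them.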

@jerboaa
Collaborator Author

jerboaa commented Oct 13, 2021

See also: #3913 (comment)

jerboaa added a commit to jerboaa/graal that referenced this issue Oct 13, 2021
As an observation the "memory usage" printed before this patch
was the heap capacity retrieved via API
Runtime.getRuntime().totalMemory(). The actual intent seemed to
be to print the heap usage, retrievable with API:
Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()

This patch changes the output to display heap usage instead.
Additionally, the resident set size (rss) is being displayed for
Linux. Other implementations may follow.

Closes oracle#3879
@jerboaa
Collaborator Author

jerboaa commented Oct 13, 2021

@fniephaus I've changed the proposed PR to actually show heap usage (over capacity). This gives good diagnostic info on Linux. On other platforms we wouldn't lose any info, other than the heap(capacity) => heap(usage) change. Then again, I don't know how useful it is to show the heap capacity to begin with. Thoughts?

@fniephaus
Member

The obvious problem with usage is that it can vary a lot. Here's an example:

[image: graph, not preserved]

@jerboaa
Collaborator Author

jerboaa commented Oct 14, 2021

The obvious problem with usage is that it can vary a lot.

Why is that a problem? You get several snapshots for a native image build and you can be sure that it will never pessimize heap consumption. On the other hand, heap capacity almost always does that as your graph shows. When somebody also looks at the RSS values it leaves the reader with lots of question marks as to what's going on when heap usage seems to be larger than RSS sometimes.

@jerboaa
Collaborator Author

jerboaa commented Nov 29, 2021

Closing as this has been added with: e04a9cd

@jerboaa jerboaa closed this as completed Nov 29, 2021