Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to GraalVM 21.3 #20894

Merged
merged 1 commit into from
Nov 2, 2021
Merged

Update to GraalVM 21.3 #20894

merged 1 commit into from
Nov 2, 2021

Conversation

zakkak
Copy link
Contributor

@zakkak zakkak commented Oct 20, 2021

Creating as draft PR till Mandrel 21.3 gets released as well.

Blocked by: quarkusio/quarkus-images#175

@quarkus-bot quarkus-bot bot added area/core area/dependencies Pull requests that update a dependency file area/devtools Issues/PR related to maven, gradle, platform and cli tooling/plugins area/infra-automation anything related to CI, bots, etc. that are used to automated our infrastructure labels Oct 20, 2021
@zakkak zakkak marked this pull request as ready for review October 21, 2021 15:43
@quarkus-bot
Copy link

quarkus-bot bot commented Oct 21, 2021

This workflow status is outdated as a new workflow run has been triggered.

Failing Jobs - Building a3166ec

Status Name Step Failures Logs Raw logs
Native Tests - Main Build ⚠️ Check → Logs Raw logs

@zakkak zakkak self-assigned this Oct 21, 2021
@zakkak
Copy link
Contributor Author

zakkak commented Oct 21, 2021

Looking into the issue... 👀

@zakkak zakkak marked this pull request as draft October 23, 2021 19:25
@zakkak
Copy link
Contributor Author

zakkak commented Oct 23, 2021

Switching back to draft till we short out why main fails

@zakkak
Copy link
Contributor Author

zakkak commented Oct 26, 2021

Apparently main fails because in 21.3 the impact of ParseOnce on memory usage increased even more. For the time being it appears like the best thing to do is to disable ParseOnce like we did in the past, see #17711)

As a result, this PR is currently blocked by: #21018

@vsevel
Copy link
Contributor

vsevel commented Oct 27, 2021

funny the blog post claims memory improvements:

21.3 accumulates our consistent work on improving build times and memory requirements over the past few releases — in fact, even in 21.2 we saw improvements reports from the community, mentioning 20% faster build time and ~9% less memory usage. With 21.3 those improvements have gone even further.

image

@zakkak
Copy link
Contributor Author

zakkak commented Oct 27, 2021

funny the blog post claims memory improvements:

Getting the right numbers and understanding what you are looking at can be a bit tricky.

The "20% faster build time and ~9% less memory usage" part is coming from this tweet. Looking at the screenshots attached we see that native-image was run without setting the max heap size (although in this case it looks like 5G would be enough). This methodology and its results match the ones reported by myself in #19511 (comment). However, as mentioned in #21018:

#19511 didn't take into account runs with constraint memory and it also measured the max committed heap size instead of the max heap usage.

What that means is that despite seeing GraalVM report a smaller heap size (note that GraalVM prints the heap capacity instead of the heap usage, see oracle/graal#3913 and oracle/graal#3879) the actual heap usage (due to a larger live data set) might be higher. For instance, when compiling integration-tests/main with -H:+ParseOnce (the default) we get:

[1.098s][info][gc             ] GC(1) Pause Full (Metadata GC Threshold) 21M->21M(981M) 33.700ms
[1.837s][info][gc             ] GC(3) Pause Full (Metadata GC Threshold) 50M->49M(981M) 107.928ms
[2.855s][info][gc             ] GC(6) Pause Full (Metadata GC Threshold) 68M->66M(981M) 195.273ms
[4.637s][info][gc             ] GC(10) Pause Full (Metadata GC Threshold) 84M->82M(1214M) 218.478ms
[7.717s][info][gc             ] GC(15) Pause Full (Metadata GC Threshold) 113M->108M(1535M) 299.471ms
[11.592s][info][gc             ] GC(19) Pause Full (Metadata GC Threshold) 167M->159M(2029M) 479.966ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]    classlist:  15,316.48 ms,  1.99 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]        (cap):     461.53 ms,  1.99 GB
[19.304s][info][gc             ] GC(23) Pause Full (Metadata GC Threshold) 274M->262M(2038M) 787.593ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]        setup:   4,257.11 ms,  1.99 GB
[40.280s][info][gc             ] GC(27) Pause Full (Ergonomics) 858M->819M(2648M) 2668.260ms
...
[660.914s][info][gc             ] GC(142) Pause Full (Ergonomics) 4233M->4020M(4677M) 8852.312ms
[664.718s][info][gc             ] GC(143) Pause Full (Ergonomics) 4233M->4020M(4677M) 3609.185ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]     (clinit):  32,017.65 ms,  4.57 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]   (typeflow):  53,706.64 ms,  4.57 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]    (objects): 493,701.13 ms,  4.57 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]   (features):  55,578.81 ms,  4.57 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]     analysis: 643,287.27 ms,  4.57 GB
[679.402s][info][gc             ] GC(144) Pause Full (Ergonomics) 4233M->4045M(4677M) 13812.440ms
...
[955.809s][info][gc             ] GC(168) Pause Full (Ergonomics) 4233M->4132M(4677M) 5785.816ms
[968.554s][info][gc             ] GC(169) Pause Full (Ergonomics) 4233M->4136M(4677M) 12270.639ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]     universe: 304,192.26 ms,  4.57 GB
[975.760s][info][gc             ] GC(170) Pause Full (Ergonomics) 4233M->2553M(4677M) 6218.938ms
[986.374s][info][gc             ] GC(174) Pause Full (Ergonomics) 3731M->2145M(4551M) 4980.004ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]      (parse):  17,163.81 ms,  4.44 GB
[996.957s][info][gc             ] GC(180) Pause Full (Ergonomics) 3350M->2347M(4551M) 5187.487ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]     (inline):  13,300.86 ms,  4.44 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]    (compile):  65,098.31 ms,  4.50 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]      compile: 101,154.78 ms,  4.51 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]        image:   8,116.94 ms,  4.45 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]        write:   1,089.84 ms,  4.45 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:660320]      [total]: 1,078,845.69 ms,  4.45 GB

while with -H:-ParseOnce we are getting:

[1.032s][info][gc             ] GC(1) Pause Full (Metadata GC Threshold) 21M->21M(981M) 35.079ms
[1.704s][info][gc             ] GC(3) Pause Full (Metadata GC Threshold) 47M->46M(981M) 98.235ms
[2.636s][info][gc             ] GC(5) Pause Full (Metadata GC Threshold) 69M->64M(981M) 175.362ms
[4.438s][info][gc             ] GC(10) Pause Full (Metadata GC Threshold) 83M->81M(1213M) 189.096ms
[7.389s][info][gc             ] GC(14) Pause Full (Metadata GC Threshold) 114M->110M(1735M) 266.495ms
[11.075s][info][gc             ] GC(18) Pause Full (Metadata GC Threshold) 172M->164M(2299M) 456.232ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]    classlist:  14,933.07 ms,  2.27 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]        (cap):     584.39 ms,  2.27 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]        setup:   3,420.07 ms,  2.27 GB
[21.461s][info][gc             ] GC(22) Pause Full (Metadata GC Threshold) 316M->304M(2279M) 889.954ms
[42.116s][info][gc             ] GC(27) Pause Full (Ergonomics) 713M->681M(2747M) 1845.657ms
[74.120s][info][gc             ] GC(39) Pause Full (Ergonomics) 1259M->1232M(3246M) 5075.440ms
[112.522s][info][gc             ] GC(66) Pause Full (Ergonomics) 1937M->1871M(4172M) 3979.535ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]     (clinit):   2,435.51 ms,  3.95 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]   (typeflow):  17,452.69 ms,  3.95 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]    (objects):  96,973.03 ms,  3.95 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]   (features):  13,103.40 ms,  3.95 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]     analysis: 134,716.99 ms,  3.95 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]     universe:  11,758.36 ms,  3.87 GB
[183.398s][info][gc             ] GC(88) Pause Full (Ergonomics) 3151M->1788M(4090M) 5104.205ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]      (parse):  20,471.92 ms,  3.99 GB
[200.362s][info][gc             ] GC(97) Pause Full (Ergonomics) 3330M->2686M(4551M) 6108.123ms
[208.811s][info][gc             ] GC(102) Pause Full (Ergonomics) 3123M->2718M(4551M) 6170.834ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]     (inline):  23,507.43 ms,  4.44 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]    (compile):  76,462.42 ms,  4.74 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]      compile: 126,361.41 ms,  4.74 GB
[304.019s][info][gc             ] GC(174) Pause Full (Ergonomics) 4846M->1720M(4676M) 2978.740ms
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]        image:  12,978.97 ms,  4.57 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]        write:   1,222.83 ms,  4.57 GB
[quarkus-integration-test-main-999-SNAPSHOT-runner:238462]      [total]: 306,752.97 ms,  4.57 GB

Notice how the latter reports a larger heap capacity (up to 4.74G), while the largest "observed" live data set is 2.7G. Note here that this comparison might be misleading as well, since the largest "observed" live data set might not be the actual maximum live data set. During the compile phase we see that the heap usage goes up to 4846M and only after that phase is completed a GC is triggered showing that only 1720M are still live. In a different run however, where the GC happens to get triggered during the compile phase we see 3253M of live data (still better than when running with -H:+ParseOnce, but still not sure whether this is the maximum).

[301.376s][info][gc             ] GC(164) Pause Full (Ergonomics) 3604M->3253M(4834M) 7615.336ms

Last, the table you attached presents image size, i.e., the size that the resulting binary takes on disk. I have not checked to see if the use of -H:-ParseOnce impacts this metric. Although an important metric for some we still consider it less important than being able to build a native image with less RAM.

HTH

@zakkak zakkak marked this pull request as ready for review October 29, 2021 18:20
@quarkus-bot
Copy link

quarkus-bot bot commented Oct 29, 2021

Failing Jobs - Building 875668e

Status Name Step Failures Logs Raw logs
✔️ JVM Tests - JDK 11
JVM Tests - JDK 17 Build Failures Logs Raw logs

Full information is available in the Build summary check run.

Failures

⚙️ JVM Tests - JDK 17 #

- Failing: integration-tests/vault-agroal 

📦 integration-tests/vault-agroal

io.quarkus.vault.AgroalVaultITCase. - More details - Source on GitHub

java.lang.RuntimeException: Error waiting for test resource future to finish.
	at io.quarkus.test.common.TestResourceManager.waitForAllFutures(TestResourceManager.java:151)
	at io.quarkus.test.common.TestResourceManager.start(TestResourceManager.java:136)

io.quarkus.vault.AgroalVaultKv1ITCase. - More details - Source on GitHub

java.lang.RuntimeException: Error waiting for test resource future to finish.
	at io.quarkus.test.common.TestResourceManager.waitForAllFutures(TestResourceManager.java:151)
	at io.quarkus.test.common.TestResourceManager.start(TestResourceManager.java:136)

io.quarkus.vault.VaultKv1ITCase. - More details - Source on GitHub

java.lang.RuntimeException: Error waiting for test resource future to finish.
	at io.quarkus.test.common.TestResourceManager.waitForAllFutures(TestResourceManager.java:151)
	at io.quarkus.test.common.TestResourceManager.start(TestResourceManager.java:136)

@zakkak zakkak requested review from gsmet and geoand November 1, 2021 09:56
@vsevel
Copy link
Contributor

vsevel commented Nov 1, 2021

@geoand I have tried executing a single vault test (including full vault init) 50 times in a row on my local host. I was able to reproduce one timeout, similar to the issue in this build. I will try making this part more resilient.

@geoand
Copy link
Contributor

geoand commented Nov 2, 2021

Great, thanks 👍

@gsmet gsmet merged commit ceb133f into quarkusio:main Nov 2, 2021
@quarkus-bot quarkus-bot bot added this to the 2.5 - main milestone Nov 2, 2021
@gsmet
Copy link
Member

gsmet commented Nov 2, 2021

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/core area/dependencies Pull requests that update a dependency file area/devtools Issues/PR related to maven, gradle, platform and cli tooling/plugins area/infra-automation anything related to CI, bots, etc. that are used to automated our infrastructure area/native-image release/noteworthy-feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants