Provide a way to set environment variables for the command and set MALLOC_ARENA_MAX=2 for all JVM apps to reduce the memory overhead #160
Hey lhotari! Thanks for submitting this pull request! All pull request authors must have a Contributor License Agreement (CLA) on file with us. Please sign the appropriate CLA (individual or corporate). When sending the signed CLA, please provide your GitHub username in the case of an individual CLA, or the list of GitHub usernames that can make pull requests on behalf of your organization.
SO answer recommending setting MALLOC_ARENA_MAX=1
IBM article about setting MALLOC_ARENA_MAX
A blog post explaining MALLOC_ARENA_MAX: https://www.infobright.com/index.php/malloc_arena_max
More evidence that this is useful: This also applies to Ubuntu, not just RedHat/CentOS. It affects any glibc >= 2.10. I think this includes Ubuntu 10.04, 12.04 and 14.04.
@glyn Can you please take a look over this and let me know what you think about it? My understanding was that |
Yes, I think it's all about virtual memory in warden as well. The major part of the overhead might be caused by memory fragmentation. That's why it's useful to set This blog post about the MALLOC_ARENA_MAX setting in glibc says "In addition, resident memory has been known to creep in a manner similar to a memory leak or memory fragmentation." It might explain some of the behaviour I was seeing in the application I've been concentrating on. The RSS value was slowly rising even though there wasn't a memory leak in the application (I checked several heap dumps). Another IBM blog post about the glibc malloc related tunables: https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_native_memory_fragmentation_and_process_size_growth?lang=en So there are also other tunables besides
@nebhale On Linux you will almost always be dealing with virtual address space. I think it's the same on all modern operating systems. There is a good presentation by Kevin Grigorenko from IBM about Linux memory management. How do you define "virtual memory"? Are you referring to something like disabling memory overcommit?
Btw, slide number 7 in Kevin Grigorenko's presentation explains what the MALLOC_ARENA_MAX setting is about.
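To make the scale of the MALLOC_ARENA_MAX setting concrete, here is a rough back-of-the-envelope sketch. It assumes glibc's documented default cap of 8 arenas per core on 64-bit systems (2 per core on 32-bit) and the commonly cited figure of up to 64 MiB of reserved address space per arena heap on 64-bit. Neither number comes from this thread, and the function names are made up for illustration. The result is an estimate of reserved address space, not of resident memory.

```python
import os

# Assumption (not from this thread): on 64-bit glibc each arena heap can
# reserve up to 64 MiB of address space.
HEAP_RESERVATION_MIB = 64


def default_arena_cap(cores: int, pointer_bits: int = 64) -> int:
    """Default upper bound on malloc arenas when MALLOC_ARENA_MAX is unset.

    Assumes glibc's default of 8 arenas per core on 64-bit and 2 per core
    on 32-bit.
    """
    per_core = 8 if pointer_bits == 64 else 2
    return per_core * cores


def reserved_address_space_mib(arenas: int) -> int:
    """Rough 64-bit estimate: one full 64 MiB heap reserved per arena."""
    return arenas * HEAP_RESERVATION_MIB


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    cap = default_arena_cap(cores)
    print(f"{cores} cores: up to {cap} arenas by default, "
          f"about {reserved_address_space_mib(cap)} MiB reserved")
    print(f"MALLOC_ARENA_MAX=2: about {reserved_address_space_mib(2)} MiB reserved")
```

Under these assumptions a 4-core container could reserve around 2 GiB of address space for arenas by default, versus 128 MiB with MALLOC_ARENA_MAX=2.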
Last time I looked, swapping was disabled in the stemcell. So there is virtual memory (of course), but no paging, so using excessive virtual memory will result in running out of physical memory and triggering the OOM killer. If this environment variable reduces fragmentation of virtual memory, then it could presumably save an application from being killed by the OOM killer.
@glyn From what I've understood of the references I've read about It's very surprising that the overhead can be so remarkable. The Java 8 JVM behaves differently from the Java 7 JVM. In Java 8, Metaspace is allocated from native memory and is subject to the The application I have differs from others because it's a Grails application. In Groovy, there are a lot of classes because all closures are classes. This puts a lot more pressure on Metaspace than plain Java applications do. I found a Hadoop issue that's related to the fact that Java 8 allocates more virtual memory than Java 7. However, that's a different subject from adding this
@lhotari any environment variable can be set for the execution of an application using |
@nebhale I understand that it's always good to get opinions from several users. The problem is that most users don't complain.
I'm happy to open the issue to do some testing, but be aware that it probably won't rate highly in priority. It may take some time until we get to it. In the meantime, I encourage you to use
Force-pushed from fe995b2 to 5a93ffd.
After some more testing on this, it turns out that the glibc version in Ubuntu 10.04 (lucid64 stack) doesn't seem to properly follow the MALLOC_ARENA_MAX setting. There is a bug (https://sourceware.org/bugzilla/show_bug.cgi?id=13071) that is fixed in later glibc versions. I've now done some more testing with the cflinuxfs2 stack, and the memory savings after setting MALLOC_ARENA_MAX to a low value (2) are remarkable.
The bug report contains a test app for checking whether the bug is present. Docker commands to start different containers:
Output should contain only 1 arena (malloc_stats report)
And when you set MALLOC_ARENA_MAX to a high value, it should use the setting also in that case:
This test fails on the lucid64 container.
That might be a workaround to disable glibc malloc arenas in the lucid64 container. However, there doesn't seem to be a reason to do that, since the cflinuxfs2 stack is now available and the MALLOC_ARENA_MAX setting is effective there. So anyone having out-of-memory problems with CF should try the MALLOC_ARENA_MAX=2 environment setting in the cflinuxfs2 stack.
There is some measurement data from Heroku showing that the MALLOC_ARENA_MAX setting really does reduce memory usage. This setting will now actually take effect when users switch to the cflinuxfs2 stack. In the lucid64 stack the MALLOC_ARENA_MAX setting wasn't applied properly because of the bug in glibc 2.11.
To do more analysis, I added a way to call malloc_info to the java-buildpack-diagnostics-app. malloc_info reports the allocator stats in XML format. I've also updated the tmate ssh solution to work with the new cflinuxfs2 stack. It's pretty handy if you want to get inside a running container with ssh. Works for me. :)
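As a rough illustration of what can be read from a malloc_info report, the sketch below counts the arenas and the memory each one currently holds from the system. The XML is an abbreviated, hand-written sample in the general shape glibc emits (one `<heap nr="…">` element per arena). Its numbers are illustrative and are not measurements from this PR, and the `summarize` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hand-written sample shaped like glibc malloc_info output.
# The sizes are illustrative only, not measurements from this thread.
SAMPLE = """<malloc version="1">
<heap nr="0">
<system type="current" size="135168"/>
<system type="max" size="135168"/>
</heap>
<heap nr="1">
<system type="current" size="2097152"/>
<system type="max" size="2097152"/>
</heap>
<system type="current" size="2232320"/>
<system type="max" size="2232320"/>
</malloc>"""


def summarize(xml_text):
    """Return (arena_count, {arena_nr: bytes currently obtained from the system})."""
    root = ET.fromstring(xml_text)
    heaps = root.findall("heap")  # direct children only: one per arena
    current = {
        int(heap.get("nr")): int(heap.find("system[@type='current']").get("size"))
        for heap in heaps
    }
    return len(heaps), current


if __name__ == "__main__":
    count, per_arena = summarize(SAMPLE)
    print(f"arenas: {count}")
    for nr, size in sorted(per_arena.items()):
        print(f"  arena {nr}: {size} bytes from the system")
```

With MALLOC_ARENA_MAX=2 in effect, one would expect such a report to show at most two `<heap>` elements.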
There might be some room for more malloc parameter tuning. This bug report shows some details: https://sourceware.org/bugzilla/show_bug.cgi?id=11044 See also the mallopt man page: http://linux.die.net/man/3/mallopt
This behaviour explains the fragmentation problem. Disabling the dynamic mmap threshold should make sense for Java processes. I'll test with these values:
Force-pushed from 343c33e to 89a21c6.
See: #159 (comment)
…ilt. Also make it possible to set environment variables that apply to all container types in the Java buildpack. - example use case is setting the MALLOC_ARENA_MAX environment variable for all container types
- see M_MMAP_THRESHOLD in "man mallopt" - https://sourceware.org/bugzilla/show_bug.cgi?id=11044 - allow disabling tuning by setting JBP_NO_MALLOC_TUNING env
Force-pushed from 89a21c6 to 32da2ef.
I rebased this PR. |
@cgfrost I think this Linux glibc malloc tuning should be done by default in java-buildpack, after proper evaluation and testing of course. This PR provides one solution for adding default environment variables. The values can be overridden with cf environment values.
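The idea of defaults that users can override can be sketched roughly as follows. This is not the buildpack's implementation (the buildpack is written in Ruby). The `build_env` and `run` helpers are hypothetical, and treating any non-empty JBP_NO_MALLOC_TUNING value as "tuning disabled" is an assumption.

```python
import os
import subprocess

# Default discussed in this thread; the dictionary itself is illustrative.
DEFAULT_ENV = {"MALLOC_ARENA_MAX": "2"}


def build_env(user_env, defaults=None):
    """Merge default variables into user_env without overriding user values.

    Assumption: a non-empty JBP_NO_MALLOC_TUNING disables the defaults.
    """
    env = dict(user_env)
    if env.get("JBP_NO_MALLOC_TUNING"):
        return env
    for name, value in (DEFAULT_ENV if defaults is None else defaults).items():
        env.setdefault(name, value)  # user-provided values win
    return env


def run(command, user_env=None):
    """Start the command (a list of arguments) with the merged environment."""
    base = os.environ if user_env is None else user_env
    return subprocess.run(command, env=build_env(base), check=True)
```

For example, `build_env({"MALLOC_ARENA_MAX": "4"})` keeps the user's value of 4, while `build_env({})` falls back to the default of 2.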
This has now been investigated by our support team with several long running applications (about a week) to try out different |
Provide a way to set environment variables for the command that is being built.
Also make it possible to set environment variables that apply to all container types in the Java buildpack.
The code in this pull request might be totally bad in style. I was more or less learning the internals of Java buildpack and how the toolchain works while doing the PR and perhaps something is useful here. Feel free to scrap the code in this PR. :)
It would be nice if the structure of the Java buildpack supported the concept of environment variables that are passed on the command line to the command. This is very useful if we want to set some environment variables for all container types that the Java buildpack supports.
Setting MALLOC_ARENA_MAX for all JVM apps can help reduce the memory overhead in 64-bit JVM apps.
References about MALLOC_ARENA_MAX:
Heroku guide:
https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior#what-value-to-choose-for-malloc_arena_max
Hadoop issue:
https://issues.apache.org/jira/browse/HADOOP-7154