
Provide a way to set environment variables for the command and set MALLOC_ARENA_MAX=2 for all JVM apps to reduce the memory overhead #160

Closed

lhotari commented Mar 9, 2015

Provide a way to set environment variables for the command that is being built.
Also make it possible to set environment variables that apply to all container types in the Java buildpack.

  • example use case is setting the MALLOC_ARENA_MAX environment variable for all container types

The code in this pull request may be rough in style. I was more or less learning the internals of the Java buildpack and its toolchain while writing it, so perhaps something here is useful. Feel free to scrap the code in this PR. :)

It would be nice if the structure of the Java buildpack supported the concept of environment variables that are passed on the command line to the command. This is very useful if we want to set some environment variables for all container types that the Java buildpack supports.

Setting MALLOC_ARENA_MAX for all JVM apps can help reduce the memory overhead in 64-bit JVM apps.
References about MALLOC_ARENA_MAX:
Heroku guide:
https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior#what-value-to-choose-for-malloc_arena_max
Hadoop issue:
https://issues.apache.org/jira/browse/HADOOP-7154
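
To illustrate the idea, a buildpack-generated start command would simply gain an env-var prefix. This is a hypothetical sketch; the JRE path and main class are made up, not actual buildpack output:

# hypothetical shape of a start command with a buildpack-supplied default:
MALLOC_ARENA_MAX=2 $PWD/.java-buildpack/open_jdk_jre/bin/java -cp . my.example.Main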

cfdreddbot commented

Hey lhotari!

Thanks for submitting this pull request!

All pull request authors must have a Contributor License Agreement (CLA) on-file with us. Please sign the appropriate CLA (individual or corporate).

When sending the signed CLA, please provide your GitHub username (for an individual CLA) or the list of GitHub usernames that can make pull requests on behalf of your organization (for a corporate CLA).

lhotari (Author) commented Mar 9, 2015

Stack Overflow answer recommending MALLOC_ARENA_MAX=1:
http://stackoverflow.com/a/12232725/166062

lhotari (Author) commented Mar 9, 2015

A blog post explaining MALLOC_ARENA_MAX https://www.infobright.com/index.php/malloc_arena_max

lhotari (Author) commented Mar 9, 2015

More evidence that this is useful:
hpcc-systems/HPCC-Platform#6767

This also applies to Ubuntu, not just RedHat/CentOS: any glibc >= 2.10 is affected. I think this includes Ubuntu 10.04, 12.04 and 14.04.

nebhale (Member) commented Mar 9, 2015

@glyn Can you please take a look over this and let me know what you think about it? My understanding was that warden didn't actually expose any "virtual memory" to the process, so I'm not sure that this changes things for us in our virtualized, containerized environment.

lhotari (Author) commented Mar 9, 2015

Yes, I think it's all about virtual memory in warden too.

The major part of the overhead might be caused by memory fragmentation. That's why it's useful to set MALLOC_ARENA_MAX to a low value for Java processes. I have even seen reports that a value of 1 is good for Java.

This blog post about the MALLOC_ARENA_MAX setting in glibc says "In addition, resident memory has been known to creep in a manner similar to a memory leak or memory fragmentation." That might explain some of the behaviour I was seeing in the application I've been concentrating on: the RSS value was slowly rising even though there wasn't a memory leak in the application (I checked several heap dumps).

Another IBM blog post about the glibc malloc-related tunables: https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_native_memory_fragmentation_and_process_size_growth?lang=en

So there are also other tunables besides MALLOC_ARENA_MAX. Here is an example usage.
However, I probably wouldn't set those by default in the Java buildpack. I'm quite confident that setting MALLOC_ARENA_MAX to a low value (<= 4) is very important.

nebhale (Member) commented Mar 9, 2015

@lhotari @glyn can correct me if I'm wrong, but I'm pretty sure that there isn't any virtual memory in the warden container. We know this because we've dealt with other problems previously due to its absence.

lhotari (Author) commented Mar 9, 2015

@nebhale On Linux you are always dealing with virtual address space. I think it's the same on all modern operating systems.

There is a good presentation by Kevin Grigorenko of IBM about Linux memory management; he wrote the IBM blog posts I linked to earlier.

How do you define "virtual memory"? Are you referring to something like disabling overcommitting of memory?
My understanding is that with cgroups (the warden container is based on cgroups) you cannot even disable overcommitting of memory, and I don't think that needs to be possible.
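
For reference, two quick checks from inside a container (cgroup v1 paths, as used in the warden era; paths may differ on other setups):

# memory limit imposed by the container's memory cgroup:
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
# kernel overcommit policy (0 = heuristic overcommit, the default):
cat /proc/sys/vm/overcommit_memory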

lhotari (Author) commented Mar 9, 2015

BTW, slide 7 of Kevin Grigorenko's presentation explains what the MALLOC_ARENA_MAX setting is about.

glyn (Contributor) commented Mar 10, 2015

Last time I looked, swapping was disabled in the stemcell. So there is virtual memory (of course), but no paging, so using excessive virtual memory will result in running out of physical memory and triggering the OOM killer. If this environment variable reduces fragmentation of virtual memory, then it could presumably save an application from being OOM-killed.
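
A quick way to confirm the no-swap observation from inside a cell or stemcell:

# with swap disabled, both of these report no/zero swap:
swapon -s
free -m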

lhotari (Author) commented Mar 10, 2015

@glyn From what I've understood of the references I've read about MALLOC_ARENA_MAX, using a low value does reduce fragmentation of memory allocations. That also reduces the amount of resident memory that gets allocated. The resources I've linked in this issue have some detailed information.

It's very surprising that the overhead can be so large.

The Java 8 JVM behaves differently from the Java 7 JVM: in Java 8, Metaspace is allocated from native memory and is subject to the MALLOC_ARENA_MAX setting.

My application differs from others because it's a Grails application. Groovy produces a lot of classes, since every closure is a class, which puts much more pressure on Metaspace than plain Java applications do.

I found a Hadoop issue related to the fact that Java 8 allocates more virtual memory than Java 7. However, that's a different subject from adding this MALLOC_ARENA_MAX env variable and providing a way for the java-buildpack to add environment variables to the start command internally.
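
As a side note, Metaspace growth itself can be bounded with a standard JVM flag (Java 8+); the value and jar name below are placeholders:

# cap Metaspace so unbounded class-metadata growth can't eat native memory:
java -XX:MaxMetaspaceSize=256m -jar app.jar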

nebhale (Member) commented Mar 25, 2015

@lhotari any environment variable can be set for the execution of an application using cf set-env <APP> MALLOC_ARENA_MAX 2. Since we've not seen any other complaints about this problem, I'm inclined to let you set it that way on apps where there is a problem rather than making the change in the buildpack.
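
For example (the app name is a placeholder):

cf set-env my-app MALLOC_ARENA_MAX 2
cf restage my-app   # restage so the new value takes effect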

lhotari (Author) commented Mar 25, 2015

@nebhale I understand that it's always good to get opinions from several users. The problem is that most users don't complain.
Would it be possible to run some tests and measurements with reference applications to see how much the MALLOC_ARENA_MAX=2 setting reduces JVM process RSS? It reduces RSS over time, so the measurement can't simply be taken right after the app starts up.
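
One simple way to capture that trend would be to sample RSS periodically; the PID and interval here are illustrative:

# append the RSS (in KB) of a given PID to a log once a minute:
while true; do
  ps -o rss= -p "$PID" >> rss.log
  sleep 60
done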

nebhale (Member) commented Mar 25, 2015

I'm happy to open the issue to do some testing, but be aware that it probably won't rate highly in priority. It may take some time until we get to it. In the meantime, I encourage you to use cf set-env.

lhotari (Author) commented Apr 21, 2015

After some more testing on this, it turns out that the glibc version in Ubuntu 10.04 (the lucid64 stack) doesn't seem to properly honor the MALLOC_ARENA_MAX setting. There is a bug (https://sourceware.org/bugzilla/show_bug.cgi?id=13071) that is fixed in later glibc versions.
The same bug was also reported to Red Hat as https://bugzilla.redhat.com/show_bug.cgi?id=799327. Other reports: https://sourceware.org/bugzilla/show_bug.cgi?id=13137, https://sourceware.org/bugzilla/show_bug.cgi?id=13754, https://sourceware.org/bugzilla/show_bug.cgi?id=11261. The fix is included in glibc 2.16 and was also backported to glibc 2.15. This is the commit fixing the bug: bminor/glibc@41b8189 (the backport for 2.15 is bminor/glibc@7cf8e20).

I've now done some more testing with the cflinuxfs2 stack, and the memory usage savings are remarkable after setting MALLOC_ARENA_MAX to a low value (2).

The bug report (https://sourceware.org/bugzilla/show_bug.cgi?id=13071) contains a test app for checking whether the bug is present:

Docker commands to start the different containers and run the test app:

# start a container for one of the stacks:
docker run -it cloudfoundry/cflinuxfs2 bash
docker run -it cloudfoundry/lucid64 bash

# inside the container, fetch, build and run the test app:
wget --no-check-certificate --content-disposition https://sourceware.org/bugzilla/attachment.cgi?id=5890
gcc -pthread arena_max_test.c -o arena_max_test
MALLOC_ARENA_MAX=1 ./arena_max_test

The output should contain only one arena (this is a malloc_stats report):

PID = 3539
Arena 0:
system bytes     =     135168
in use bytes     =      55296
Total (incl. mmap):
system bytes     =     135168
in use bytes     =      55296
max mmap regions =          0
max mmap bytes   =          0

And when you set MALLOC_ARENA_MAX to a higher value, the setting should be honored in that case too:

root@e47b7ae20aea:/# MALLOC_ARENA_MAX=10 ./arena_max_test
PID = 170
Arena 0:
system bytes     =     135168
in use bytes     =      38592
Arena 1:
system bytes     =     135168
in use bytes     =       3952
Arena 2:
system bytes     =     135168
in use bytes     =       4096
Arena 3:
system bytes     =     135168
in use bytes     =       4096
Arena 4:
system bytes     =     135168
in use bytes     =       4096
Arena 5:
system bytes     =     135168
in use bytes     =       4096
Arena 6:
system bytes     =     135168
in use bytes     =       4096
Arena 7:
system bytes     =     135168
in use bytes     =       4096
Arena 8:
system bytes     =     135168
in use bytes     =       4096
Arena 9:
system bytes     =     135168
in use bytes     =       4096
Total (incl. mmap):
system bytes     =    1351680
in use bytes     =      75312
max mmap regions =          0
max mmap bytes   =          0

This test fails on the lucid64 container.
In the Red Hat bug report, the reporter mentioned that setting MALLOC_CHECK_=1 disables arenas (that setting has other side effects too).
That setting does disable multiple arenas on the lucid64 container.

root@337ffdde85d1:/# MALLOC_CHECK_=1 ./arena_max_test
PID = 272
Arena 0:
system bytes     =     135168
in use bytes     =      55296
Total (incl. mmap):
system bytes     =     135168
in use bytes     =      55296
max mmap regions =          0
max mmap bytes   =          0

That might be a workaround to disable glibc malloc arenas in the lucid64 container. However, there doesn't seem to be a reason to do that, since the cflinuxfs2 stack is now available, where the MALLOC_ARENA_MAX setting is effective.

So anyone having out-of-memory problems with CF should try the MALLOC_ARENA_MAX=2 environment setting on the cflinuxfs2 stack.
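
In cf CLI terms (the app name is a placeholder):

cf set-env my-app MALLOC_ARENA_MAX 2
cf push my-app -s cflinuxfs2   # ensure the app runs on the cflinuxfs2 stack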

lhotari (Author) commented Apr 21, 2015

There is measurement data from Heroku showing that the MALLOC_ARENA_MAX setting really does reduce memory usage. The setting will now actually take effect when users switch to the cflinuxfs2 stack; on the lucid64 stack, MALLOC_ARENA_MAX wasn't honored properly because of the bug in glibc 2.11.

lhotari (Author) commented Apr 21, 2015

To do more analysis, I added a way to call malloc_info to the java-buildpack-diagnostics-app; malloc_info reports its stats in XML format.
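
If you can get a shell into the container, a generic way to dump the same XML from a live process is via gdb, assuming gdb is available there; the PID is a placeholder:

# call malloc_info(3) in the target process; the XML goes to the process's stdout:
gdb -p "$PID" -batch -ex 'call (int) malloc_info(0, stdout)'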

I've also updated the tmate ssh solution to work with the new cflinuxfs2 stack. It's pretty handy if you want to get inside a running container with ssh. Works for me. :)

lhotari (Author) commented Apr 21, 2015

There might be some room for more malloc parameter tuning. This bug report shows some details: https://sourceware.org/bugzilla/show_bug.cgi?id=11044

http://linux.die.net/man/3/mallopt

Note (from mallopt(3)): Nowadays, glibc uses a dynamic mmap threshold by default. The initial value of the threshold is 128*1024, but when blocks larger than the current threshold and less than or equal to DEFAULT_MMAP_THRESHOLD_MAX are freed, the threshold is adjusted upwards to the size of the freed block. When dynamic mmap thresholding is in effect, the threshold for trimming the heap is also dynamically adjusted to be twice the dynamic mmap threshold. Dynamic adjustment of the mmap threshold is disabled if any of the M_TRIM_THRESHOLD, M_TOP_PAD, M_MMAP_THRESHOLD, or M_MMAP_MAX parameters is set.

This behaviour explains the fragmentation problem. Disabling the dynamic mmap threshold should make sense for Java processes.

I'll test with these values:

export MALLOC_MMAP_THRESHOLD_=131072   # pin the mmap threshold at 128 KiB (disables dynamic adjustment)
export MALLOC_TRIM_THRESHOLD_=131072   # trim the heap once this much contiguous free space sits at its top
export MALLOC_TOP_PAD_=131072          # extra padding to keep when growing the heap via sbrk
export MALLOC_MMAP_MAX_=65536          # maximum number of simultaneous mmap'd allocations

cgfrost (Contributor) commented Jul 23, 2015

See: #159 (comment)

lhotari and others added 2 commits July 24, 2015 10:43
Provide a way to set environment variables for the command that is being built.

Also make it possible to set environment variables that apply to all container types in the Java buildpack.
- example use case is setting the MALLOC_ARENA_MAX environment variable for all container types
- see M_MMAP_THRESHOLD in "man mallopt"
- https://sourceware.org/bugzilla/show_bug.cgi?id=11044
- allow disabling tuning by setting JBP_NO_MALLOC_TUNING env
lhotari (Author) commented Jul 24, 2015

I rebased this PR.

lhotari (Author) commented Jul 24, 2015

@cgfrost I think this Linux glibc malloc tuning should be done by default in the java-buildpack, after proper evaluation and testing of course. This PR provides one solution for adding default environment variables; the values can still be overridden with cf environment values.
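
With defaults in place, a per-app override or opt-out would still work through normal cf environment variables; the app name is a placeholder, and JBP_NO_MALLOC_TUNING is the escape hatch this PR introduces (per the commit message, setting the variable disables the tuning):

cf set-env my-app MALLOC_ARENA_MAX 4          # override the default value
cf set-env my-app JBP_NO_MALLOC_TUNING true   # or disable the tuning entirely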

cgfrost (Contributor) commented Aug 13, 2015

This has now been investigated by our support team with several long-running applications (about a week) to try out different MALLOC_* settings. They have not seen any noticeable difference in the performance or memory usage of the applications. A lot of time has now been spent investigating the effect of these memory settings. Considering that it's simple to set these values on a per-application basis, which doesn't require forking the buildpack, we aren't prepared to change the default behavior. I realize this isn't the outcome you were looking for, but I'm going to close this pull request out.
