Memory leak when shard fails to allocate on missing synonym dictionary #19879
@abeyad, do you know if the memory leak has been fixed? Just limiting the retries does not mean the leak itself is fixed. Do you know if #18467 will be backported to the 2.x releases? It is a bug, and it drastically affects the whole cluster when it happens. However, #18467 is flagged as an Enhancement, so I am afraid it won't get the necessary priority :(
@adrianocrestani From what I understood of your description, you weren't OOMing but rather experiencing memory pressure buildup, which would be expected if you keep trying, over and over in quick succession, to allocate all these resources for a shard that isn't properly starting. It's not a memory leak as such, just memory pressure buildup. I don't think it will be backported to 2.x, as it is a pretty sizable change. While it's definitely a big issue for cluster stability, the ultimate solution that will return your cluster to stability and let you create usable indices is to include the required synonym dictionary.
@abeyad, I disagree; it is definitely a leak. Even after a major GC, a node with 256MB of heap and no shards allocated is stuck at 99% used heap. That is definitely a leak. To add to that, once the synonym dictionary is deployed, the node hits an OOM when it finally tries to allocate the shard.
@adrianocrestani let me try to reproduce the situation with your outlined steps and get back to you
@abeyad thanks! Here is something else to support my statement: "Our analysis tells that your application is suffering from memory leak. It can cause OutOfMemoryError, JVM to freeze, poor response time and high CPU consumption."
@adrianocrestani I was able to reproduce the heap usage spikes following your instructions. I took a cursory glance at the heap dump produced and it looks like the problem is in Guice. I'll look in detail tomorrow and provide feedback. Thanks for the easy-to-follow instructions to reproduce!
@abeyad I am happy you could reproduce it :) Should we re-open this bug then?
@adrianocrestani I looked at the heap dump and the problem is within Guice. It's actually a manifestation of this Guice issue: google/guice#756. Guice has since fixed this issue with a significant patch.

In Elasticsearch, we use our own custom implementation of Guice that we forked from the main Guice project but haven't kept closely in sync with it. The result is that backporting the fix for this issue to our custom Guice implementation would be non-trivial. We are actually moving away from Guice entirely, and in 5.x, significant parts of our code base that used Guice before (including index creation, where this behavior manifested) no longer use it. Hence you won't see this same index creation problem with 5.x.

It is not likely that we will spend time fixing Guice for this scenario, especially since it's an edge case: failed index creation attempts due to missing file(s). The memory leak is indeed real, but given that there are tools to monitor your cluster for nodes climbing in heap usage, it is not difficult to detect the issue and remove the node in question until the problem is remedied.
@abeyad thanks again for looking into it. I agree it is an edge case, and I think we know how to recover our cluster when that happens. My other big concern is how a single node that is very unresponsive due to GC pauses can make the whole cluster unresponsive. Shouldn't the masters just kick that node out? I believe the cluster should be more resilient in those cases.
@adrianocrestani I removed the comment. Thank you for filing this issue and for the comprehensive instructions that let us get to the bottom of it much quicker! Unresponsive nodes are removed from the cluster by the master if they don't respond to pings within a certain amount of time. The default is 30 seconds, and the timeout is configurable.
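For reference, the fault-detection timeout mentioned above lives in elasticsearch.yml; in the 2.x zen discovery module the relevant settings are along these lines (setting names are my addition, not from the comment above — verify against the docs for your version):

```yaml
# Zen fault detection: how long the master waits for a ping response
# before retrying, and how many retries before a node is considered gone
# (2.x defaults shown).
discovery.zen.fd.ping_timeout: 30s
discovery.zen.fd.ping_retries: 3
```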
I have verified that it is fixed on ES 5.x (5.0 beta1).
Elasticsearch version: 2.1.1 and 2.3.5
Plugins installed: []
JVM version: Oracle JDK 1.8.0_60
OS version: Ubuntu 14.04 (Trusty)
Description of the problem including expected versus actual behavior:
We ran into a memory issue in our ES cluster when shards were allocated to nodes that were missing a synonym dictionary. Heap usage on those nodes went from 15GB to 30GB, and the behaviour was 100% reproducible.
Those nodes did not OOM, but spent a lot of time doing GCs. The whole cluster became very unresponsive: stats and search requests took a long time to respond and often timed out. The dictionary was missing on 4 of the 16 data nodes; only those 4 nodes started running out of memory, while the other 12 had the synonym dictionary in place and were able to allocate shards.
Cluster configuration: ES 2.1.1, 16 data nodes, 2 masters, 6 proxy nodes.
We were able to reproduce the memory leak problem with the steps below.
Steps to reproduce:
Install ES 2.3.5 on two different nodes
Create a synonym dictionary using the command below on each node:
mkdir $ES_HOME/config/analysis/ && cat /usr/share/dict/words | sed -r 's/(.*)/\1,\1_synonym/' > $ES_HOME/config/analysis/synonym.txt
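Each line of the generated file maps a dictionary word to a made-up `_synonym` variant, in the comma-separated format the synonym filter reads. A quick preview of what the sed expression produces (sample words here are illustrative):

```shell
# Preview the mapping format produced by the sed command above:
# each input word becomes "word,word_synonym" on its own line.
# (sed -r is the GNU flag; on BSD/macOS use -E instead.)
printf 'apple\nbanana\n' | sed -r 's/(.*)/\1,\1_synonym/'
# apple,apple_synonym
# banana,banana_synonym
```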
Start both nodes with 256MB of heap.
export ES_HEAP_SIZE=256m && $ES_HOME/bin/elasticsearch
Create an index using the following schema:
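The schema block did not survive in this copy of the issue. A minimal 2.x-style schema that exercises the synonym file and matches the shard counts described below (filter and analyzer names here are placeholders, not from the original report) might look like:

```json
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms_path": "analysis/synonym.txt"
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "name": { "type": "string", "analyzer": "synonym_analyzer" }
      }
    }
  }
}
```

`synonyms_path` is resolved relative to the node's config directory, which is why a missing `config/analysis/synonym.txt` breaks shard allocation on that node.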
Create a test index:
for k in {1..10000} ; do for i in {1..100} ; do printf "{\"index\": {\"_index\":\"test_index\", \"_type\":\"my_type\",\"_id\":\"${k}-${i}\"}}\n{\"name\":\"$(shuf -n 5 /usr/share/dict/words | tr '\n' ' ')\"}\n" ; done | curl -XPUT 'localhost:9200/_bulk' --data-binary @- ; done
Once the indexing is done, there should be one index called test_index with one million documents, 3 shards, and one replica, so each of the two nodes holds 3 shard copies. The command below should show heap usage around 40-50%.
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.current,heap.max,heap.percent'
Shutdown one of the nodes.
Remove the dictionary file from that node and start it again.
At this point, the node without the dictionary file will attempt to allocate the index shards, but will fail with the exception pasted in the log section below. ES attempts this over and over again. Use the command below to keep track of the allocated heap.
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.current,heap.max,heap.percent'
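To spot a node climbing in heap without eyeballing the output, the four columns queried above can be piped through a simple awk threshold check (the 90% cutoff and the canned sample output are my own, matching the column order of the query above):

```shell
# Print any node whose heap.percent (4th column) exceeds 90,
# skipping the header row produced by ?v.
check_heap() {
  awk 'NR > 1 && $4 + 0 > 90 { print $1 " heap at " $4 "%" }'
}

# Example with canned _cat/nodes-style output:
printf 'name heap.current heap.max heap.percent\nnode-1 120mb 256mb 47\nnode-2 243mb 256mb 95\n' | check_heap
# node-2 heap at 95%
```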
After about 15-20 minutes, the heap will go above 95% and GC doesn't seem to help.
To make things even worse, recreate the synonym dictionary (recreate, copy from a backed up file or from the other node).
At this point, the node that is running out of memory will finally be able to allocate those shards. Unfortunately, it will then OOM.
Provide logs (if relevant):
After deploying the synonym dictionary it prints this: