mlockall: false despite being configured #9357

Closed
markwalkom opened this issue Jan 20, 2015 · 30 comments
Labels
:Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts >docs General docs changes help wanted adoptme Team:Delivery Meta label for Delivery team

Comments

@markwalkom
Contributor

Lots of details below, but the general problem here is that mlockall is not being applied despite everything being configured as per the config docs.

[root@centos7 ~]# grep MAX_LOCKED_MEMORY /etc/sysconfig/elasticsearch
MAX_LOCKED_MEMORY=unlimited
[root@centos7 ~]# grep mlock /etc/elasticsearch/elasticsearch.yml
bootstrap.mlockall: true

[root@centos7 ~]# rpm -qa|grep elastic
elasticsearch-1.4.2-1.noarch
[root@centos7 ~]# java -version
java version "1.7.0_71"
OpenJDK Runtime Environment (rhel-2.5.3.1.el7_0-x86_64 u71-b14)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
[root@centos7 ~]# uname -a
Linux centos7 3.10.0-123.4.4.el7.x86_64 #1 SMP Fri Jul 25 05:07:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@centos7 ~]# cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)

From the logs:
[2015-01-18 17:32:48,613][WARN ][common.jna ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).

From the _nodes API:
"max_file_descriptors" : 65535,

And the outcome is:

[vagrant@centos7 ~]$ curl -s localhost:9200/_nodes?pretty|grep mlockall
          "mlockall" : "true"
        "mlockall" : false

The last one is process.mlockall.

@drewr
Contributor

drewr commented Jan 20, 2015

This can happen if the user starting ES doesn't have permission to lock the address space. In the parent environment root must have set ulimit -l unlimited before execing Java.

Is this with the RPM? With sysvinit or systemd?
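For reference, a quick way to check what locked-memory limit the running process actually got (the pgrep pattern below is just an assumption about how the JVM shows up in the process list):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n 1)
grep "Max locked memory" /proc/$ES_PID/limits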

@markwalkom
Contributor Author

This is a default install from repo, so RPM and sysvinit.

@markwalkom
Contributor Author

Looks like it uses systemd.

@masaruh pointed out that you can edit /usr/lib/systemd/system/elasticsearch.service and set:

LimitMEMLOCK=infinity

Then:

systemctl daemon-reload
service elasticsearch restart

That gives:

# curl -s localhost:9200/_nodes?pretty|grep mlock
          "mlockall" : "true"
        "mlockall" : true

@markwalkom
Contributor Author

Looks like we need to correctly set this via systemd, which is the default as of RHEL/CentOS 7.

@clintongormley clintongormley added :Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts >bug help wanted adoptme labels Jan 20, 2015
@rsareth

rsareth commented Apr 3, 2015

Hi,

I'm using the elasticsearch-1.4.2 RPM package and I'm dealing with the same problem on redhat-6.3 x86_64 (a VMware VM) with this version of Java:

java version "1.7.0_71"
OpenJDK Runtime Environment (rhel-2.5.3.1.el5_11-x86_64 u71-b14)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)

I've changed the LimitMEMLOCK property in /usr/lib/systemd/system/elasticsearch.service hoping it would work. No effect, as expected :-(

And there's this line in the log that I don't understand:

[2015-04-03 17:22:28,827][WARN ][common.jna ] unable to link C library. native methods (mlockall) will be disabled.

The jna.jar is present :-|

And I don't know how to make the OS use the LimitMEMLOCK parameter.

What's very weird :

Any idea, please ?

Thank you

PS: I can't upgrade to ES-1.4.4. I'm stuck on ES-1.4.2 :-(

@lmgray

lmgray commented Apr 11, 2015

I've been chasing this one also with the latest elasticsearch on RHEL 6 -- kept getting the following on startup despite having "elasticsearch - memlock unlimited" in /etc/security/limits.conf and "MAX_LOCKED_MEMORY=unlimited" in /etc/sysconfig/elasticsearch:

Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).

It seems if I cd /etc/rc.d/init.d and start elasticsearch using ./elasticsearch start, the script runs and limits/memlock stuff works as expected. But when it's run as /etc/rc.d/init.d/elasticsearch start, it uses systemctl and the limits are not applied.

I found /usr/lib/systemd/system/elasticsearch.service had the needed line commented out:

# See MAX_LOCKED_MEMORY in sysconfig, use "infinity" when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true
#LimitMEMLOCK=infinity

Uncommenting that, running "systemctl daemon-reload" and restarting elasticsearch allowed the memlock to work properly.
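Spelled out as commands, roughly (the sed one-liner is just an illustration of the edit; changing the unit file by hand works the same, and note the file may be overwritten on upgrade):

sudo sed -i 's/^#LimitMEMLOCK=infinity/LimitMEMLOCK=infinity/' /usr/lib/systemd/system/elasticsearch.service
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch
curl -s localhost:9200/_nodes?pretty | grep mlockall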

@rsareth

rsareth commented Apr 16, 2015

Hi lmgray,

Thank you for your response. I've tested what you have done, but it hasn't worked for me.

I've checked :

  • file /etc/security/limits.conf has the correct lines
  • file /etc/sysconfig/elasticsearch has "MAX_LOCKED_MEMORY=unlimited"
  • file /usr/lib/systemd/system/elasticsearch.service has LimitMEMLOCK uncommented

And I've restarted the service like you did with the script /etc/rc.d/init.d/elasticsearch

In the end, I have the same result:

$ curl -s localhost:9200/_nodes?pretty|grep mlockall
"mlockall" : "true"
"mlockall" : false

@lmgray

lmgray commented Apr 16, 2015

Have you tried:

cd /etc/rc.d/init.d

./elasticsearch start

I found that running it that way executes the init script instead of going
through systemd, and it worked.
That said, when I uncommented LimitMEMLOCK in
/usr/lib/systemd/system/elasticsearch.service,
systemd then worked, so you seem to have a different problem than I had.

After you start it, try "cat /proc/<pid>/limits" and see what the memory
lock limit is there.

@rsareth

rsareth commented Apr 27, 2015

Hi lmgray,

Sorry for the delay. I've just tried your suggestions and it doesn't work either :-(

Here are the limits for the current Elasticsearch pid:

$ cat /proc/33505/limits
Limit                     Soft Limit  Hard Limit  Units
Max cpu time              unlimited   unlimited   seconds
Max file size             unlimited   unlimited   bytes
Max data size             unlimited   unlimited   bytes
Max stack size            10485760    unlimited   bytes
Max core file size        0           unlimited   bytes
Max resident set          unlimited   unlimited   bytes
Max processes             1024        30449       processes
Max open files            64000       64000       files
Max locked memory         unlimited   unlimited   bytes
Max address space         unlimited   unlimited   bytes
Max file locks            unlimited   unlimited   locks
Max pending signals       30449       30449       signals
Max msgqueue size         819200      819200      bytes
Max nice priority         0           0
Max realtime priority     0           0
Max realtime timeout      unlimited   unlimited   us

As you can see, the "Max locked memory" is set to unlimited. Now, I think I can confirm that our problems are different.

@matejzero

Why is LimitMEMLOCK commented out in the systemd service file anyway?
How can I make sure this stays uncommented during upgrades? It would be nice to add an env variable in sysconfig/elasticsearch that would set this to the right value and stay persistent across upgrades.

I also had this problem and uncommenting LimitMEMLOCK in elasticsearch.service resolved the issue. That was on CentOS 7.

Matej

@tlrx
Member

tlrx commented May 20, 2015

@matejzero that's a nice suggestion, but as far as I know env vars can only be used in some systemd settings like ExecStart, not in LimitMEMLOCK. I'll double check.

@matejzero

I uncommented LimitMEMLOCK on all my nodes and it works on physical machines, but ES won't start on an oVirt VM. systemctl reports elasticsearch.service: main process exited, code=killed, status=9/KILL.

Will investigate and report back.

@tlrx
Member

tlrx commented May 20, 2015

@matejzero not sure it helps a lot but you can also have a look at #11163 and #9582

@matejzero

It was a mistake on my side. I accidentally set ES_HEAP_SIZE to 100% of memory and the OOM killer killed the process. After fixing ES_HEAP_SIZE, it now works with LimitMEMLOCK set to infinity.

@markwalkom
Contributor Author

Just wanted to poke the bear here to see if we can get this resolved; it's a performance hit that should be simple to fix, and it'd be great to see it in 2.0.

@markwalkom
Contributor Author

@tlrx is there no way for us to manage this without forcing users to have to edit /usr/lib/systemd/system/elasticsearch.service after each upgrade/install? Or are we limited with systemd and how it handles things like this?

@harpreetsb

Hi,
I am using Ubuntu and facing the same problem.
The output of http://localhost:9200/_nodes?pretty
gives:

{
  ...
  "bootstrap" : {
    "mlockall" : "true"
  },
  ...
  "process" : {
    "refresh_interval_in_millis" : 1000,
    "id" : 2962,
    "mlockall" : false
  },
}

Also, the command systemctl daemon-reload is not available on Ubuntu.
What is the alternative? I restarted the system, but nothing changed.

Can someone help?
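For reference, if the box is running upstart/sysvinit rather than systemd, a rough checklist for the deb package (the paths are assumptions based on the Debian/Ubuntu package layout):

grep MAX_LOCKED_MEMORY /etc/default/elasticsearch     # expect MAX_LOCKED_MEMORY=unlimited
grep mlockall /etc/elasticsearch/elasticsearch.yml    # expect bootstrap.mlockall: true
grep memlock /etc/security/limits.conf                # e.g. elasticsearch - memlock unlimited
sudo service elasticsearch restart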

@rgevaert

You can also create a file in /etc/systemd/system/elasticsearch.service.d/1-fixmemlock.conf with the contents

[Service]
LimitMEMLOCK=infinity

And then run "systemctl daemon-reload".
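For completeness, a minimal sketch of the whole drop-in sequence (the curl check at the end is just the verification used earlier in this thread):

sudo mkdir -p /etc/systemd/system/elasticsearch.service.d
sudo tee /etc/systemd/system/elasticsearch.service.d/1-fixmemlock.conf <<'EOF'
[Service]
LimitMEMLOCK=infinity
EOF
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch
curl -s localhost:9200/_nodes?pretty | grep mlockall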

@tlrx
Member

tlrx commented Jan 20, 2016

@tlrx is there no way for us to manage this without forcing users to have to edit /usr/lib/systemd/system/elasticsearch.service after each upgrade/install? Or are we limited with systemd and how it handles things like this?

Sorry it took so long to answer. The systemd documentation indicates how to override default settings; see the section "Overriding vendor settings". I did not test it, but it should work on most distributions.
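A hedged sketch of that "Overriding vendor settings" approach using systemctl edit (available in newer systemd versions; it creates the same kind of drop-in under /etc/systemd/system/elasticsearch.service.d/ as described above):

sudo systemctl edit elasticsearch
# in the editor that opens, add:
#   [Service]
#   LimitMEMLOCK=infinity
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch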

@clintongormley clintongormley added >docs General docs changes help wanted adoptme and removed help wanted adoptme >bug discuss labels Jan 20, 2016
@clintongormley
Contributor

It sounds like what we need to do here is add docs for adding an elasticsearch.service.d/ .conf file, so that this config survives upgrades.

@onetruebob74

So, to add more fuel to this fire, I have an ES cluster whose nodes all have bootstrap.mlockall: true. They also each have a modified systemd file with LimitMEMLOCK=infinity uncommented. However, when I issue curl -XGET 'http://es01:9200/_nodes/process?pretty' I get the following:

{
  "cluster_name" : "ELKCLUSTER",
  "nodes" : {
    "US3o_DhWR56SiTyHqhWG_g" : {
      "name" : "ws02",
      "transport_address" : "10.254.27.193:9300",
      "host" : "10.254.27.193",
      "ip" : "10.254.27.193",
      "version" : "2.1.1",
      "build" : "40e2c53",
      "http_address" : "10.254.27.193:9200",
      "attributes" : {
        "data" : "false",
        "master" : "false"
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 31783,
        "mlockall" : false
      }
    },
    "LahFOMA7Thi4cN_ThxjmZw" : {
      "name" : "es02",
      "transport_address" : "10.254.27.192:9300",
      "host" : "10.254.27.192",
      "ip" : "10.254.27.192",
      "version" : "2.1.1",
      "build" : "40e2c53",
      "http_address" : "10.254.27.192:9200",
      "attributes" : {
        "master" : "true"
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 1611,
        "mlockall" : false
      }
    },
    "Gl9E2ukUTJuSh0q4gnk7Rg" : {
      "name" : "es01",
      "transport_address" : "10.254.27.191:9300",
      "host" : "10.254.27.191",
      "ip" : "10.254.27.191",
      "version" : "2.1.1",
      "build" : "40e2c53",
      "http_address" : "10.254.27.191:9200",
      "attributes" : {
        "master" : "true"
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 53852,
        "mlockall" : true
      }
    }
  }
}

ws02 is the Kibana server; the two es0? servers are elasticsearch. Any thoughts as to why mlockall is showing as false for es02 and ws02?

@clintongormley
Contributor

@onetruebob74 have a look in your error log and see what it has to say. The likeliest thing is that there is something different in the system configuration of these machines.

@elvarb

elvarb commented Feb 22, 2016

I had this problem too, and what lmgray mentioned worked for me:

I found /usr/lib/systemd/system/elasticsearch.service had the needed line commented out:

# See MAX_LOCKED_MEMORY in sysconfig, use "infinity" when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true
#LimitMEMLOCK=infinity
Uncommenting that, running "systemctl daemon-reload" and restarting elasticsearch allowed the memlock to work properly.

This should be added to the documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html#setup-configuration-memory

@jdelgad

jdelgad commented Apr 7, 2016

I had the same issue and it turned out /tmp was mounted 'noexec' (grep tmp /etc/fstab). I ended up creating a new directory and using the -Djna.tmpdir option.
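A minimal sketch of that workaround (the directory path and the use of ES_JAVA_OPTS in the sysconfig file are assumptions about the setup):

sudo mkdir -p /usr/share/elasticsearch/tmp
sudo chown elasticsearch:elasticsearch /usr/share/elasticsearch/tmp
# then in /etc/sysconfig/elasticsearch (or /etc/default/elasticsearch on deb systems):
#   ES_JAVA_OPTS="-Djna.tmpdir=/usr/share/elasticsearch/tmp"
sudo service elasticsearch restart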

@clintongormley
Contributor

These docs have been rewritten. Closing

@markwalkom
Contributor Author

@improved-broccoli

improved-broccoli commented Apr 27, 2017

Note that this could happen if your Elasticsearch node is managed by supervisord, since the latter does not seem to take ulimits into account: Supervisor/supervisor#229

For example, Elastic Beanstalk uses supervisord to manage processes.

@CHM-me-ch

I am stuck with the error "Cannot allocate memory" when calling the mlockall() function. Can anyone please help me?

I am running my program in Eclipse (2017) and the OS is Ubuntu 16.04 LTS.
I am trying to implement a real-time UDP connection (using an i210 Ethernet controller), and the real-time extension I have used is PREEMPT_RT.
I am pasting my code below:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <limits.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <string.h>

#define BUFLEN 512

#define NPACK 5
#define PORT 8089

#define SRV_IP "127.0.0.1"
/* diep(), #includes and #defines like in the server */

void diep(char *s)
{
perror(s);
exit(1);
}

void *thread_func(void *arg)   /* pthread entry points take void * and return void * */
{
(void) arg;                    /* unused */
/* RT specific stuff here */
struct sockaddr_in si_other;
int s, i, slen=sizeof(si_other);
char buf[BUFLEN];

if ((s=socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP))==-1)
diep("socket");

memset((char *) &si_other, 0, sizeof(si_other));
si_other.sin_family = AF_INET;
si_other.sin_port = htons(PORT);
if (inet_aton(SRV_IP, &si_other.sin_addr)==0) {
fprintf(stderr, "inet_aton() failed\n");
exit(1);
    }

for (i=0; i<NPACK; i++) {
printf("Sending packet %d\n", i);
sprintf(buf, "This is packet %d\n", i);
if (sendto(s, buf, BUFLEN, 0, (struct sockaddr *) &si_other, slen)==-1)
diep("sendto()");
   }

    close(s);
    return NULL;

}

int main(int argc, char* argv[])
{

struct sched_param param;
        pthread_attr_t attr;
        pthread_t thread;
        int ret;

        int *ptr;
        ptr=(int*) malloc(400*1024);
        if(ptr==NULL)
        {
        	printf("Error!Memory not allocated");
        	exit(0);
        }

        /* Lock memory*/
        if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
                printf("mlockall failed: %m\n");
                exit(-2);
        }

        /* Initialize pthread attributes (default values) */
        ret = pthread_attr_init(&attr);
        if (ret) {
                printf("init pthread attributes failed\n");
                goto out;
        }

        /* Set a specific stack size  */
        ret = pthread_attr_setstacksize (&attr, PTHREAD_STACK_MIN);
        if (ret) {
            printf("pthread setstacksize failed\n");
            goto out;
        }

        /* Set scheduler policy and priority of pthread */
        ret = pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
        if (ret) {
                printf("pthread setschedpolicy failed\n");
                goto out;
        }
        param.sched_priority = 80;
        ret = pthread_attr_setschedparam(&attr, &param);
        if (ret) {
                printf("pthread setschedparam failed\n");
                goto out;
        }
        /* Use scheduling parameters of attr */
        ret = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        if (ret) {
                printf("pthread setinheritsched failed\n");
                goto out;
        }

        /* Create a pthread with specified attributes*/
        ret = pthread_create(&thread, &attr, thread_func, NULL);
        if (ret) {
                printf("create pthread failed\n");
                goto out;
        }

        /* Join the thread and wait until it is done*/
        ret = pthread_join(thread, NULL);
        if (ret)
                printf("join pthread failed: %m\n");



out:
        return ret;

}

I am new to all this, so kindly please point out what I am doing wrong.
This is just the client-side program; I have a similar one for the server as well.

@jasontedor
Member

@CHM-me-ch This issue is not for general help with mlockall, nor is this repository even for general help. We use this repository to track issues for Elasticsearch. It looks like you're seeking general programming help. Maybe try Stack Overflow?

@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020