
Raise docker default ulimit for nofile to 65535 #278

Closed
max-rocket-internet opened this issue Jun 3, 2019 · 18 comments

@max-rocket-internet
Contributor

In the latest AMI version, v20190327, the nofile ulimit in /etc/sysconfig/docker is set to 1024:4096:

OPTIONS="--default-ulimit nofile=1024:4096"

We've already hit this limit with some Java applications and have raised the limit to 65535 in user-data:

sed -i 's/^OPTIONS=.*/OPTIONS=\\\"--default-ulimit nofile=65535:65535\\\"/' /etc/sysconfig/docker && systemctl restart docker
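
(The extra backslashes above appear to come from escaping in an outer templating layer; written directly in a plain user-data shell script, the equivalent would look roughly like this sketch.)

#!/bin/bash
# Sketch only: raise the Docker default nofile ulimit on a worker node at boot.
# Assumes the AMI still ships /etc/sysconfig/docker with an OPTIONS= line.
sed -i 's/^OPTIONS=.*/OPTIONS="--default-ulimit nofile=65535:65535"/' /etc/sysconfig/docker
systemctl restart docker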

Question: Isn't 4096 a little conservative for an EKS node? Is there anything wrong with just setting this to 65535 by default in the AMI?

@whereisaaron

whereisaaron commented Jun 4, 2019

@max-rocket-internet it is supposed to be 65535, and originally it was, but a series of PR mishaps meant several AMI releases shipped with changes that reduced it to 4096 or 8192.

The fun started in #186 where someone thought the setting was lower and added a PR to ‘raise’ it to 8192. This actually reduced it from 65535 to 8192, which immediately caused problems (#193). People tried to revert that change in #206 but that didn’t work. Meanwhile a fix in #205 got closed in favor of #206. But #206 didn’t work because the latest commits weren’t being included in the AMI builds. So fresh builds in #233 tried to restore the #206 reversion of #186, while the ongoing issue was tracked in #234.

In theory the current latest AMIs should be back to 65535. Any fixed versions would be dated 31 March or later, as the problem still wasn't fixed on 29 March. And even after that I heard the GPU AMIs still had the issue. #233 (comment)

@mogren

mogren commented Jun 4, 2019

Hehe, thanks @whereisaaron for the comprehensive history write up of this issue! 👍

@max-rocket-internet
Contributor Author

Haha, thanks for the rundown, @whereisaaron

I've seen some of the previous issues around Elasticsearch but...

In theory the current latest AMIs should be back to 65535.

...is not true. We are running AMI version v20190327 and it's not fixed.

Any fixed versions were dated 31 March or later, as the problem still wasn’t fixed on 29 March

But there is no released version later than the AMI we are using!?

@thomasjungblut

We are running AMI version v20190327 and it's not fixed.

I can also confirm that this is not fixed in v20190327.
Adding to the history, we actually have a wildly different number here, which is a bit too small for our gateway:

/ambassador $ ulimit -a
-f: file size (blocks)             unlimited
-t: cpu time (seconds)             unlimited
-d: data seg size (kb)             unlimited
-s: stack size (kb)                8192
-c: core file size (blocks)        unlimited
-m: resident set size (kb)         unlimited
-l: locked memory (kb)             64
-p: processes                      unlimited
-n: file descriptors               1024
-v: address space (kb)             unlimited
-w: locks                          unlimited
-e: scheduling priority            0
-r: real-time priority             0

@whereisaaron

@max-rocket-internet

Any fixed versions were dated 31 March or later, as the problem still wasn’t fixed on 29 March

But there is no released version later than the [29 March] AMI we are using!?

That's a very good question, @max-rocket-internet! Seems like someone left and turned out the lights. No AMIs have been published for a while.

@max-rocket-internet changed the title from "Consider raising docker default ulimit nofile?" to "Raise docker default ulimit for nofile to 65535" on Jun 7, 2019
@max-rocket-internet
Contributor Author

max-rocket-internet commented Jun 7, 2019

OK I've changed the title of this issue to reflect a request to change the limit.

Now I'm curious: why is the OPTIONS line even in /etc/sysconfig/docker when it should have been deleted, according to here?

How about I make a PR to add a /etc/sysconfig/docker file to this repo, in /files, and then copy this file to the image in install-worker.sh in the same way that daemon.json is handled?
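
Roughly, the sketch I have in mind for install-worker.sh would be something like the below (the file name and the $TEMPLATE_DIR variable are assumptions on my part, mirroring how daemon.json is copied):

# Hypothetical snippet for install-worker.sh; path and variable names are assumptions.
sudo cp $TEMPLATE_DIR/docker-sysconfig /etc/sysconfig/docker
sudo chown root:root /etc/sysconfig/docker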

@max-rocket-internet
Contributor Author

@mogren @micahhausler should I make a PR as I mentioned above? Or do you have something else in mind?

@echoboomer

We have been running into this error on EKS nodes:

Jun 14 17:03:58 ip-10-128-13-134.ec2.internal kubelet[4364]: E0614 17:03:58.836313    4364 manager.go:337] Registration of the raw container factory failed: inotify_init: too many open files
Jun 14 17:03:58 ip-10-128-13-134.ec2.internal kubelet[4364]: F0614 17:03:58.836922    4364 kubelet.go:1344] Failed to start cAdvisor inotify_init: too many open files

The fix for us was to actually apply these changes in our userdata scripts where we bootstrap our EKS nodes in Terraform:

echo 'fs.inotify.max_user_instances = 8192' > /etc/sysctl.d/98-inotifyfix.conf
echo 'fs.inotify.max_user_watches = 524288' >> /etc/sysctl.d/98-inotifyfix.conf
sysctl --system

This overrides the 99-amazon.conf and, after applying, resolved our issue immediately. I think this needs to be fixed in the AMI as well.

@mcrute
Contributor

mcrute commented Jul 3, 2019

Is this still an issue? The latest AMI version, currently v20190614, doesn't have any additional ulimit configuration. The OPTIONS line is no longer in /etc/sysconfig/docker and I see no other ulimit tweaks in this repo.

The default systemd unit file for dockerd in /usr/lib/systemd/system/docker.service (elided for brevity) sets the defaults to infinity:

[Service]
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

Validated by looking at /proc/$(pidof dockerd)/limits:

Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            65536                65536                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       3830                 3830                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

I plan to resolve this issue around July 10 if there is no confirmation that this is still an issue.

@echoboomer the inotify limit seems like it's independent of this issue. Would you mind opening that in a new issue so we can track a fix for it outside of this one? Thanks.

@max-rocket-internet
Contributor Author

Looks resolved to me:

$ ssh ip-10-0-27-91.eu-west-1.compute.internal
The authenticity of host 'ip-10-0-27-91.eu-west-1.compute.internal (10.0.27.91)' can't be established.
ECDSA key fingerprint is SHA256:V5QfuYz9Nlw0IhA21gYOZYCiNoEF+KsH+KB9XxfQOdw.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ip-10-0-27-91.eu-west-1.compute.internal,10.0.27.91' (ECDSA) to the list of known hosts.

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
No packages needed for security; 3 packages available
Run "sudo yum update" to apply all updates.
[max.williams@ip-10-0-27-91 ~]$ sudo -i
[root@ip-10-0-27-91 ~]# ulimit
unlimited

Let's hope this issue doesn't come back again 🙏

@ironsalsa

Would it be possible to raise it again, to 82920, for TiKV? We're trying to run the TiDB database stack, but it requires higher ulimits. See pingcap/tidb-operator#299 for what I'm talking about. Considering I've never run into ulimit problems like this with any other K8s provider, it shouldn't be set so low that it prevents us from running applications.

@thjaeckle

@max-rocket-internet sorry for reviving this discussion, but I am confused:

  • the docker "nofile" limits are indeed fixed:
    # docker run -it --rm ubuntu bash -c "ulimit -n -H"
    65536
    
  • but the "nofile" limits of the host (the EC2 based EKS worker) are not:
    # ulimit -n -H
    4096
    # ulimit -n
    1024
    

From reading the discussion history here, I think those two different values were also being mixed up.
As far as I understand, the host limits are ultimately the ones that count, even if the docker process defines higher limits.

So this is not yet fixed, is it?

@sparrc

sparrc commented Apr 1, 2020

@thjaeckle ulimit is user-based. Was the output of "ulimit -a" above run as ec2-user?

@lummie

lummie commented May 21, 2020

Would it be possible to raise it again, to 82920, for TiKV? We're trying to run the TiDB database stack, but it requires higher ulimits. See pingcap/tidb-operator#299 for what I'm talking about. Considering I've never run into ulimit problems like this with any other K8s provider, it shouldn't be set so low that it prevents us from running applications.

I'm hitting this too while trying to deploy TiKV.
ulimit -n in a pod on EKS reports 65536, but TiKV won't start, saying it expects >= 82920.
For comparison, a local kind cluster AND an Azure K8s cluster both have it set to 1048576.
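
For anyone comparing clusters, a quick way to check the effective limit from inside a running pod is something like the following (the pod and namespace names are placeholders):

# Placeholder names; prints the soft nofile limit the container actually gets.
kubectl exec -n <namespace> <pod-name> -- sh -c 'ulimit -n'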

@ismailyenigul

Hi
I am using AMI ID amazon-eks-node-1.16-v20200609 (ami-0a3879f5c5e608624) on an EKS cluster and have the same issue. I got Error: ENOSPC: System limit for number of file watchers reached while building a React app in the pod.

# sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 8192

@ajcann

ajcann commented Jul 7, 2020

We're seeing this issue again in the EKS-optimized GPU images.

[ec2-user ~]$ cat /etc/sysconfig/docker
# The max number of open files for the daemon itself, and all
# running containers.  The default value of 1048576 mirrors the value
# used by the systemd service unit.
DAEMON_MAXFILES=1048576

# Additional startup options for the Docker daemon, for example:
# OPTIONS="--ip-forward=true --iptables=true"
# By default we limit the number of open files per container
OPTIONS="--bridge=none --default-ulimit nofile=2048:8192 --log-driver=json-file --log-opt max-size=10m --log-opt max-file=10 --live-restore=true --max-concurrent-downloads=10"

# How many seconds the sysvinit script waits for the pidfile to appear
# when starting the daemon.
DAEMON_PIDFILE_TIMEOUT=10


[ec2-user ~]$ sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 8192

@Jeffwan
Contributor

Jeffwan commented Jul 9, 2020

@ajcann Can you share the GPU AMI ID? I think I passed the exact same docker default options to the GPU AMI as well.

Update: the container-level setting on the GPU AMI is still 2048:8192. The basic AMI ulimit is 1048576 (we didn't specify a ulimit, so it inherits DAEMON_MAXFILES instead).

@Jeffwan
Contributor

Jeffwan commented Jul 9, 2020

@thjaeckle

The Linux kernel puts nofile under cgroup control, and the two limits are largely independent. We don't necessarily need to change the host ulimit.
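
As a rough sanity check (a sketch, not something from this thread), you can see on a worker node that a container's nofile limit is applied per container by Docker and can exceed whatever the host shell reports:

# The interactive shell on the node may report a low limit...
ulimit -n
# ...while a container started with an explicit --ulimit gets its own, higher limit.
docker run --rm --ulimit nofile=1048576:1048576 busybox sh -c 'ulimit -n'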
