Set containerd LimitNOFILE to recommended value #1535
/ci
@cartermckinnon roger that! I've dispatched a workflow. 👍
@cartermckinnon the workflow that you requested has completed. 🎉
LGTM
LGTM
This reverts commit e098953.
FYI, this AMI release caused issues in production for my eks cluster.
Can we revert this PR? And is there a higher number or setting we can use for the default soft limit? edit:
This also caused issues in production for our EKS cluster running Envoy.
+1 Same for my clusters. Is there a workaround to change that using Terraform?
+1 here. The Karpenter controller updated to the new AMI and we crashed.
Yes, this resulted in a major production outage :(
Same same, Merry Xmas 🎅
The limit basically went from "infinity" to 1,024 with "infinity" looking to be 1,048,576 (aka 1024*1024) ... at least from what we saw. This seems a rather risky change to make without forewarning.
The soft limit. This is standard outside of containers (where it's been incorrectly set to
All this change has done is expose that software is incorrectly relying on bad settings in the first place. It would be better to identify what software that is and have it fix the problem at its end.
Context
In the meantime, you would be advised to work around that by setting a higher soft limit, but I would highly discourage Notably with the
The soft limit is meant to be low, only raised for processes that are aware they need it and explicitly request to raise it. Please respect that. GitHub Actions has a hard limit of 65,536 (
Workaround
If you must raise it as a workaround, do not use The cases where that is insufficient are generally documented by the software and are workload dependent. For Envoy, there is this 2019 issue regarding limits. That is a demonstration of where you'll need a sufficient hard limit. This 2022 issue also requests that Envoy raise the soft limit, but the maintainers dismiss the request. I've opened a feature request at Envoy to convey the importance of their software raising the limit internally.
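For readers who have not seen what "explicitly request to raise it" looks like in practice, here is a minimal sketch in Python using the standard `resource` module. The function name `raise_nofile_soft_limit` and the 65,536 figure in the usage lines are illustrative assumptions, not something any of the software discussed in this thread is known to use.

```python
import resource


def raise_nofile_soft_limit(wanted: int) -> int:
    """Raise this process's soft RLIMIT_NOFILE toward `wanted`.

    The request is clamped to the hard limit, and the soft limit is never
    lowered. Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    # An unlimited hard limit is reported as RLIM_INFINITY and needs no clamping.
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    before = resource.getrlimit(resource.RLIMIT_NOFILE)
    after = raise_nofile_soft_limit(65_536)  # hypothetical requirement
    print(f"soft limit: {before[0]} -> {after} (hard limit: {before[1]})")
```

A process that does this at startup keeps working under a low default soft limit, as long as the hard limit it inherits is high enough for its workload.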
I disagree with your assessment here. This may be true for the soft limit change, but this recommendation also changed the hard limit, which caused applications that legitimately request a higher ulimit than the new hard cap to fail. That is the failure mode that our workload encountered as a result of the AMI update. Raising the hard limit therefore isn't just a "workaround" as there are plenty of legitimate use cases where software may need more than 500k FDs. It would be better to recommend a higher hard cap - sure
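The failure mode described above can be reproduced in isolation. The sketch below, again in Python with the standard `resource` module, asks for a soft limit and reports whether the kernel granted it; `request_soft_limit` is a hypothetical helper name. It assumes an unprivileged process, for which a soft limit above the hard limit is rejected (CPython surfaces that as `ValueError`).

```python
import resource


def request_soft_limit(wanted: int) -> bool:
    """Ask for a soft RLIMIT_NOFILE of `wanted`; report whether it was granted."""
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    except (ValueError, OSError):
        # CPython raises ValueError when the soft limit would exceed the hard limit.
        return False
    return True


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Re-requesting the current soft limit is a no-op and should succeed.
    print("request at current soft limit:", request_soft_limit(soft))
    if hard != resource.RLIM_INFINITY:
        # One descriptor past the hard cap is expected to be refused.
        print("request above the hard cap:", request_soft_limit(hard + 1))
```

Software that treats a refused request as fatal will fail at startup once the hard cap drops below what it asks for, which matches the behaviour reported here.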
TL;DR:
In the collapsed section below, I would like to highlight an interest in knowing what use-cases are demanding more than 500k FDs per process.
Original response (verbose)
Yes, sorry about that. I was not aware that you were unable to set a limit higher than the AMI I don't have access to such environments, only what I could find online in discussions / reports and software docs.
Out of curiosity, is the software that raises the soft-limit to the hard-limit public? Does it document that requirement or is it specific to the scale of your deployment needs? Does your software actually use more than
The other main motivation behind choosing You'd have to raise the hard-limit there, so the assumption was businesses would have the talent with expertise to more easily identify this configuration need and handle it (vs the many individuals affected by the bugs It's a bit problematic when software that runs fine out of a container behaves in weird unexpected ways in a container environment due to defaults biased to enterprise. Doubling this hard limit did not seem worthwhile to match the traditional hard-limit of It's been years with
I'd appreciate you sharing some of those use cases?
My "workaround" advice refers to raising the soft-limit to
I have no real opinion with the AMI configuration here, other than discouraging
MongoDB documents 64k, Kafka documents 100k (as a minimum).
The approach systemd took involved finding what an appropriate hard limit would be with a variety of software and feedback cycles IIRC. Initially it was going to be For Docker and containerd, since they no longer set
* Update CHANGELOG.md for v20230703 AMI release (awslabs#1337) * Update CHANGELOG.md for v20230703 AMI release * Update CHANGELOG.md Co-authored-by: Carter <mckdev@amazon.com> * Update CHANGELOG.md --------- Co-authored-by: Carter <mckdev@amazon.com> * Update CHANGELOG.md (awslabs#1338) * Add logging for aws managed csi drivers (awslabs#1336) * Update CHANGELOG.md latest AMI release notes to highlight this was last 1.22 AMI (awslabs#1342) * Removing 1.22 from Makefile (awslabs#1343) * Generate version info for cached images only when is active (awslabs#1341) * Remove region names from us-iso/us-isob credential provider config (awslabs#1344) * Amazon Linux 2023 proof-of-concept (awslabs#1340) * Remove hardcoded pull_cni_from_github var (awslabs#1346) * Remove sonobuoy_e2e_registry (awslabs#1249) * Revert "avoid hard coding provisioner index array" (awslabs#1347) This reverts commit 6c16765. Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Update sync-eni-max-pods.yaml role ARN (awslabs#1350) * Add CodeCommit sync action (awslabs#1351) * update core CNI plugins version (awslabs#1308) * Update internal build config (awslabs#1353) * Update binary references (awslabs#1355) * Update CHANGELOG.md for 20230711 AMI release (awslabs#1357) * Enable discard_unpacked_layers by default (awslabs#1360) * Mount bpffs on all supported Kubernetes versions (awslabs#1349) * Cleanup /var/log/audit (awslabs#1363) * Use GitHub bot user as committer/author (awslabs#1366) * Update eni-max-pods.txt (awslabs#1365) * Update CHANGELOG.md for 20230728 AMI release (awslabs#1371) * Update eni-max-pods.txt (awslabs#1373) Co-authored-by: GitHub <noreply@github.com> * Install latest amazon-ssm-agent from S3 (awslabs#1370) * Do not set KubeletCredentialProviders feature flag for 1.28+ (awslabs#1375) * Fix bug in var doc gen (awslabs#1378) * Generate docs for GitHub Pages (awslabs#1379) * Add write permissions to deploy-docs workflow (awslabs#1381) * Force-push docs to gh-pages (awslabs#1382) * 
Cache IMDS tokens per-user (awslabs#1386) * Install latest runc 1.1.* (awslabs#1384) * Update eni-max-pods.txt (awslabs#1388) * Update binary build dates (awslabs#1390) * Fetch new IMDS token for every request (awslabs#1395) * Update CHANGELOG for v20230816 (awslabs#1396) * Update eni-max-pods.txt (awslabs#1397) * Update Makefile with latest binaries (awslabs#1403) * Add CI bot (awslabs#1402) * Disable janitor in forks (awslabs#1407) * Add note about bot authorization (awslabs#1406) * noproxy for direct communication to apiserver and timeouts of 3 seconds (awslabs#1393) * Update CHANGELOG.md for 20230825 AMI release (awslabs#1408) * Update CHANGELOG.md for 20230825 AMI release --------- Co-authored-by: Vela WU <50354807+FerrelWallis@users.noreply.github.com> * Allow --reserved-cpus kubelet arg to be used (awslabs#1405) * Install kernel-headers, kernel-devel (awslabs#1302) * Handle eventually-consistent PrivateDnsName (awslabs#1383) * Add .git-commit to archivebuild (awslabs#1411) * Use archivebuild-wrapper system (awslabs#1413) * Discover .git-commit from environment (awslabs#1418) * Update eni-max-pods.txt (awslabs#1423) Co-authored-by: GitHub <noreply@github.com> * Update eni-max-pods.txt (awslabs#1424) Co-authored-by: GitHub <noreply@github.com> * Require builder instance to use IMDSv2 (awslabs#1422) * Add release note config (awslabs#1426) * Update eni-max-pods.txt (awslabs#1429) Co-authored-by: GitHub <noreply@github.com> * Use 2023-09-14 binaries, add 1.28 target (awslabs#1431) * Update eni-max-pods.txt (awslabs#1432) Co-authored-by: GitHub <noreply@github.com> * Set pid_max to 4194304 (awslabs#1434) * Install nerdctl (awslabs#1321) * Update CHANGELOG.md for 20230919 AMI release (awslabs#1439) * Update CHANGELOG.md for 20230919 AMI release Co-authored-by: Carter <cartermckinnon@gmail.com> --------- Co-authored-by: Carter <cartermckinnon@gmail.com> * bump latest Kubernetes build target version (awslabs#1440) * fix: Tag cached image with the ECR URI for the 
target region (awslabs#1442) * Add H100 into gpu clock (awslabs#1447) * bug: incorrect region variable name (awslabs#1449) Co-authored-by: ljosyula <ljosyula@amazon.com> * Update eni-max-pods.txt (awslabs#1452) Co-authored-by: GitHub <noreply@github.com> * Update CHANGELOG.md for 20231002 AMI release (awslabs#1456) Co-authored-by: ljosyula <ljosyula@amazon.com> * Build with latest binaries by default (awslabs#1391) * Fix region in cached image names (awslabs#1461) * Add 1.28 to CI (awslabs#1464) * Add optional FIPS support (awslabs#1458) * Set remote_folder on all shell provisioners (awslabs#1462) * Pull eksctl supported versions for CI (awslabs#1465) * remove kubernetes versions file and use eksctl supported version list * recognize compression Co-authored-by: Carter <cartermckinnon@gmail.com> --------- Co-authored-by: Carter <cartermckinnon@gmail.com> * Add CHANGELOG entry placeholder (awslabs#1466) * Add named arguments to bot commands (awslabs#1463) * get-ecr-uri.sh falls back to use another region in partition if region unconfigured (awslabs#1468) * Force delete CI clusters, don't wait for pod eviction (awslabs#1472) * Add CHANGELOG workflow for new releases (awslabs#1467) * Allow more flexible kernel_version (awslabs#1469) * Add r7i to eni-max-pods.txt (awslabs#1473) Co-authored-by: GitHub <noreply@github.com> * Fix containerd slice configuration (awslabs#1437) * Correctly tag cached images for us-gov-west-1 FIPS endpoint (awslabs#1476) * Lint space errors (awslabs#1121) * Ignore commit to address space errors (awslabs#1478) * Collect more info about Amazon VPC CNI (awslabs#1245) * Update eni-max-pods.txt (awslabs#1485) Co-authored-by: GitHub <noreply@github.com> * Fail fast if we cannot determine kubelet version (awslabs#1484) kubelet is likely to fail when there is a mismatch with GLIBC that is in the image vs the one golang uses to build the kubelet. 
So fail the image right away when this happens as this specific kubelet binary will NOT work in any instance started with this image. ``` 2023-10-25T10:11:38-04:00: amazon-ebs: kubelet: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by kubelet) 2023-10-25T10:11:38-04:00: amazon-ebs: kubelet: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by kubelet) ``` Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Persist CI version-info.json as artifact (awslabs#1493) * Add new i4i sizes to eni-max-pods.txt (awslabs#1495) Co-authored-by: GitHub <noreply@github.com> * Update eni-max-pods.txt (awslabs#1497) Co-authored-by: GitHub <noreply@github.com> * Drop the FIPS related provisioners for al2023 (awslabs#1499) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Set nerdctl default namespace to k8s.io (awslabs#1488) * Update CHANGELOG.md for release v20231027 (awslabs#1502) Co-authored-by: GitHub <noreply@github.com> * Skip installing amazon-ssm-agent if already present (awslabs#1501) * Exclude automated eni-max-pods.txt PR's from release notes (awslabs#1498) * Remove extraneous space character (awslabs#1505) * Update CHANGELOG.md (awslabs#1507) * Update CHANGELOG.md to fix docker version (awslabs#1511) * Update docker to the latest 20.10 version (awslabs#1510) * Changelog entry format tweaks (awslabs#1508) * Document how to collect UserData (awslabs#1504) * Update eni-max-pods.txt (awslabs#1518) Co-authored-by: GitHub <noreply@github.com> * Update CHANGELOG.md for release v20231116 (awslabs#1521) Co-authored-by: GitHub <noreply@github.com> * Add check for ecr-fips endpoint availability (awslabs#1524) * Miscellaneous fixes from AL2023 testing (awslabs#1528) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * fix Permission denied for 99-default.link (awslabs#1529) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Install SSM agent from AL core repo by default (awslabs#1531) * Update to `containerd` 1.7 (awslabs#1516) * Capture logs for EKS 
Pod Identity Agent (awslabs#1533) * change how aws cli is installed * Update CHANGELOG.md for release v20231201 (awslabs#1538) Co-authored-by: GitHub <noreply@github.com> * AL2023 networking changes for VPC CNI compatibility (awslabs#1539) * Set containerd LimitNOFILE to recommended value (awslabs#1535) * fix networkd settings (awslabs#1540) * Update get-ecr-uri.sh with ca-west-1 account (awslabs#1542) * Install amazon packer plugin for CI (awslabs#1545) * Fix flag typo in logging (awslabs#1547) * Update CHANGELOG.md for release v20231220 (awslabs#1550) Co-authored-by: GitHub <noreply@github.com> * Revert "Set containerd LimitNOFILE to recommended value (awslabs#1535)" (awslabs#1552) This reverts commit e098953. * set ssm_agent_version after updating from upstream * Uncomment filtering for circle ci config --------- Signed-off-by: Davanum Srinivas <davanum@gmail.com> Co-authored-by: Xavier Ryan <108886506+xr1776@users.noreply.github.com> Co-authored-by: Carter <mckdev@amazon.com> Co-authored-by: jacobwolfaws <113703057+jacobwolfaws@users.noreply.github.com> Co-authored-by: Prasad Shende <prasad0896@users.noreply.github.com> Co-authored-by: camrakin <113552683+camrakin@users.noreply.github.com> Co-authored-by: Davanum Srinivas <davanum@gmail.com> Co-authored-by: Jeffrey Nelson <jdnelson@amazon.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sichaow <sichaow@amazon.com> Co-authored-by: GitHub <noreply@github.com> Co-authored-by: Vincent Marguerie <24724195+vincentmrg@users.noreply.github.com> Co-authored-by: Andrew Johnstone <andrew@ajohnstone.com> Co-authored-by: Vela WU <50354807+wwvela@users.noreply.github.com> Co-authored-by: Vela WU <50354807+FerrelWallis@users.noreply.github.com> Co-authored-by: Raghvendra Singh <90425886+raghs-aws@users.noreply.github.com> Co-authored-by: Matthew Wong <mattwon@amazon.com> Co-authored-by: Nick Baker <ndbaker1@outlook.com> Co-authored-by: ddl-retornam 
<56278673+ddl-retornam@users.noreply.github.com> Co-authored-by: Carter <cartermckinnon@gmail.com> Co-authored-by: Bryant Biggs <bryantbiggs@gmail.com> Co-authored-by: Laxmi Soumya Josyula <42261978+ljosyula@users.noreply.github.com> Co-authored-by: ljosyula <ljosyula@amazon.com> Co-authored-by: Alex Schultz <aschultz@clumio.com> Co-authored-by: Julien Baladier <julienbaladier@users.noreply.github.com> Co-authored-by: Matt <merkes@amazon.com> Co-authored-by: Zoltán Reegn <zoltan.reegn@gmail.com> Co-authored-by: donovanrost <donovan.rost@gmail.com> Co-authored-by: guessi <guessi@gmail.com> Co-authored-by: pjaudiomv <34245618+pjaudiomv@users.noreply.github.com> Co-authored-by: Edmond Ceausu <eceausu@amazon.com> Co-authored-by: Joe North <joseph@jnorth.me> Co-authored-by: Keto D. Zhang <keto.zhang@gmail.com>
* Merge with upstream v20231116 (#30) * Update CHANGELOG.md for v20230703 AMI release (awslabs#1337) * Update CHANGELOG.md for v20230703 AMI release * Update CHANGELOG.md Co-authored-by: Carter <mckdev@amazon.com> * Update CHANGELOG.md --------- Co-authored-by: Carter <mckdev@amazon.com> * Update CHANGELOG.md (awslabs#1338) * Add logging for aws managed csi drivers (awslabs#1336) * Update CHANGELOG.md latest AMI release notes to highlight this was last 1.22 AMI (awslabs#1342) * Removing 1.22 from Makefile (awslabs#1343) * Generate version info for cached images only when is active (awslabs#1341) * Remove region names from us-iso/us-isob credential provider config (awslabs#1344) * Amazon Linux 2023 proof-of-concept (awslabs#1340) * Remove hardcoded pull_cni_from_github var (awslabs#1346) * Remove sonobuoy_e2e_registry (awslabs#1249) * Revert "avoid hard coding provisioner index array" (awslabs#1347) This reverts commit 6c16765. Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Update sync-eni-max-pods.yaml role ARN (awslabs#1350) * Add CodeCommit sync action (awslabs#1351) * update core CNI plugins version (awslabs#1308) * Update internal build config (awslabs#1353) * Update binary references (awslabs#1355) * Update CHANGELOG.md for 20230711 AMI release (awslabs#1357) * Enable discard_unpacked_layers by default (awslabs#1360) * Mount bpffs on all supported Kubernetes versions (awslabs#1349) * Cleanup /var/log/audit (awslabs#1363) * Use GitHub bot user as committer/author (awslabs#1366) * Update eni-max-pods.txt (awslabs#1365) * Update CHANGELOG.md for 20230728 AMI release (awslabs#1371) * Update eni-max-pods.txt (awslabs#1373) Co-authored-by: GitHub <noreply@github.com> * Install latest amazon-ssm-agent from S3 (awslabs#1370) * Do not set KubeletCredentialProviders feature flag for 1.28+ (awslabs#1375) * Fix bug in var doc gen (awslabs#1378) * Generate docs for GitHub Pages (awslabs#1379) * Add write permissions to deploy-docs workflow (awslabs#1381) * 
Force-push docs to gh-pages (awslabs#1382) * Cache IMDS tokens per-user (awslabs#1386) * Install latest runc 1.1.* (awslabs#1384) * Update eni-max-pods.txt (awslabs#1388) * Update binary build dates (awslabs#1390) * Fetch new IMDS token for every request (awslabs#1395) * Update CHANGELOG for v20230816 (awslabs#1396) * Update eni-max-pods.txt (awslabs#1397) * Update Makefile with latest binaries (awslabs#1403) * Add CI bot (awslabs#1402) * Disable janitor in forks (awslabs#1407) * Add note about bot authorization (awslabs#1406) * noproxy for direct communication to apiserver and timeouts of 3 seconds (awslabs#1393) * Update CHANGELOG.md for 20230825 AMI release (awslabs#1408) * Update CHANGELOG.md for 20230825 AMI release --------- Co-authored-by: Vela WU <50354807+FerrelWallis@users.noreply.github.com> * Allow --reserved-cpus kubelet arg to be used (awslabs#1405) * Install kernel-headers, kernel-devel (awslabs#1302) * Handle eventually-consistent PrivateDnsName (awslabs#1383) * Add .git-commit to archivebuild (awslabs#1411) * Use archivebuild-wrapper system (awslabs#1413) * Discover .git-commit from environment (awslabs#1418) * Update eni-max-pods.txt (awslabs#1423) Co-authored-by: GitHub <noreply@github.com> * Update eni-max-pods.txt (awslabs#1424) Co-authored-by: GitHub <noreply@github.com> * Require builder instance to use IMDSv2 (awslabs#1422) * Add release note config (awslabs#1426) * Update eni-max-pods.txt (awslabs#1429) Co-authored-by: GitHub <noreply@github.com> * Use 2023-09-14 binaries, add 1.28 target (awslabs#1431) * Update eni-max-pods.txt (awslabs#1432) Co-authored-by: GitHub <noreply@github.com> * Set pid_max to 4194304 (awslabs#1434) * Install nerdctl (awslabs#1321) * Update CHANGELOG.md for 20230919 AMI release (awslabs#1439) * Update CHANGELOG.md for 20230919 AMI release Co-authored-by: Carter <cartermckinnon@gmail.com> --------- Co-authored-by: Carter <cartermckinnon@gmail.com> * bump latest Kubernetes build target version (awslabs#1440) * fix: 
Tag cached image with the ECR URI for the target region (awslabs#1442) * Add H100 into gpu clock (awslabs#1447) * bug: incorrect region variable name (awslabs#1449) Co-authored-by: ljosyula <ljosyula@amazon.com> * Update eni-max-pods.txt (awslabs#1452) Co-authored-by: GitHub <noreply@github.com> * Update CHANGELOG.md for 20231002 AMI release (awslabs#1456) Co-authored-by: ljosyula <ljosyula@amazon.com> * Build with latest binaries by default (awslabs#1391) * Fix region in cached image names (awslabs#1461) * Add 1.28 to CI (awslabs#1464) * Add optional FIPS support (awslabs#1458) * Set remote_folder on all shell provisioners (awslabs#1462) * Pull eksctl supported versions for CI (awslabs#1465) * remove kubernetes versions file and use eksctl supported version list * recognize compression Co-authored-by: Carter <cartermckinnon@gmail.com> --------- Co-authored-by: Carter <cartermckinnon@gmail.com> * Add CHANGELOG entry placeholder (awslabs#1466) * Add named arguments to bot commands (awslabs#1463) * get-ecr-uri.sh falls back to use another region in partition if region unconfigured (awslabs#1468) * Force delete CI clusters, don't wait for pod eviction (awslabs#1472) * Add CHANGELOG workflow for new releases (awslabs#1467) * Allow more flexible kernel_version (awslabs#1469) * Add r7i to eni-max-pods.txt (awslabs#1473) Co-authored-by: GitHub <noreply@github.com> * Fix containerd slice configuration (awslabs#1437) * Correctly tag cached images for us-gov-west-1 FIPS endpoint (awslabs#1476) * Lint space errors (awslabs#1121) * Ignore commit to address space errors (awslabs#1478) * Collect more info about Amazon VPC CNI (awslabs#1245) * Update eni-max-pods.txt (awslabs#1485) Co-authored-by: GitHub <noreply@github.com> * Fail fast if we cannot determine kubelet version (awslabs#1484) kubelet is likely to fail when there is a mismatch with GLIBC that is in the image vs the one golang uses to build the kubelet. 
So fail the image right away when this happens as this specific kubelet binary will NOT work in any instance started with this image. ``` 2023-10-25T10:11:38-04:00: amazon-ebs: kubelet: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by kubelet) 2023-10-25T10:11:38-04:00: amazon-ebs: kubelet: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by kubelet) ``` Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Persist CI version-info.json as artifact (awslabs#1493) * Add new i4i sizes to eni-max-pods.txt (awslabs#1495) Co-authored-by: GitHub <noreply@github.com> * Update eni-max-pods.txt (awslabs#1497) Co-authored-by: GitHub <noreply@github.com> * Drop the FIPS related provisioners for al2023 (awslabs#1499) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Set nerdctl default namespace to k8s.io (awslabs#1488) * Update CHANGELOG.md for release v20231027 (awslabs#1502) Co-authored-by: GitHub <noreply@github.com> * Skip installing amazon-ssm-agent if already present (awslabs#1501) * Exclude automated eni-max-pods.txt PR's from release notes (awslabs#1498) * Remove extraneous space character (awslabs#1505) * Update CHANGELOG.md (awslabs#1507) * Update CHANGELOG.md to fix docker version (awslabs#1511) * Update docker to the latest 20.10 version (awslabs#1510) * Changelog entry format tweaks (awslabs#1508) * Document how to collect UserData (awslabs#1504) * Update Fluence changelog * Update what kubernetes ami will be build --------- Signed-off-by: Davanum Srinivas <davanum@gmail.com> Co-authored-by: Xavier Ryan <108886506+xr1776@users.noreply.github.com> Co-authored-by: Carter <mckdev@amazon.com> Co-authored-by: jacobwolfaws <113703057+jacobwolfaws@users.noreply.github.com> Co-authored-by: Prasad Shende <prasad0896@users.noreply.github.com> Co-authored-by: camrakin <113552683+camrakin@users.noreply.github.com> Co-authored-by: Davanum Srinivas <davanum@gmail.com> Co-authored-by: Jeffrey Nelson <jdnelson@amazon.com> Co-authored-by: 
github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sichaow <sichaow@amazon.com> Co-authored-by: GitHub <noreply@github.com> Co-authored-by: Vincent Marguerie <24724195+vincentmrg@users.noreply.github.com> Co-authored-by: Andrew Johnstone <andrew@ajohnstone.com> Co-authored-by: Vela WU <50354807+wwvela@users.noreply.github.com> Co-authored-by: Vela WU <50354807+FerrelWallis@users.noreply.github.com> Co-authored-by: Raghvendra Singh <90425886+raghs-aws@users.noreply.github.com> Co-authored-by: Matthew Wong <mattwon@amazon.com> Co-authored-by: Nick Baker <ndbaker1@outlook.com> Co-authored-by: ddl-retornam <56278673+ddl-retornam@users.noreply.github.com> Co-authored-by: Carter <cartermckinnon@gmail.com> Co-authored-by: Bryant Biggs <bryantbiggs@gmail.com> Co-authored-by: Laxmi Soumya Josyula <42261978+ljosyula@users.noreply.github.com> Co-authored-by: ljosyula <ljosyula@amazon.com> Co-authored-by: Alex Schultz <aschultz@clumio.com> Co-authored-by: Julien Baladier <julienbaladier@users.noreply.github.com> Co-authored-by: Matt <merkes@amazon.com> Co-authored-by: Zoltán Reegn <zoltan.reegn@gmail.com> Co-authored-by: donovanrost <donovan.rost@gmail.com> Co-authored-by: guessi <guessi@gmail.com> Co-authored-by: pjaudiomv <34245618+pjaudiomv@users.noreply.github.com> Co-authored-by: Edmond Ceausu <eceausu@amazon.com> * Add awscli to build step (#31) * Update CHANGELOG.md for v20230703 AMI release (awslabs#1337) * Update CHANGELOG.md for v20230703 AMI release * Update CHANGELOG.md Co-authored-by: Carter <mckdev@amazon.com> * Update CHANGELOG.md --------- Co-authored-by: Carter <mckdev@amazon.com> * Update CHANGELOG.md (awslabs#1338) * Add logging for aws managed csi drivers (awslabs#1336) * Update CHANGELOG.md latest AMI release notes to highlight this was last 1.22 AMI (awslabs#1342) * Removing 1.22 from Makefile (awslabs#1343) * Generate version info for cached images only when is active (awslabs#1341) * Remove region names from 
us-iso/us-isob credential provider config (awslabs#1344) * Amazon Linux 2023 proof-of-concept (awslabs#1340) * Remove hardcoded pull_cni_from_github var (awslabs#1346) * Remove sonobuoy_e2e_registry (awslabs#1249) * Revert "avoid hard coding provisioner index array" (awslabs#1347) This reverts commit 6c16765. Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Update sync-eni-max-pods.yaml role ARN (awslabs#1350) * Add CodeCommit sync action (awslabs#1351) * update core CNI plugins version (awslabs#1308) * Update internal build config (awslabs#1353) * Update binary references (awslabs#1355) * Update CHANGELOG.md for 20230711 AMI release (awslabs#1357) * Enable discard_unpacked_layers by default (awslabs#1360) * Mount bpffs on all supported Kubernetes versions (awslabs#1349) * Cleanup /var/log/audit (awslabs#1363) * Use GitHub bot user as committer/author (awslabs#1366) * Update eni-max-pods.txt (awslabs#1365) * Update CHANGELOG.md for 20230728 AMI release (awslabs#1371) * Update eni-max-pods.txt (awslabs#1373) Co-authored-by: GitHub <noreply@github.com> * Install latest amazon-ssm-agent from S3 (awslabs#1370) * Do not set KubeletCredentialProviders feature flag for 1.28+ (awslabs#1375) * Fix bug in var doc gen (awslabs#1378) * Generate docs for GitHub Pages (awslabs#1379) * Add write permissions to deploy-docs workflow (awslabs#1381) * Force-push docs to gh-pages (awslabs#1382) * Cache IMDS tokens per-user (awslabs#1386) * Install latest runc 1.1.* (awslabs#1384) * Update eni-max-pods.txt (awslabs#1388) * Update binary build dates (awslabs#1390) * Fetch new IMDS token for every request (awslabs#1395) * Update CHANGELOG for v20230816 (awslabs#1396) * Update eni-max-pods.txt (awslabs#1397) * Update Makefile with latest binaries (awslabs#1403) * Add CI bot (awslabs#1402) * Disable janitor in forks (awslabs#1407) * Add note about bot authorization (awslabs#1406) * noproxy for direct communication to apiserver and timeouts of 3 seconds (awslabs#1393) * Update 
CHANGELOG.md for 20230825 AMI release (awslabs#1408) * Update CHANGELOG.md for 20230825 AMI release --------- Co-authored-by: Vela WU <50354807+FerrelWallis@users.noreply.github.com> * Allow --reserved-cpus kubelet arg to be used (awslabs#1405) * Install kernel-headers, kernel-devel (awslabs#1302) * Handle eventually-consistent PrivateDnsName (awslabs#1383) * Add .git-commit to archivebuild (awslabs#1411) * Use archivebuild-wrapper system (awslabs#1413) * Discover .git-commit from environment (awslabs#1418) * Update eni-max-pods.txt (awslabs#1423) Co-authored-by: GitHub <noreply@github.com> * Update eni-max-pods.txt (awslabs#1424) Co-authored-by: GitHub <noreply@github.com> * Require builder instance to use IMDSv2 (awslabs#1422) * Add release note config (awslabs#1426) * Update eni-max-pods.txt (awslabs#1429) Co-authored-by: GitHub <noreply@github.com> * Use 2023-09-14 binaries, add 1.28 target (awslabs#1431) * Update eni-max-pods.txt (awslabs#1432) Co-authored-by: GitHub <noreply@github.com> * Set pid_max to 4194304 (awslabs#1434) * Install nerdctl (awslabs#1321) * Update CHANGELOG.md for 20230919 AMI release (awslabs#1439) * Update CHANGELOG.md for 20230919 AMI release Co-authored-by: Carter <cartermckinnon@gmail.com> --------- Co-authored-by: Carter <cartermckinnon@gmail.com> * bump latest Kubernetes build target version (awslabs#1440) * fix: Tag cached image with the ECR URI for the target region (awslabs#1442) * Add H100 into gpu clock (awslabs#1447) * bug: incorrect region variable name (awslabs#1449) Co-authored-by: ljosyula <ljosyula@amazon.com> * Update eni-max-pods.txt (awslabs#1452) Co-authored-by: GitHub <noreply@github.com> * Update CHANGELOG.md for 20231002 AMI release (awslabs#1456) Co-authored-by: ljosyula <ljosyula@amazon.com> * Build with latest binaries by default (awslabs#1391) * Fix region in cached image names (awslabs#1461) * Add 1.28 to CI (awslabs#1464) * Add optional FIPS support (awslabs#1458) * Set remote_folder on all shell provisioners 
(awslabs#1462) * Pull eksctl supported versions for CI (awslabs#1465) * remove kubernetes versions file and use eksctl supported version list * recognize compression Co-authored-by: Carter <cartermckinnon@gmail.com> --------- Co-authored-by: Carter <cartermckinnon@gmail.com> * Add CHANGELOG entry placeholder (awslabs#1466) * Add named arguments to bot commands (awslabs#1463) * get-ecr-uri.sh falls back to use another region in partition if region unconfigured (awslabs#1468) * Force delete CI clusters, don't wait for pod eviction (awslabs#1472) * Add CHANGELOG workflow for new releases (awslabs#1467) * Allow more flexible kernel_version (awslabs#1469) * Add r7i to eni-max-pods.txt (awslabs#1473) Co-authored-by: GitHub <noreply@github.com> * Fix containerd slice configuration (awslabs#1437) * Correctly tag cached images for us-gov-west-1 FIPS endpoint (awslabs#1476) * Lint space errors (awslabs#1121) * Ignore commit to address space errors (awslabs#1478) * Collect more info about Amazon VPC CNI (awslabs#1245) * Update eni-max-pods.txt (awslabs#1485) Co-authored-by: GitHub <noreply@github.com> * Fail fast if we cannot determine kubelet version (awslabs#1484) kubelet is likely to fail when there is a mismatch with GLIBC that is in the image vs the one golang uses to build the kubelet. So fail the image right away when this happens as this specific kubelet binary will NOT work in any instance started with this image. 
``` 2023-10-25T10:11:38-04:00: amazon-ebs: kubelet: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by kubelet) 2023-10-25T10:11:38-04:00: amazon-ebs: kubelet: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by kubelet) ``` Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Persist CI version-info.json as artifact (awslabs#1493) * Add new i4i sizes to eni-max-pods.txt (awslabs#1495) Co-authored-by: GitHub <noreply@github.com> * Update eni-max-pods.txt (awslabs#1497) Co-authored-by: GitHub <noreply@github.com> * Drop the FIPS related provisioners for al2023 (awslabs#1499) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Set nerdctl default namespace to k8s.io (awslabs#1488) * Update CHANGELOG.md for release v20231027 (awslabs#1502) Co-authored-by: GitHub <noreply@github.com> * Skip installing amazon-ssm-agent if already present (awslabs#1501) * Exclude automated eni-max-pods.txt PR's from release notes (awslabs#1498) * Remove extraneous space character (awslabs#1505) * Update CHANGELOG.md (awslabs#1507) * Update CHANGELOG.md to fix docker version (awslabs#1511) * Update docker to the latest 20.10 version (awslabs#1510) * Changelog entry format tweaks (awslabs#1508) * Document how to collect UserData (awslabs#1504) * Update eni-max-pods.txt (awslabs#1518) Co-authored-by: GitHub <noreply@github.com> * Update CHANGELOG.md for release v20231116 (awslabs#1521) Co-authored-by: GitHub <noreply@github.com> * Add check for ecr-fips endpoint availability (awslabs#1524) * Miscellaneous fixes from AL2023 testing (awslabs#1528) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * fix Permission denied for 99-default.link (awslabs#1529) Signed-off-by: Davanum Srinivas <davanum@gmail.com> * Install SSM agent from AL core repo by default (awslabs#1531) * Update to `containerd` 1.7 (awslabs#1516) * Capture logs for EKS Pod Identity Agent (awslabs#1533) * change how aws cli is installed * Update CHANGELOG.md for release v20231201 (awslabs#1538) 
Co-authored-by: GitHub <noreply@github.com> * AL2023 networking changes for VPC CNI compatibility (awslabs#1539) * Set containerd LimitNOFILE to recommended value (awslabs#1535) * fix networkd settings (awslabs#1540) * Update get-ecr-uri.sh with ca-west-1 account (awslabs#1542) * Install amazon packer plugin for CI (awslabs#1545) * Fix flag typo in logging (awslabs#1547) * Update CHANGELOG.md for release v20231220 (awslabs#1550) Co-authored-by: GitHub <noreply@github.com> * Revert "Set containerd LimitNOFILE to recommended value (awslabs#1535)" (awslabs#1552) This reverts commit e098953. * set ssm_agent_version after updating from upstream * Uncomment filtering for circle ci config --------- Signed-off-by: Davanum Srinivas <davanum@gmail.com> Co-authored-by: Xavier Ryan <108886506+xr1776@users.noreply.github.com> Co-authored-by: Carter <mckdev@amazon.com> Co-authored-by: jacobwolfaws <113703057+jacobwolfaws@users.noreply.github.com> Co-authored-by: Prasad Shende <prasad0896@users.noreply.github.com> Co-authored-by: camrakin <113552683+camrakin@users.noreply.github.com> Co-authored-by: Davanum Srinivas <davanum@gmail.com> Co-authored-by: Jeffrey Nelson <jdnelson@amazon.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sichaow <sichaow@amazon.com> Co-authored-by: GitHub <noreply@github.com> Co-authored-by: Vincent Marguerie <24724195+vincentmrg@users.noreply.github.com> Co-authored-by: Andrew Johnstone <andrew@ajohnstone.com> Co-authored-by: Vela WU <50354807+wwvela@users.noreply.github.com> Co-authored-by: Vela WU <50354807+FerrelWallis@users.noreply.github.com> Co-authored-by: Raghvendra Singh <90425886+raghs-aws@users.noreply.github.com> Co-authored-by: Matthew Wong <mattwon@amazon.com> Co-authored-by: Nick Baker <ndbaker1@outlook.com> Co-authored-by: ddl-retornam <56278673+ddl-retornam@users.noreply.github.com> Co-authored-by: Carter <cartermckinnon@gmail.com> Co-authored-by: Bryant Biggs 
<bryantbiggs@gmail.com> Co-authored-by: Laxmi Soumya Josyula <42261978+ljosyula@users.noreply.github.com> Co-authored-by: ljosyula <ljosyula@amazon.com> Co-authored-by: Alex Schultz <aschultz@clumio.com> Co-authored-by: Julien Baladier <julienbaladier@users.noreply.github.com> Co-authored-by: Matt <merkes@amazon.com> Co-authored-by: Zoltán Reegn <zoltan.reegn@gmail.com> Co-authored-by: donovanrost <donovan.rost@gmail.com> Co-authored-by: guessi <guessi@gmail.com> Co-authored-by: pjaudiomv <34245618+pjaudiomv@users.noreply.github.com> Co-authored-by: Edmond Ceausu <eceausu@amazon.com> Co-authored-by: Joe North <joseph@jnorth.me> Co-authored-by: Keto D. Zhang <keto.zhang@gmail.com>
…" (awslabs#1552) This reverts commit e098953.
Description of changes:
The service unit in AL2 currently sets `LimitNOFILE=infinity` on `containerd`. This is known to cause issues, and the recommendation is to set it explicitly to `LimitNOFILE=1024:524288` on systems using `systemd` older than v240.

More info: containerd/containerd#8924
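To make the `soft:hard` notation in the description concrete, here is a minimal sketch in Python of how a `LimitNOFILE=` value breaks down into a soft and a hard limit. The function name `parse_limit_nofile` and the use of `None` to stand for `infinity` are assumptions made for illustration; this is not code from the PR or from `systemd`, and it only models the documented value forms (a single value, a `soft:hard` pair, or `infinity`).

```python
from typing import Optional, Tuple

# Hypothetical representation: None stands for "infinity" (no explicit limit).
Limit = Optional[int]


def _parse_one(token: str) -> Limit:
    """Parse a single limit token: a non-negative integer or 'infinity'."""
    token = token.strip()
    if token == "infinity":
        return None
    value = int(token)  # raises ValueError on malformed input
    if value < 0:
        raise ValueError(f"limit must be non-negative, got {value}")
    return value


def parse_limit_nofile(setting: str) -> Tuple[Limit, Limit]:
    """Split a LimitNOFILE= value into (soft, hard).

    A single value applies to both limits; 'soft:hard' sets them separately.
    """
    soft, sep, hard = setting.partition(":")
    if not sep:
        limit = _parse_one(soft)
        return limit, limit
    return _parse_one(soft), _parse_one(hard)


if __name__ == "__main__":
    # The value this PR proposes, the previous value, and a single-number form.
    for setting in ("1024:524288", "infinity", "65536"):
        print(setting, "->", parse_limit_nofile(setting))
```

Read this way, the proposed `1024:524288` keeps the hard limit high while lowering the soft limit to 1024, so a process that needs more file descriptors has to raise its own soft limit, up to the hard limit.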
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but it is recommended and may reduce review time and time to merge.