Updated EFA plugin version, added a new volume for EFA plugin to mount #1069

Merged 1 commit on May 24, 2024

zachdorame (Contributor) commented Feb 27, 2024

Issue

N/A

Description of changes

Bumped the EFA plugin version to v0.5.1. This version needs /dev/infiniband mounted into the EFA plugin container, so this change adds the volume and volume mount for it.

Checklist

  • Added/modified documentation as required (such as the README.md for modified charts)
  • Incremented the chart version in Chart.yaml for the modified chart(s)
  • Manually tested. Describe what testing was done in the testing section below
  • Make sure the title of the PR is a good description that can go into the release notes

Testing

Installed the chart on a cluster. The EFA plugin functioned normally and detected the IB devices:

doramebz@bcd074666b9c ~ % k logs efa-aws-efa-k8s-device-plugin-4kh6j
2024/02/27 23:34:55 Fetching EFA devices.
2024/02/27 23:34:55 device: rdmap79s0,uverbs0,/sys/class/infiniband_verbs/uverbs0,/sys/class/infiniband/rdmap79s0

2024/02/27 23:34:55 device: rdmap80s0,uverbs1,/sys/class/infiniband_verbs/uverbs1,/sys/class/infiniband/rdmap80s0

2024/02/27 23:34:55 device: rdmap81s0,uverbs2,/sys/class/infiniband_verbs/uverbs2,/sys/class/infiniband/rdmap81s0

2024/02/27 23:34:55 device: rdmap82s0,uverbs3,/sys/class/infiniband_verbs/uverbs3,/sys/class/infiniband/rdmap82s0

2024/02/27 23:34:55 device: rdmap96s0,uverbs4,/sys/class/infiniband_verbs/uverbs4,/sys/class/infiniband/rdmap96s0

2024/02/27 23:34:55 device: rdmap97s0,uverbs5,/sys/class/infiniband_verbs/uverbs5,/sys/class/infiniband/rdmap97s0

2024/02/27 23:34:55 device: rdmap98s0,uverbs6,/sys/class/infiniband_verbs/uverbs6,/sys/class/infiniband/rdmap98s0

2024/02/27 23:34:55 device: rdmap99s0,uverbs7,/sys/class/infiniband_verbs/uverbs7,/sys/class/infiniband/rdmap99s0

2024/02/27 23:34:55 device: rdmap113s0,uverbs8,/sys/class/infiniband_verbs/uverbs8,/sys/class/infiniband/rdmap113s0

2024/02/27 23:34:55 device: rdmap114s0,uverbs9,/sys/class/infiniband_verbs/uverbs9,/sys/class/infiniband/rdmap114s0

2024/02/27 23:34:55 device: rdmap115s0,uverbs10,/sys/class/infiniband_verbs/uverbs10,/sys/class/infiniband/rdmap115s0

2024/02/27 23:34:55 device: rdmap116s0,uverbs11,/sys/class/infiniband_verbs/uverbs11,/sys/class/infiniband/rdmap116s0

2024/02/27 23:34:55 device: rdmap130s0,uverbs12,/sys/class/infiniband_verbs/uverbs12,/sys/class/infiniband/rdmap130s0

2024/02/27 23:34:55 device: rdmap131s0,uverbs13,/sys/class/infiniband_verbs/uverbs13,/sys/class/infiniband/rdmap131s0

2024/02/27 23:34:55 device: rdmap132s0,uverbs14,/sys/class/infiniband_verbs/uverbs14,/sys/class/infiniband/rdmap132s0

2024/02/27 23:34:55 device: rdmap133s0,uverbs15,/sys/class/infiniband_verbs/uverbs15,/sys/class/infiniband/rdmap133s0

2024/02/27 23:34:55 device: rdmap147s0,uverbs16,/sys/class/infiniband_verbs/uverbs16,/sys/class/infiniband/rdmap147s0

2024/02/27 23:34:55 device: rdmap148s0,uverbs17,/sys/class/infiniband_verbs/uverbs17,/sys/class/infiniband/rdmap148s0

2024/02/27 23:34:55 device: rdmap149s0,uverbs18,/sys/class/infiniband_verbs/uverbs18,/sys/class/infiniband/rdmap149s0

2024/02/27 23:34:55 device: rdmap150s0,uverbs19,/sys/class/infiniband_verbs/uverbs19,/sys/class/infiniband/rdmap150s0

2024/02/27 23:34:55 device: rdmap164s0,uverbs20,/sys/class/infiniband_verbs/uverbs20,/sys/class/infiniband/rdmap164s0

2024/02/27 23:34:55 device: rdmap165s0,uverbs21,/sys/class/infiniband_verbs/uverbs21,/sys/class/infiniband/rdmap165s0

2024/02/27 23:34:55 device: rdmap166s0,uverbs22,/sys/class/infiniband_verbs/uverbs22,/sys/class/infiniband/rdmap166s0

2024/02/27 23:34:55 device: rdmap167s0,uverbs23,/sys/class/infiniband_verbs/uverbs23,/sys/class/infiniband/rdmap167s0

2024/02/27 23:34:55 device: rdmap181s0,uverbs24,/sys/class/infiniband_verbs/uverbs24,/sys/class/infiniband/rdmap181s0

2024/02/27 23:34:55 device: rdmap182s0,uverbs25,/sys/class/infiniband_verbs/uverbs25,/sys/class/infiniband/rdmap182s0

2024/02/27 23:34:55 device: rdmap183s0,uverbs26,/sys/class/infiniband_verbs/uverbs26,/sys/class/infiniband/rdmap183s0

2024/02/27 23:34:55 device: rdmap184s0,uverbs27,/sys/class/infiniband_verbs/uverbs27,/sys/class/infiniband/rdmap184s0

2024/02/27 23:34:55 device: rdmap198s0,uverbs28,/sys/class/infiniband_verbs/uverbs28,/sys/class/infiniband/rdmap198s0

2024/02/27 23:34:55 device: rdmap199s0,uverbs29,/sys/class/infiniband_verbs/uverbs29,/sys/class/infiniband/rdmap199s0

2024/02/27 23:34:55 device: rdmap200s0,uverbs30,/sys/class/infiniband_verbs/uverbs30,/sys/class/infiniband/rdmap200s0

2024/02/27 23:34:55 device: rdmap201s0,uverbs31,/sys/class/infiniband_verbs/uverbs31,/sys/class/infiniband/rdmap201s0

2024/02/27 23:34:55 EFA Device list: [{rdmap79s0 uverbs0 /sys/class/infiniband_verbs/uverbs0 /sys/class/infiniband/rdmap79s0} {rdmap80s0 uverbs1 /sys/class/infiniband_verbs/uverbs1 /sys/class/infiniband/rdmap80s0} {rdmap81s0 uverbs2 /sys/class/infiniband_verbs/uverbs2 /sys/class/infiniband/rdmap81s0} {rdmap82s0 uverbs3 /sys/class/infiniband_verbs/uverbs3 /sys/class/infiniband/rdmap82s0} {rdmap96s0 uverbs4 /sys/class/infiniband_verbs/uverbs4 /sys/class/infiniband/rdmap96s0} {rdmap97s0 uverbs5 /sys/class/infiniband_verbs/uverbs5 /sys/class/infiniband/rdmap97s0} {rdmap98s0 uverbs6 /sys/class/infiniband_verbs/uverbs6 /sys/class/infiniband/rdmap98s0} {rdmap99s0 uverbs7 /sys/class/infiniband_verbs/uverbs7 /sys/class/infiniband/rdmap99s0} {rdmap113s0 uverbs8 /sys/class/infiniband_verbs/uverbs8 /sys/class/infiniband/rdmap113s0} {rdmap114s0 uverbs9 /sys/class/infiniband_verbs/uverbs9 /sys/class/infiniband/rdmap114s0} {rdmap115s0 uverbs10 /sys/class/infiniband_verbs/uverbs10 /sys/class/infiniband/rdmap115s0} {rdmap116s0 uverbs11 /sys/class/infiniband_verbs/uverbs11 /sys/class/infiniband/rdmap116s0} {rdmap130s0 uverbs12 /sys/class/infiniband_verbs/uverbs12 /sys/class/infiniband/rdmap130s0} {rdmap131s0 uverbs13 /sys/class/infiniband_verbs/uverbs13 /sys/class/infiniband/rdmap131s0} {rdmap132s0 uverbs14 /sys/class/infiniband_verbs/uverbs14 /sys/class/infiniband/rdmap132s0} {rdmap133s0 uverbs15 /sys/class/infiniband_verbs/uverbs15 /sys/class/infiniband/rdmap133s0} {rdmap147s0 uverbs16 /sys/class/infiniband_verbs/uverbs16 /sys/class/infiniband/rdmap147s0} {rdmap148s0 uverbs17 /sys/class/infiniband_verbs/uverbs17 /sys/class/infiniband/rdmap148s0} {rdmap149s0 uverbs18 /sys/class/infiniband_verbs/uverbs18 /sys/class/infiniband/rdmap149s0} {rdmap150s0 uverbs19 /sys/class/infiniband_verbs/uverbs19 /sys/class/infiniband/rdmap150s0} {rdmap164s0 uverbs20 /sys/class/infiniband_verbs/uverbs20 /sys/class/infiniband/rdmap164s0} {rdmap165s0 uverbs21 /sys/class/infiniband_verbs/uverbs21 
/sys/class/infiniband/rdmap165s0} {rdmap166s0 uverbs22 /sys/class/infiniband_verbs/uverbs22 /sys/class/infiniband/rdmap166s0} {rdmap167s0 uverbs23 /sys/class/infiniband_verbs/uverbs23 /sys/class/infiniband/rdmap167s0} {rdmap181s0 uverbs24 /sys/class/infiniband_verbs/uverbs24 /sys/class/infiniband/rdmap181s0} {rdmap182s0 uverbs25 /sys/class/infiniband_verbs/uverbs25 /sys/class/infiniband/rdmap182s0} {rdmap183s0 uverbs26 /sys/class/infiniband_verbs/uverbs26 /sys/class/infiniband/rdmap183s0} {rdmap184s0 uverbs27 /sys/class/infiniband_verbs/uverbs27 /sys/class/infiniband/rdmap184s0} {rdmap198s0 uverbs28 /sys/class/infiniband_verbs/uverbs28 /sys/class/infiniband/rdmap198s0} {rdmap199s0 uverbs29 /sys/class/infiniband_verbs/uverbs29 /sys/class/infiniband/rdmap199s0} {rdmap200s0 uverbs30 /sys/class/infiniband_verbs/uverbs30 /sys/class/infiniband/rdmap200s0} {rdmap201s0 uverbs31 /sys/class/infiniband_verbs/uverbs31 /sys/class/infiniband/rdmap201s0}]
2024/02/27 23:34:55 Starting FS watcher.
2024/02/27 23:34:55 Starting OS watcher.
2024/02/27 23:34:55 device: rdmap79s0,uverbs0,/sys/class/infiniband_verbs/uverbs0,/sys/class/infiniband/rdmap79s0

2024/02/27 23:34:55 device: rdmap80s0,uverbs1,/sys/class/infiniband_verbs/uverbs1,/sys/class/infiniband/rdmap80s0

2024/02/27 23:34:55 device: rdmap81s0,uverbs2,/sys/class/infiniband_verbs/uverbs2,/sys/class/infiniband/rdmap81s0

2024/02/27 23:34:55 device: rdmap82s0,uverbs3,/sys/class/infiniband_verbs/uverbs3,/sys/class/infiniband/rdmap82s0

2024/02/27 23:34:55 device: rdmap96s0,uverbs4,/sys/class/infiniband_verbs/uverbs4,/sys/class/infiniband/rdmap96s0

2024/02/27 23:34:55 device: rdmap97s0,uverbs5,/sys/class/infiniband_verbs/uverbs5,/sys/class/infiniband/rdmap97s0

2024/02/27 23:34:55 device: rdmap98s0,uverbs6,/sys/class/infiniband_verbs/uverbs6,/sys/class/infiniband/rdmap98s0

2024/02/27 23:34:55 device: rdmap99s0,uverbs7,/sys/class/infiniband_verbs/uverbs7,/sys/class/infiniband/rdmap99s0

2024/02/27 23:34:55 device: rdmap113s0,uverbs8,/sys/class/infiniband_verbs/uverbs8,/sys/class/infiniband/rdmap113s0

2024/02/27 23:34:55 device: rdmap114s0,uverbs9,/sys/class/infiniband_verbs/uverbs9,/sys/class/infiniband/rdmap114s0

2024/02/27 23:34:55 device: rdmap115s0,uverbs10,/sys/class/infiniband_verbs/uverbs10,/sys/class/infiniband/rdmap115s0

2024/02/27 23:34:55 device: rdmap116s0,uverbs11,/sys/class/infiniband_verbs/uverbs11,/sys/class/infiniband/rdmap116s0

2024/02/27 23:34:55 device: rdmap130s0,uverbs12,/sys/class/infiniband_verbs/uverbs12,/sys/class/infiniband/rdmap130s0

2024/02/27 23:34:55 device: rdmap131s0,uverbs13,/sys/class/infiniband_verbs/uverbs13,/sys/class/infiniband/rdmap131s0

2024/02/27 23:34:55 device: rdmap132s0,uverbs14,/sys/class/infiniband_verbs/uverbs14,/sys/class/infiniband/rdmap132s0

2024/02/27 23:34:55 device: rdmap133s0,uverbs15,/sys/class/infiniband_verbs/uverbs15,/sys/class/infiniband/rdmap133s0

2024/02/27 23:34:55 device: rdmap147s0,uverbs16,/sys/class/infiniband_verbs/uverbs16,/sys/class/infiniband/rdmap147s0

2024/02/27 23:34:55 device: rdmap148s0,uverbs17,/sys/class/infiniband_verbs/uverbs17,/sys/class/infiniband/rdmap148s0

2024/02/27 23:34:55 device: rdmap149s0,uverbs18,/sys/class/infiniband_verbs/uverbs18,/sys/class/infiniband/rdmap149s0

2024/02/27 23:34:55 device: rdmap150s0,uverbs19,/sys/class/infiniband_verbs/uverbs19,/sys/class/infiniband/rdmap150s0

2024/02/27 23:34:55 device: rdmap164s0,uverbs20,/sys/class/infiniband_verbs/uverbs20,/sys/class/infiniband/rdmap164s0

2024/02/27 23:34:55 device: rdmap165s0,uverbs21,/sys/class/infiniband_verbs/uverbs21,/sys/class/infiniband/rdmap165s0

2024/02/27 23:34:55 device: rdmap166s0,uverbs22,/sys/class/infiniband_verbs/uverbs22,/sys/class/infiniband/rdmap166s0

2024/02/27 23:34:55 device: rdmap167s0,uverbs23,/sys/class/infiniband_verbs/uverbs23,/sys/class/infiniband/rdmap167s0

2024/02/27 23:34:55 device: rdmap181s0,uverbs24,/sys/class/infiniband_verbs/uverbs24,/sys/class/infiniband/rdmap181s0

2024/02/27 23:34:55 device: rdmap182s0,uverbs25,/sys/class/infiniband_verbs/uverbs25,/sys/class/infiniband/rdmap182s0

2024/02/27 23:34:55 device: rdmap183s0,uverbs26,/sys/class/infiniband_verbs/uverbs26,/sys/class/infiniband/rdmap183s0

2024/02/27 23:34:55 device: rdmap184s0,uverbs27,/sys/class/infiniband_verbs/uverbs27,/sys/class/infiniband/rdmap184s0

2024/02/27 23:34:55 device: rdmap198s0,uverbs28,/sys/class/infiniband_verbs/uverbs28,/sys/class/infiniband/rdmap198s0

2024/02/27 23:34:55 device: rdmap199s0,uverbs29,/sys/class/infiniband_verbs/uverbs29,/sys/class/infiniband/rdmap199s0

2024/02/27 23:34:55 device: rdmap200s0,uverbs30,/sys/class/infiniband_verbs/uverbs30,/sys/class/infiniband/rdmap200s0

2024/02/27 23:34:55 device: rdmap201s0,uverbs31,/sys/class/infiniband_verbs/uverbs31,/sys/class/infiniband/rdmap201s0

2024/02/27 23:34:55 Starting to serve on /var/lib/kubelet/device-plugins/aws-efa-device-plugin.sock
2024/02/27 23:34:55 Registered device plugin with Kubelet
doramebz@bcd074666b9c ~ %

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

bryantbiggs (Member)

@jdn5126 / @suket22 are either of you able to review and approve this PR?

jdn5126 (Contributor) commented Apr 19, 2024

@bryantbiggs I no longer work on EKS, so I would suggest reaching out to @orsenthil

@@ -67,7 +67,12 @@ spec:
volumeMounts:
- name: device-plugin
mountPath: /var/lib/kubelet/device-plugins
- name: infiniband-volume
Member

> 0.5.1. As part of this new version, the EFA plugin needs to be mounted to /dev/infiniband;

Could you share the reference to this?

Contributor Author

From the README in https://github.com/linux-rdma/rdma-core, we need to explicitly mount /dev/infiniband since that is where rdma-core accesses device nodes.
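
For readers following the thread, a rough sketch of how such a mount could look in the DaemonSet pod spec. The volume names `device-plugin` and `infiniband-volume` and the `/var/lib/kubelet/device-plugins` mount path come from the diff hunk above; the container name, the use of `hostPath` volumes, and the host paths are assumptions for illustration and may not match the merged chart.

```yaml
# Sketch only, not the merged template.
spec:
  template:
    spec:
      containers:
        - name: aws-efa-k8s-device-plugin    # hypothetical container name
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
            - name: infiniband-volume
              mountPath: /dev/infiniband      # where rdma-core looks for device nodes
      volumes:
        - name: device-plugin
          hostPath:                           # assumption: hostPath volume
            path: /var/lib/kubelet/device-plugins
        - name: infiniband-volume
          hostPath:                           # assumption: hostPath volume
            path: /dev/infiniband
```

With a layout like this, the plugin container sees the host's uverbs device nodes under /dev/infiniband, which would be consistent with the devices listed in the testing logs.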

jchen6585 merged commit 7248e07 into aws:master on May 24, 2024
1 check passed