Change aws-ofi-plugin for EFA 1.35.0 due to regression (#466)
mhuguesaws authored Oct 25, 2024
1 parent e705fd5 commit 368f304
Showing 3 changed files with 5 additions and 5 deletions.
6 changes: 3 additions & 3 deletions micro-benchmarks/nccl-tests/README.md
@@ -37,15 +37,15 @@ The NCCL tests are packaged in a container.
> |-----------------------|-------------|---------------------------------------------------------------------------------------------|
> |`GDRCOPY_VERSION` | `v2.4.1` | [link](https://github.com/NVIDIA/gdrcopy) |
> |`EFA_INSTALLER_VERSION`| `1.35.0` | [link](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-enable) |
- > |`AWS_OFI_NCCL_VERSION` | `v1.12.0-aws`| [link](https://github.com/aws/aws-ofi-nccl) |
+ > |`AWS_OFI_NCCL_VERSION` | `v1.12.1-aws`| [link](https://github.com/aws/aws-ofi-nccl) |
> |`NCCL_VERSION` | `v2.23.4-1` | [link](https://github.com/NVIDIA/nccl) |
> |`NCCL_TESTS_VERSION` | `v2.13.10` | [link](https://github.com/NVIDIA/nccl-tests) |
### Build the container
1. Build the container image with the command below:
```bash
EFA_INSTALLER_VERSION=1.35.0
- AWS_OFI_NCCL_VERSION=v1.12.0-aws
+ AWS_OFI_NCCL_VERSION=v1.12.1-aws
NCCL_VERSION=v2.23.4-1
NCCL_TESTS_VERSION=v2.13.10
docker build -f nccl-tests.Dockerfile \
@@ -82,7 +82,7 @@ To run the NCCL tests on EKS, you will need to build the container image, then p
1. Create the ECR repository if it does not exist
```bash
EFA_INSTALLER_VERSION=1.35.0
- AWS_OFI_NCCL_VERSION=v1.12.0-aws
+ AWS_OFI_NCCL_VERSION=v1.12.1-aws
NCCL_VERSION=v2.23.4-1
NCCL_TESTS_VERSION=v2.13.10
ECR_REPOSITORY_NAME="nccl-tests"
2 changes: 1 addition & 1 deletion micro-benchmarks/nccl-tests/buildspec.yaml
@@ -4,7 +4,7 @@ env:
variables:
GDRCOPY_VERSION: "v2.4.1"
EFA_INSTALLER_VERSION: "1.35.0"
- AWS_OFI_NCCL_VERSION: "v1.12.0-aws"
+ AWS_OFI_NCCL_VERSION: "v1.12.1-aws"
NCCL_VERSION: "v2.23.4-1"
NCCL_TESTS_VERSION: "v2.13.10"
exported-variables:
2 changes: 1 addition & 1 deletion micro-benchmarks/nccl-tests/nccl-tests.Dockerfile
@@ -4,7 +4,7 @@ FROM nvidia/cuda:12.2.2-devel-ubuntu22.04

ARG GDRCOPY_VERSION=v2.4.1
ARG EFA_INSTALLER_VERSION=1.35.0
- ARG AWS_OFI_NCCL_VERSION=v1.12.0-aws
+ ARG AWS_OFI_NCCL_VERSION=v1.12.1-aws
ARG NCCL_VERSION=v2.23.4-1
ARG NCCL_TESTS_VERSION=v2.13.10
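
This commit touches the same `AWS_OFI_NCCL_VERSION` pin in three files, so a quick consistency check may be useful after a bump like this. The sketch below is not part of the repository: the function names `pinned_ofi_versions` and `check_ofi_pin` are hypothetical, and it only looks at assignment-style pins (`NAME=value` and `NAME: "value"`), so the README version table is not covered.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of this repository):
# print the distinct AWS_OFI_NCCL_VERSION values pinned in the given files.
pinned_ofi_versions() {
  # Matches `AWS_OFI_NCCL_VERSION=v...` (shell, Dockerfile ARG) and
  # `AWS_OFI_NCCL_VERSION: "v..."` (YAML), then keeps only the value.
  grep -hoE 'AWS_OFI_NCCL_VERSION[=:][[:space:]]*"?v[0-9][0-9A-Za-z.-]*' "$@" \
    | sed -E 's/.*[=:][[:space:]]*"?//' \
    | sort -u
}

# Hypothetical helper: succeeds only when exactly one distinct version is pinned.
check_ofi_pin() {
  local count
  count=$(pinned_ofi_versions "$@" | wc -l | tr -d ' ')
  [ "$count" -eq 1 ]
}
```

Assuming it is run from the repository root, a call could look like `check_ofi_pin micro-benchmarks/nccl-tests/README.md micro-benchmarks/nccl-tests/buildspec.yaml micro-benchmarks/nccl-tests/nccl-tests.Dockerfile && echo "pins agree"`.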
