diff --git a/docs/additional-functionality/rapids-shuffle.md b/docs/additional-functionality/rapids-shuffle.md
index ca636bc22b9..95e66747504 100644
--- a/docs/additional-functionality/rapids-shuffle.md
+++ b/docs/additional-functionality/rapids-shuffle.md
@@ -43,7 +43,7 @@ be installed on the host and inside Docker containers (if not baremetal). A host
 requirements, like the MLNX_OFED driver and `nv_peer_mem` kernel module.
 
 The minimum UCX requirement for the RAPIDS Shuffle Manager is
-[UCX 1.11.2](https://github.com/openucx/ucx/releases/tag/v1.11.2).
+[UCX 1.12.1](https://github.com/openucx/ucx/releases/tag/v1.12.1).
 
 #### Baremetal
 
@@ -73,47 +73,40 @@ The minimum UCX requirement for the RAPIDS Shuffle Manager is
    further.
 
 2. Fetch and install the UCX package for your OS from:
-   [UCX 1.11.2](https://github.com/openucx/ucx/releases/tag/v1.11.2).
-
-   NOTE: Please install the artifact with the newest CUDA 11.x version (for UCX 1.11.2 please
-   pick CUDA 11.2) as CUDA 11 introduced [CUDA Enhanced Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#enhanced-compat-minor-releases).
-   Starting with UCX 1.12, UCX will stop publishing individual artifacts for each minor version of CUDA.
-
-   Please refer to our [FAQ](../FAQ.md#what-hardware-is-supported) for caveats with
-   CUDA Enhanced Compatibility.
+   [UCX 1.12.1](https://github.com/openucx/ucx/releases/tag/v1.12.1).
 
    RDMA packages have extra requirements that should be satisfied by MLNX_OFED.
 
 ##### CentOS UCX RPM
-The UCX packages for CentOS 7 and 8 are divided into different RPMs. For example, UCX 1.11.2
+The UCX packages for CentOS 7 and 8 are divided into different RPMs. For example, UCX 1.12.1
 available at
-https://github.com/openucx/ucx/releases/download/v1.11.2/ucx-v1.11.2-centos7-mofed5.x-cuda11.2.tar.bz2
+https://github.com/openucx/ucx/releases/download/v1.12.1/ucx-v1.12.1-centos7-mofed5-cuda11.tar.bz2
 contains:
 
 ```
-ucx-devel-1.11.2-1.el7.x86_64.rpm
-ucx-debuginfo-1.11.2-1.el7.x86_64.rpm
-ucx-1.11.2-1.el7.x86_64.rpm
-ucx-cuda-1.11.2-1.el7.x86_64.rpm
-ucx-rdmacm-1.11.2-1.el7.x86_64.rpm
-ucx-cma-1.11.2-1.el7.x86_64.rpm
-ucx-ib-1.11.2-1.el7.x86_64.rpm
+ucx-devel-1.12.1-1.el7.x86_64.rpm
+ucx-debuginfo-1.12.1-1.el7.x86_64.rpm
+ucx-1.12.1-1.el7.x86_64.rpm
+ucx-cuda-1.12.1-1.el7.x86_64.rpm
+ucx-rdmacm-1.12.1-1.el7.x86_64.rpm
+ucx-cma-1.12.1-1.el7.x86_64.rpm
+ucx-ib-1.12.1-1.el7.x86_64.rpm
 ```
 
 For a setup without RoCE or Infiniband networking, the only packages required are:
 
 ```
-ucx-1.11.2-1.el7.x86_64.rpm
-ucx-cuda-1.11.2-1.el7.x86_64.rpm
+ucx-1.12.1-1.el7.x86_64.rpm
+ucx-cuda-1.12.1-1.el7.x86_64.rpm
 ```
 
 If accelerated networking is available, the package list is:
 
 ```
-ucx-1.11.2-1.el7.x86_64.rpm
-ucx-cuda-1.11.2-1.el7.x86_64.rpm
-ucx-rdmacm-1.11.2-1.el7.x86_64.rpm
-ucx-ib-1.11.2-1.el7.x86_64.rpm
+ucx-1.12.1-1.el7.x86_64.rpm
+ucx-cuda-1.12.1-1.el7.x86_64.rpm
+ucx-rdmacm-1.12.1-1.el7.x86_64.rpm
+ucx-ib-1.12.1-1.el7.x86_64.rpm
 ```
 
 ---
@@ -152,7 +145,7 @@ system if you have RDMA capable hardware.
 Within the Docker container we need to install UCX and its requirements. These are Dockerfile
 examples for Ubuntu 18.04:
 
-The following are examples of Docker containers with UCX 1.11.2 and cuda-11.2 support.
+The following are examples of Docker containers with UCX 1.12.1 and cuda-11.2 support.
 
 | OS Type | RDMA | Dockerfile |
 | ------- | ---- | ---------- |
@@ -296,7 +289,7 @@ In this section, we are using a docker container built using the sample dockerfi
     | Databricks 9.1 | com.nvidia.spark.rapids.spark312db.RapidsShuffleManager |
     | Databricks 10.4 | com.nvidia.spark.rapids.spark321db.RapidsShuffleManager |
 
-2. Settings for UCX 1.11.2+:
+2. Settings for UCX 1.12.1+:
 
     Minimum configuration:
 
@@ -345,9 +338,9 @@ guide for Databricks. The following are extra steps required to enable UCX.
     ```
     #!/bin/bash
     sudo apt install -y wget libnuma1 &&
-    wget https://github.com/openucx/ucx/releases/download/v1.11.2/ucx-v1.11.2-ubuntu18.04-mofed5.x-cuda11.2.deb &&
-    sudo dpkg -i ucx-v1.11.2-ubuntu18.04-mofed5.x-cuda11.2.deb &&
-    rm ucx-v1.11.2-ubuntu18.04-mofed5.x-cuda11.2.deb
+    wget https://github.com/openucx/ucx/releases/download/v1.12.1/ucx-v1.12.1-ubuntu18.04-mofed5-cuda11.deb &&
+    sudo dpkg -i ucx-v1.12.1-ubuntu18.04-mofed5-cuda11.deb &&
+    rm ucx-v1.12.1-ubuntu18.04-mofed5-cuda11.deb
     ```
 
     Save the script in DBFS and add it to the "Init Scripts" list:
diff --git a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_no_rdma b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_no_rdma
index dca75b835f3..80de7e1e6b4 100644
--- a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_no_rdma
+++ b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_no_rdma
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -22,15 +22,15 @@
 # See: https://github.com/openucx/ucx/releases/
 
 ARG CUDA_VER=11.2.2
-ARG UCX_VER=1.11.2
-ARG UCX_CUDA_VER=11.2
+ARG UCX_VER=1.12.1
+ARG UCX_CUDA_VER=11
 
 FROM nvidia/cuda:${CUDA_VER}-runtime-centos7
 ARG UCX_VER
 ARG UCX_CUDA_VER
 
 RUN yum update -y && yum install -y wget bzip2
-RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-centos7-mofed5.x-cuda$UCX_CUDA_VER.tar.bz2
+RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-centos7-mofed5-cuda$UCX_CUDA_VER.tar.bz2
 RUN cd /tmp && tar -xvf *.bz2 && \
     yum install -y ucx-$UCX_VER-1.el7.x86_64.rpm && \
     yum install -y ucx-cuda-$UCX_VER-1.el7.x86_64.rpm && \
diff --git a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_rdma b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_rdma
index e59d3ca5f68..572d7d850bf 100644
--- a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_rdma
+++ b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.centos_rdma
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -29,8 +29,8 @@
 
 ARG RDMA_CORE_VERSION=32.1
 ARG CUDA_VER=11.2.2
-ARG UCX_VER=1.11.2
-ARG UCX_CUDA_VER=11.2
+ARG UCX_VER=1.12.1
+ARG UCX_CUDA_VER=11
 
 # Throw away image to build rdma_core
 FROM centos:7 as rdma_core
@@ -59,7 +59,7 @@ COPY --from=rdma_core /tmp/*.rpm /tmp/
 
 RUN yum update -y
 RUN yum install -y wget bzip2
-RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-centos7-mofed5.x-cuda$UCX_CUDA_VER.tar.bz2
+RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-centos7-mofed5-cuda$UCX_CUDA_VER.tar.bz2
 RUN cd /tmp && \
     yum install -y *.rpm && \
     tar -xvf *.bz2 && \
diff --git a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_no_rdma b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_no_rdma
index 8270f295f00..038d033a36b 100644
--- a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_no_rdma
+++ b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_no_rdma
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -22,8 +22,8 @@
 # See: https://github.com/openucx/ucx/releases/
 
 ARG CUDA_VER=11.2.2
-ARG UCX_VER=1.11.2
-ARG UCX_CUDA_VER=11.2
+ARG UCX_VER=1.12.1
+ARG UCX_CUDA_VER=11
 
 FROM nvidia/cuda:${CUDA_VER}-runtime-ubuntu18.04
 ARG UCX_VER
@@ -31,5 +31,5 @@ ARG UCX_CUDA_VER
 
 RUN apt update
 RUN apt-get install -y wget
-RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-ubuntu18.04-mofed5.x-cuda$UCX_CUDA_VER.deb
+RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-ubuntu18.04-mofed5-cuda$UCX_CUDA_VER.deb
 RUN apt install -y /tmp/*.deb && rm -rf /tmp/*.deb
diff --git a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_rdma b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_rdma
index b498239974e..fea627e30df 100644
--- a/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_rdma
+++ b/docs/additional-functionality/shuffle-docker-examples/Dockerfile.ubuntu_rdma
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2021-2022, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -29,8 +29,8 @@
 
 ARG RDMA_CORE_VERSION=32.1
 ARG CUDA_VER=11.2.2
-ARG UCX_VER=1.11.2
-ARG UCX_CUDA_VER=11.2
+ARG UCX_VER=1.12.1
+ARG UCX_CUDA_VER=11
 
 # Throw away image to build rdma_core
 FROM ubuntu:18.04 as rdma_core
@@ -50,5 +50,5 @@ COPY --from=rdma_core /*.deb /tmp/
 
 RUN apt update
 RUN apt-get install -y wget
-RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-ubuntu18.04-mofed5.x-cuda$UCX_CUDA_VER.deb
+RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/v$UCX_VER/ucx-v$UCX_VER-ubuntu18.04-mofed5-cuda$UCX_CUDA_VER.deb
 RUN apt install -y /tmp/*.deb && rm -rf /tmp/*.deb
diff --git a/jenkins/Dockerfile-blossom.ubuntu b/jenkins/Dockerfile-blossom.ubuntu
index 0d3d7fe62a5..f96e4dd61aa 100644
--- a/jenkins/Dockerfile-blossom.ubuntu
+++ b/jenkins/Dockerfile-blossom.ubuntu
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2020-2021, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2020-2022, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -21,15 +21,19 @@
 # Arguments:
 # CUDA_VER=11.0+
 # UBUNTU_VER=18.04 or 20.04
+# UCX_CUDA_VER=11 (major CUDA version)
+# UCX_VER=1.12.1
 ###
 
 ARG CUDA_VER=11.0
 ARG UBUNTU_VER=18.04
-ARG UCX_VER=1.11.2
+ARG UCX_VER=1.12.1
+ARG UCX_CUDA_VER=11
 FROM nvidia/cuda:${CUDA_VER}-runtime-ubuntu${UBUNTU_VER}
 ARG CUDA_VER
 ARG UBUNTU_VER
 ARG UCX_VER
+ARG UCX_CUDA_VER
 
 # Install jdk-8, jdk-11, maven, docker image
 RUN apt-get update -y && \
@@ -53,6 +57,6 @@ RUN apt install -y inetutils-ping expect wget libnuma1 libgomp1
 
 RUN mkdir -p /tmp/ucx && \
     cd /tmp/ucx && \
-    wget https://github.com/openucx/ucx/releases/download/v${UCX_VER}/ucx-v${UCX_VER}-ubuntu${UBUNTU_VER}-mofed5.x-cuda${CUDA_VER}.deb && \
+    wget https://github.com/openucx/ucx/releases/download/v${UCX_VER}/ucx-v${UCX_VER}-ubuntu${UBUNTU_VER}-mofed5-cuda${UCX_CUDA_VER}.deb && \
     dpkg -i *.deb && \
     rm -rf /tmp/ucx
diff --git a/shuffle-plugin/pom.xml b/shuffle-plugin/pom.xml
index db903e6e21e..918d1114341 100644
--- a/shuffle-plugin/pom.xml
+++ b/shuffle-plugin/pom.xml
@@ -44,7 +44,7 @@
             <groupId>org.openucx</groupId>
             <artifactId>jucx</artifactId>
-            <version>1.11</version>
+            <version>1.12.1</version>
             <scope>compile</scope>
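
Reviewer note: every URL touched in this diff follows the same renaming, from `mofed5.x-cuda<major.minor>` to `mofed5-cuda<major>`. The sketch below reproduces that pattern in Python so the old and new names can be compared side by side. It is only an illustration: the helper name `ucx_artifact_url` and its parameters are hypothetical, and the 1.12 cutoff is an assumption based on the two releases referenced in this diff, not a rule confirmed for other UCX versions.

```python
def ucx_artifact_url(ucx_ver: str, os_tag: str, cuda_ver: str, ext: str) -> str:
    """Build a UCX release download URL using the naming seen in this diff.

    Hypothetical helper: the cutoff at UCX 1.12 is inferred from the two
    releases in the diff (1.11.2 and 1.12.1) and may not hold elsewhere.
    """
    ucx_tuple = tuple(int(part) for part in ucx_ver.split("."))
    cuda_parts = cuda_ver.split(".")
    if ucx_tuple >= (1, 12):
        # Newer artifacts: "mofed5" and the CUDA major version only.
        mofed_tag = "mofed5"
        cuda_tag = cuda_parts[0]
    else:
        # Older artifacts: "mofed5.x" and CUDA major.minor.
        mofed_tag = "mofed5.x"
        cuda_tag = ".".join(cuda_parts[:2])
    name = f"ucx-v{ucx_ver}-{os_tag}-{mofed_tag}-cuda{cuda_tag}.{ext}"
    return f"https://github.com/openucx/ucx/releases/download/v{ucx_ver}/{name}"


if __name__ == "__main__":
    # Old and new Ubuntu 18.04 package URLs, as they appear in the diff.
    print(ucx_artifact_url("1.11.2", "ubuntu18.04", "11.2.2", "deb"))
    print(ucx_artifact_url("1.12.1", "ubuntu18.04", "11.2.2", "deb"))
```

This also shows why `jenkins/Dockerfile-blossom.ubuntu` needs the separate `UCX_CUDA_VER` argument: reusing `CUDA_VER=11.0` in the file name would yield `cuda11.0`, which does not match the `cuda11` suffix used by the 1.12.1 URLs in this diff.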