update 21.08 gh-pages doc (#3233)
* Update docs for 21.08 release (#3080)

* Update docs for 21.08 release

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Modify "or higher" -> "or later" when referring to versions

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Update docs/download.md

Co-authored-by: Hao Zhu <9665750+viadea@users.noreply.github.com>

* Move AQE for developers doc into development section of web site

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Add 21.06.1 release information to the download page

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Add information about kryoserialization buffer overflow

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Remove SNAPSHOT from getting-started-on-prem.md

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Update README.md for integration tests to remove SNAPSHOT

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Fix table in supported ops doc issue #2779

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

* Fix typo (Abillity -> Ability)

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>

Co-authored-by: Hao Zhu <9665750+viadea@users.noreply.github.com>
Signed-off-by: yuali <yuali@nvidia.com>

* update config.md and supported_ops.md

Signed-off-by: yuali <yuali@nvidia.com>

* update rapids-shuffle.md and download.md

Signed-off-by: yuali <yuali@nvidia.com>

* update dev/README.md and formatted

Signed-off-by: yuali <yuali@nvidia.com>

* fix typos

Signed-off-by: yuali <yuali@nvidia.com>

Co-authored-by: Hao Zhu <9665750+viadea@users.noreply.github.com>
nvliyuan and viadea authored Aug 18, 2021
1 parent b9b7e89 commit 994fc93
Showing 24 changed files with 2,235 additions and 733 deletions.
30 changes: 29 additions & 1 deletion docs/FAQ.md
@@ -64,7 +64,8 @@ Spark driver and executor logs with messages that are similar to the following:

### What is the right hardware setup to run GPU accelerated Spark?

Reference architectures should be available around Q1 2021.
GPU accelerated Spark can run on any NVIDIA Pascal or better GPU architecture, including Volta,
Turing or Ampere.

### What parts of Apache Spark are accelerated?

@@ -360,6 +361,14 @@ different because of compression. Users can turn on
the GPU try to replicate more closely what the output ordering would have been if sort were used,
like on the CPU.

### Why am I getting the error `Failed to open the timezone file` when reading files?

When reading from a file that contains data referring to a particular timezone, e.g.: reading
timestamps from an ORC file, the system's timezone database at `/usr/share/zoneinfo/` must contain
the timezone in order to process the data properly. This error often indicates the system is
missing the timezone database. The timezone database is provided by the `tzdata` package on many
Linux distributions.
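
As a quick check, a sketch like the following can confirm whether the database is installed (the `/usr/share/zoneinfo/UTC` path and the package names are assumptions typical of Debian/Ubuntu and RHEL-family systems):

```shell
# Check for the timezone database at the path mentioned above.
# /usr/share/zoneinfo/UTC is assumed to exist whenever tzdata is installed.
if [ -e /usr/share/zoneinfo/UTC ]; then
  tzdata_status="present"
else
  tzdata_status="missing"
fi
echo "timezone database: ${tzdata_status}"
# If missing, install the tzdata package, e.g.:
#   apt-get install -y tzdata   # Debian/Ubuntu
#   yum install -y tzdata       # RHEL/CentOS
```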

### Why am I getting an error when trying to use pinned memory?

```
@@ -377,6 +386,25 @@ for this issue.
To fix it you can either disable the IOMMU, or you can disable using pinned memory by setting
[spark.rapids.memory.pinnedPool.size](configs.md#memory.pinnedPool.size) to 0.
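
As a sketch, disabling pinned memory at submission time might look like the following (the pinned-pool option comes from configs.md; the submission form and any other flags are illustrative):

```shell
# Sketch: disable the pinned memory pool to work around the IOMMU issue.
spark-shell \
  --conf spark.rapids.memory.pinnedPool.size=0 \
  ...
```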

### Why am I getting a buffer overflow error when using the KryoSerializer?
Buffer overflow will happen when trying to serialize an object larger than
[`spark.kryoserializer.buffer.max`](https://spark.apache.org/docs/latest/configuration.html#compression-and-serialization),
and may result in an error such as:
```
Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 636
at com.esotericsoftware.kryo.io.Output.require(Output.java:167)
at com.esotericsoftware.kryo.io.Output.writeBytes(Output.java:251)
at com.esotericsoftware.kryo.io.Output.write(Output.java:219)
at java.base/java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1859)
at java.base/java.io.ObjectOutputStream.write(ObjectOutputStream.java:712)
at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123)
at java.base/java.io.DataOutputStream.write(DataOutputStream.java:107)
...
```
Try increasing the
[`spark.kryoserializer.buffer.max`](https://spark.apache.org/docs/latest/configuration.html#compression-and-serialization)
from a default of 64M to something larger, for example 512M.
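
As a sketch, a submission raising the limit might look like this (the buffer value is illustrative; size it to the largest object being serialized):

```shell
# Sketch: raise the Kryo serializer buffer ceiling from the 64m default.
spark-shell \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer.max=512m \
  ...
```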

### Is speculative execution supported?

Yes, speculative execution in Spark is fine with the RAPIDS Accelerator plugin.
13 changes: 2 additions & 11 deletions docs/additional-functionality/cache-serializer.md
@@ -35,21 +35,12 @@ nav_order: 2
should not be used. Using the serializer with negative decimal scales will generate
an error at runtime.

Make sure to use the right package corresponding to the spark version you are using. To use
this serializer with Spark 3.1.1 please run Spark with the following conf.
To use this serializer please run Spark with the following conf.
```
spark-shell --conf spark.sql.cache.serializer=com.nvidia.spark.rapids.shims.spark311.ParquetCachedBatchSerializer
```
See the below table for all the names of the serializers corresponding to the Spark
versions


| Spark version | Serializer name |
| ------ | -----|
| 3.1.1 | com.nvidia.spark.rapids.shims.spark311.ParquetCachedBatchSerializer |
| 3.1.2 | com.nvidia.spark.rapids.shims.spark312.ParquetCachedBatchSerializer |
| 3.1.3 | com.nvidia.spark.rapids.shims.spark313.ParquetCachedBatchSerializer |
| 3.2.0 | com.nvidia.spark.rapids.shims.spark320.ParquetCachedBatchSerializer |

## Supported Types

All types are supported on the CPU, on the GPU, ArrayType, MapType and BinaryType are not
715 changes: 715 additions & 0 deletions docs/additional-functionality/qualification-profiling-tools.md

Large diffs are not rendered by default.

106 changes: 61 additions & 45 deletions docs/additional-functionality/rapids-shuffle.md
@@ -36,7 +36,7 @@ be installed on the host and inside Docker containers (if not baremetal). A host
requirements, like the MLNX_OFED driver and `nv_peer_mem` kernel module.

The minimum UCX requirement for the RAPIDS Shuffle Manager is
[UCX 1.10.1](https://github.com/openucx/ucx/releases/tag/v1.10.1)
[UCX 1.11.0](https://github.com/openucx/ucx/releases/tag/v1.11.0).

#### Baremetal

@@ -65,52 +65,48 @@ The minimum UCX requirement for the RAPIDS Shuffle Manager is
[file a GitHub issue](https://github.com/NVIDIA/spark-rapids/issues) so we can investigate
further.

2. Fetch and install the UCX package for your OS and CUDA version
[UCX 1.10.1](https://github.com/openucx/ucx/releases/tag/v1.10.1).

RDMA packages have extra requirements that should be satisfied by MLNX_OFED.

---
**NOTE:**
2. Fetch and install the UCX package for your OS from:
[UCX 1.11.0](https://github.com/openucx/ucx/releases/tag/v1.11.0).

NOTE: Please install the artifact with the newest CUDA 11.x version (for UCX 1.11.0 please
pick CUDA 11.2) as CUDA 11 introduced [CUDA Enhanced Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#enhanced-compat-minor-releases).
Starting with UCX 1.12, UCX will stop publishing individual artifacts for each minor version of CUDA.

Please note that the RAPIDS Shuffle Manager is built against
[JUCX 1.11.0](https://search.maven.org/artifact/org.openucx/jucx/1.11.0/jar). This is the JNI
component of UCX and was published ahead of the native library (UCX 1.11.0). Please disregard the
startup [compatibility warning](https://github.com/openucx/ucx/issues/6694),
as the JUCX usage within the RAPIDS Shuffle Manager is compatible with UCX 1.10.x.
Please refer to our [FAQ](../FAQ.md#what-hardware-is-supported) for caveats with
CUDA Enhanced Compatibility.

---
RDMA packages have extra requirements that should be satisfied by MLNX_OFED.

##### CentOS UCX RPM
The UCX packages for CentOS 7 and 8 are divided into different RPMs. For example, UCX 1.10.1
The UCX packages for CentOS 7 and 8 are divided into different RPMs. For example, UCX 1.11.0
available at
https://github.com/openucx/ucx/releases/download/v1.10.1/ucx-v1.10.1-centos7-mofed5.x-cuda11.0.tar.bz2
https://github.com/openucx/ucx/releases/download/v1.11.0/ucx-v1.11.0-centos7-mofed5.x-cuda11.2.tar.bz2
contains:

```
ucx-devel-1.10.1-1.el7.x86_64.rpm
ucx-debuginfo-1.10.1-1.el7.x86_64.rpm
ucx-1.10.1-1.el7.x86_64.rpm
ucx-cuda-1.10.1-1.el7.x86_64.rpm
ucx-rdmacm-1.10.1-1.el7.x86_64.rpm
ucx-cma-1.10.1-1.el7.x86_64.rpm
ucx-ib-1.10.1-1.el7.x86_64.rpm
ucx-devel-1.11.0-1.el7.x86_64.rpm
ucx-debuginfo-1.11.0-1.el7.x86_64.rpm
ucx-1.11.0-1.el7.x86_64.rpm
ucx-cuda-1.11.0-1.el7.x86_64.rpm
ucx-rdmacm-1.11.0-1.el7.x86_64.rpm
ucx-cma-1.11.0-1.el7.x86_64.rpm
ucx-ib-1.11.0-1.el7.x86_64.rpm
```

For a setup without RoCE or Infiniband networking, the only packages required are:

```
ucx-1.10.1-1.el7.x86_64.rpm
ucx-cuda-1.10.1-1.el7.x86_64.rpm
ucx-1.11.0-1.el7.x86_64.rpm
ucx-cuda-1.11.0-1.el7.x86_64.rpm
```

If accelerated networking is available, the package list is:

```
ucx-1.10.1-1.el7.x86_64.rpm
ucx-cuda-1.10.1-1.el7.x86_64.rpm
ucx-rdmacm-1.10.1-1.el7.x86_64.rpm
ucx-ib-1.10.1-1.el7.x86_64.rpm
ucx-1.11.0-1.el7.x86_64.rpm
ucx-cuda-1.11.0-1.el7.x86_64.rpm
ucx-rdmacm-1.11.0-1.el7.x86_64.rpm
ucx-ib-1.11.0-1.el7.x86_64.rpm
```

---
@@ -149,7 +145,7 @@ system if you have RDMA capable hardware.
Within the Docker container we need to install UCX and its requirements. These are Dockerfile
examples for Ubuntu 18.04:

The following are examples of Docker containers with UCX 1.10.1 and cuda-11.0 support.
The following are examples of Docker containers with UCX 1.11.0 and cuda-11.2 support.

| OS Type | RDMA | Dockerfile |
| ------- | ---- | ---------- |
@@ -288,24 +284,44 @@ In this section, we are using a docker container built using the sample dockerfi
| 3.0.1 EMR | com.nvidia.spark.rapids.spark301emr.RapidsShuffleManager |
| 3.0.2 | com.nvidia.spark.rapids.spark302.RapidsShuffleManager |
| 3.0.3 | com.nvidia.spark.rapids.spark303.RapidsShuffleManager |
| 3.0.4 | com.nvidia.spark.rapids.spark304.RapidsShuffleManager |
| 3.1.1 | com.nvidia.spark.rapids.spark311.RapidsShuffleManager |
| 3.1.1 CDH | com.nvidia.spark.rapids.spark311cdh.RapidsShuffleManager |
| 3.1.2 | com.nvidia.spark.rapids.spark312.RapidsShuffleManager |
| 3.1.3 | com.nvidia.spark.rapids.spark313.RapidsShuffleManager |
| 3.2.0 | com.nvidia.spark.rapids.spark320.RapidsShuffleManager |
2. Recommended settings for UCX 1.10.1+
```shell
...
--conf spark.shuffle.manager=com.nvidia.spark.rapids.spark301.RapidsShuffleManager \
--conf spark.shuffle.service.enabled=false \
--conf spark.executorEnv.UCX_TLS=cuda_copy,cuda_ipc,rc,tcp \
--conf spark.executorEnv.UCX_ERROR_SIGNALS= \
--conf spark.executorEnv.UCX_RNDV_SCHEME=put_zcopy \
--conf spark.executorEnv.UCX_MAX_RNDV_RAILS=1 \
--conf spark.executorEnv.UCX_MEMTYPE_CACHE=n \
--conf spark.executorEnv.UCX_IB_RX_QUEUE_LEN=1024 \
--conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR}
```
2. Settings for UCX 1.11.0+:
Minimum configuration:
```shell
...
--conf spark.shuffle.manager=com.nvidia.spark.rapids.[shim package].RapidsShuffleManager \
--conf spark.shuffle.service.enabled=false \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.executorEnv.UCX_ERROR_SIGNALS= \
--conf spark.executorEnv.UCX_MEMTYPE_CACHE=n
```
Recommended configuration:
```shell
...
--conf spark.shuffle.manager=com.nvidia.spark.rapids.[shim package].RapidsShuffleManager \
--conf spark.shuffle.service.enabled=false \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR} \
--conf spark.executorEnv.UCX_ERROR_SIGNALS= \
--conf spark.executorEnv.UCX_MEMTYPE_CACHE=n \
--conf spark.executorEnv.UCX_IB_RX_QUEUE_LEN=1024 \
--conf spark.executorEnv.UCX_TLS=cuda_copy,cuda_ipc,rc,tcp \
--conf spark.executorEnv.UCX_RNDV_SCHEME=put_zcopy \
--conf spark.executorEnv.UCX_MAX_RNDV_RAILS=1
```
Please replace `[shim package]` with the appropriate value. For example, the full class name for
Apache Spark 3.1.3 is: `com.nvidia.spark.rapids.spark313.RapidsShuffleManager`.
Please note `LD_LIBRARY_PATH` may need to be set if the UCX library is installed in a
non-standard location.
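
As a sketch, pointing executors at such an install might look like the following (the `/opt/ucx` prefix is an assumed example, not a required path):

```shell
# Sketch: expose a UCX library installed under an assumed non-standard prefix.
--conf spark.executorEnv.LD_LIBRARY_PATH=/opt/ucx/lib
```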
20 changes: 10 additions & 10 deletions docs/additional-functionality/rapids-udfs.md
@@ -139,38 +139,38 @@ in the [udf-examples](../../udf-examples) project.

### Spark Scala UDF Examples

- [URLDecode](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/scala/com/nvidia/spark/rapids/udf/scala/URLDecode.scala)
- [URLDecode](../../udf-examples/src/main/scala/com/nvidia/spark/rapids/udf/scala/URLDecode.scala)
decodes URL-encoded strings using the
[Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
- [URLEncode](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/scala/com/nvidia/spark/rapids/udf/scala/URLEncode.scala)
- [URLEncode](../../udf-examples/src/main/scala/com/nvidia/spark/rapids/udf/scala/URLEncode.scala)
URL-encodes strings using the
[Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)

### Spark Java UDF Examples

- [URLDecode](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/java/com/nvidia/spark/rapids/udf/java/URLDecode.java)
- [URLDecode](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/java/URLDecode.java)
decodes URL-encoded strings using the
[Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
- [URLEncode](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/java/com/nvidia/spark/rapids/udf/java/URLEncode.java)
- [URLEncode](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/java/URLEncode.java)
URL-encodes strings using the
[Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
- [CosineSimilarity](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/java/com/nvidia/spark/rapids/udf/java/CosineSimilarity.java)
- [CosineSimilarity](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/java/CosineSimilarity.java)
computes the [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
between two float vectors using [native code](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/cpp/src)
between two float vectors using [native code](../../udf-examples/src/main/cpp/src)

### Hive UDF Examples

- [URLDecode](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/URLDecode.java)
- [URLDecode](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/URLDecode.java)
implements a Hive simple UDF using the
[Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
to decode URL-encoded strings
- [URLEncode](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/URLEncode.java)
- [URLEncode](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/URLEncode.java)
implements a Hive generic UDF using the
[Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable)
to URL-encode strings
- [StringWordCount](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/StringWordCount.java)
- [StringWordCount](../../udf-examples/src/main/java/com/nvidia/spark/rapids/udf/hive/StringWordCount.java)
implements a Hive simple UDF using
[native code](https://github.com/NVIDIA/spark-rapids/tree/main/udf-examples/src/main/cpp/src) to count words in strings
[native code](../../udf-examples/src/main/cpp/src) to count words in strings


## GPU Support for Pandas UDF
@@ -16,14 +16,14 @@
# Sample Dockerfile to install UCX in a CentosOS 7 image
#
# The parameters are:
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matchin a specific UCX version and
# - CUDA_VER: 11.2.2 to pick up the latest 11.2 CUDA base layer
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and
# CUDA runtime from the UCX github repo.
# See: https://github.com/openucx/ucx/releases/

ARG CUDA_VER=11.0.3
ARG UCX_VER=v1.10.1
ARG UCX_CUDA_VER=11.0
ARG CUDA_VER=11.2.2
ARG UCX_VER=v1.11.0
ARG UCX_CUDA_VER=11.2

FROM nvidia/cuda:${CUDA_VER}-runtime-centos7
ARG UCX_VER
@@ -32,6 +32,6 @@ ARG UCX_CUDA_VER
RUN yum update -y && yum install -y wget bzip2
RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/$UCX_VER/ucx-$UCX_VER-centos7-mofed5.x-cuda$UCX_CUDA_VER.tar.bz2
RUN cd /tmp && tar -xvf *.bz2 && \
yum install -y ucx-1.10.1-1.el7.x86_64.rpm && \
yum install -y ucx-cuda-1.10.1-1.el7.x86_64.rpm && \
yum install -y ucx-1.11.0-1.el7.x86_64.rpm && \
yum install -y ucx-cuda-1.11.0-1.el7.x86_64.rpm && \
rm -rf /tmp/*.rpm
@@ -19,18 +19,18 @@
# The parameters are:
# - RDMA_CORE_VERSION: Set to 32.1 to match the rdma-core line in the latest
# released MLNX_OFED 5.x driver
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matchin a specific UCX version and
# - CUDA_VER: 11.2.2 to pick up the latest 11.2 CUDA base layer
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and
# CUDA runtime from the UCX github repo.
# See: https://github.com/openucx/ucx/releases/
#
# The Dockerfile first fetches and builds `rdma-core` to satisfy requirements for
# the ucx-ib and ucx-rdma RPMs.

ARG RDMA_CORE_VERSION=32.1
ARG CUDA_VER=11.0.3
ARG UCX_VER=v1.10.1
ARG UCX_CUDA_VER=11.0
ARG CUDA_VER=11.2.2
ARG UCX_VER=v1.11.0
ARG UCX_CUDA_VER=11.2

# Throw away image to build rdma_core
FROM centos:7 as rdma_core
@@ -63,8 +63,8 @@ RUN cd /tmp && wget https://github.com/openucx/ucx/releases/download/$UCX_VER/uc
RUN cd /tmp && \
yum install -y *.rpm && \
tar -xvf *.bz2 && \
yum install -y ucx-1.10.1-1.el7.x86_64.rpm && \
yum install -y ucx-cuda-1.10.1-1.el7.x86_64.rpm && \
yum install -y ucx-ib-1.10.1-1.el7.x86_64.rpm && \
yum install -y ucx-rdmacm-1.10.1-1.el7.x86_64.rpm
yum install -y ucx-1.11.0-1.el7.x86_64.rpm && \
yum install -y ucx-cuda-1.11.0-1.el7.x86_64.rpm && \
yum install -y ucx-ib-1.11.0-1.el7.x86_64.rpm && \
yum install -y ucx-rdmacm-1.11.0-1.el7.x86_64.rpm
RUN rm -rf /tmp/*.rpm && rm /tmp/*.bz2
@@ -16,14 +16,14 @@
# Sample Dockerfile to install UCX in a Ubuntu 18.04 image
#
# The parameters are:
# - CUDA_VER: 11.0.3 to pick up the latest 11.x CUDA base layer
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matchin a specific UCX version and
# - CUDA_VER: 11.2.2 to pick up the latest 11.2 CUDA base layer
# - UCX_VER and UCX_CUDA_VER: these are used to pick a package matching a specific UCX version and
# CUDA runtime from the UCX github repo.
# See: https://github.com/openucx/ucx/releases/

ARG CUDA_VER=11.0
ARG UCX_VER=v1.10.1
ARG UCX_CUDA_VER=11.0
ARG CUDA_VER=11.2.2
ARG UCX_VER=v1.11.0
ARG UCX_CUDA_VER=11.2

FROM nvidia/cuda:${CUDA_VER}-runtime-ubuntu18.04
ARG UCX_VER
