Update docs for the 22.04 release[skip ci] #4997

Merged Apr 7, 2022 (36 commits)

Commits
- `2e817ac` Update 2204 doc including add a download page section (viadea, Mar 21, 2022)
- `5a565fe` Update docs/FAQ.md (viadea, Mar 21, 2022)
- `4eba485` Update docs/FAQ.md (viadea, Mar 21, 2022)
- `50b8173` Update docs/FAQ.md (viadea, Mar 21, 2022)
- `5a5f0cf` Update docs/FAQ.md (viadea, Mar 21, 2022)
- `99df03d` reword on FAQ (viadea, Mar 21, 2022)
- `b9a0a6f` Change to CUDA 11.5 in FAQ guide (viadea, Mar 22, 2022)
- `247b7d0` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `4c19d8b` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `54ee0cb` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `5844c99` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `8fd9e2e` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `a9ff4db` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `b9f9d70` Update docs/FAQ.md (viadea, Mar 22, 2022)
- `5e39bea` Update docs/get-started/getting-started-databricks.md (viadea, Mar 22, 2022)
- `7957ba1` minor wording change in FAQ (viadea, Mar 22, 2022)
- `05bae05` Add some notes in GCP guide (viadea, Mar 22, 2022)
- `9acae90` Add avro reader (viadea, Mar 22, 2022)
- `10ae0bf` Add CDP /CDS versions in FAQ (viadea, Mar 23, 2022)
- `2c27550` resolve conflict (viadea, Mar 24, 2022)
- `6eb298f` add spark 3.3 (viadea, Mar 24, 2022)
- `750af1a` resolve conflict (viadea, Mar 24, 2022)
- `1d8c1c4` Merge branch 'branch-22.04' into 2204-doc (viadea, Mar 24, 2022)
- `49119c6` Add 3.3.0 (viadea, Mar 24, 2022)
- `3fdf64c` remove 330 (viadea, Mar 25, 2022)
- `882af8b` Add support email (viadea, Mar 25, 2022)
- `b0bab03` Update docs/download.md (viadea, Apr 1, 2022)
- `b38b4b3` Update docs/download.md (viadea, Apr 1, 2022)
- `dca6600` delete generate-init-script-cuda11.ipynb (viadea, Apr 1, 2022)
- `6c76439` reformatted init script (viadea, Apr 1, 2022)
- `25ea4d6` MIG FAQ update (viadea, Apr 1, 2022)
- `756b7e0` Modified CLDR/EMR section in download (viadea, Apr 2, 2022)
- `353b8f4` Update docs/get-started/getting-started-databricks.md (viadea, Apr 5, 2022)
- `b46562e` Update docs/get-started/getting-started-gcp.md (viadea, Apr 5, 2022)
- `3d551f9` Update docs/get-started/getting-started-databricks.md (viadea, Apr 5, 2022)
- `9eb8765` Update docs/FAQ.md (viadea, Apr 5, 2022)
34 changes: 28 additions & 6 deletions docs/FAQ.md
@@ -10,17 +10,17 @@ nav_order: 12

### What versions of Apache Spark does the RAPIDS Accelerator for Apache Spark support?

The RAPIDS Accelerator for Apache Spark requires version 3.0.1, 3.0.2, 3.0.3, 3.1.1, 3.1.2 or 3.2.0 of
Apache Spark. Because the plugin replaces parts of the physical plan that Apache Spark considers to
be internal, the code for those plans can change even between bug fix releases. As a part of our
The RAPIDS Accelerator for Apache Spark requires version 3.1.1, 3.1.2, 3.1.3, 3.2.0, 3.2.1 or 3.3.0
of Apache Spark. Because the plugin replaces parts of the physical plan that Apache Spark considers
to be internal, the code for those plans can change even between bug fix releases. As a part of our
process, we try to stay on top of these changes and release updates as quickly as possible.

### Which distributions are supported?

The RAPIDS Accelerator for Apache Spark officially supports:
- [Apache Spark](get-started/getting-started-on-prem.md)
- [AWS EMR 6.2+](get-started/getting-started-aws-emr.md)
- [Databricks Runtime 7.3, 9.1](get-started/getting-started-databricks.md)
- [Databricks Runtime 9.1, 10.4](get-started/getting-started-databricks.md)
- [Google Cloud Dataproc 2.0](get-started/getting-started-gcp.md)
- [Azure Synapse](get-started/getting-started-azure-synapse-analytics.md)

@@ -75,8 +75,9 @@ Turing or Ampere.

Currently a limited set of SQL and DataFrame operations are supported; please see the
[configs](configs.md) and [supported operations](supported_ops.md) for a more complete list of what
is supported. Some of structured streaming is likely to be accelerated, but it has not been an area
of focus right now. Other areas like MLLib, GraphX or RDDs are not accelerated.
is supported. Some MLlib functions, such as `PCA`, are supported.
Some structured streaming operations are likely to be accelerated, but this has not been an area
of focus right now. Other areas like GraphX or RDDs are not accelerated.

### Is the Spark `Dataset` API supported?

@@ -470,3 +471,24 @@ later) finishes before the slow task that triggered speculation. If the speculative task
finishes first then that's good, it is working as intended. If many tasks are speculating, but the
original task always finishes first then this is a pure loss, the speculation is adding load to
the Spark cluster with no benefit.

### Why is my query in GPU mode slower than CPU mode?

Below are some troubleshooting tips for GPU query performance issues:
* Make sure the query in GPU mode runs fully on the GPU (see the sketch after this list). Please
refer to
[Getting Started on Spark workload qualification](./get-started/getting-started-workload-qualification.md)
for more details. If there are CPU fallbacks, check whether they are known features that
can be enabled by turning on some Spark RAPIDS parameters. If the features you need do not exist
in the most recent release of Spark RAPIDS, please file a
[feature request](https://github.com/NVIDIA/spark-rapids/issues) with a minimal reproducible example.

* Tune the Spark and Spark RAPIDS parameters such as `spark.sql.shuffle.partitions`,
`spark.sql.files.maxPartitionBytes` and `spark.rapids.sql.concurrentGpuTasks` to get the best run;
a configuration sketch follows this list. Please refer to the [Tuning Guide](./tuning-guide.md)
for more details.

* Identify the most time-consuming part of the query. You can use the
[Profiling tool](./spark-profiling-tool.md) to process the Spark event log and get more insight
into the query performance. For example, if IO is the bottleneck, we suggest optimizing the
backend storage IO performance, because the best-suited queries are compute bound rather than
IO or network bound.
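
As an illustration of the first two tips, below is a minimal sketch of a `spark-shell` launch that
both reports CPU fallbacks and sets the tuning knobs named above. The config keys are real Spark
and Spark RAPIDS parameters, but the values are placeholders rather than recommendations; derive
the right values for your cluster from the Tuning Guide.

```bash
# Minimal sketch: report operators that cannot run on the GPU and set common tuning knobs.
# Assumes the RAPIDS Accelerator jars are already on the classpath.
# The config values are illustrative placeholders; tune them for your workload and hardware.
$SPARK_HOME/bin/spark-shell \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.explain=NOT_ON_GPU \
  --conf spark.sql.shuffle.partitions=200 \
  --conf spark.sql.files.maxPartitionBytes=512m \
  --conf spark.rapids.sql.concurrentGpuTasks=2
```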

2 changes: 1 addition & 1 deletion docs/additional-functionality/ml-integration.md
@@ -40,7 +40,7 @@ access to any of the memory that RMM is holding.
## Spark ML Algorithms Supported by RAPIDS Accelerator

The [spark-rapids-examples repository](https://github.com/NVIDIA/spark-rapids-examples) provides a
[working example](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.02/examples/Spark-cuML/pca)
[working example](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/Spark-cuML/pca)
of accelerating the `transform` API for
[Principal Component Analysis (PCA)](https://spark.apache.org/docs/latest/mllib-dimensionality-reduction#principal-component-analysis-pca).
The example leverages the [RAPIDS accelerated UDF interface](rapids-udfs.md) to provide a native
2 changes: 1 addition & 1 deletion docs/demo/Databricks/generate-init-script-cuda11.ipynb
@@ -1 +1 @@
{"cells":[{"cell_type":"code","source":["dbutils.fs.mkdirs(\"dbfs:/databricks/init_scripts/\")\n \ndbutils.fs.put(\"/databricks/init_scripts/init.sh\",\"\"\"\n#!/bin/bash\nsudo wget -O /databricks/jars/rapids-4-spark_2.12-22.02.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.02.0/rapids-4-spark_2.12-22.02.0.jar\nsudo wget -O /databricks/jars/cudf-22.02.0-cuda11.jar https://repo1.maven.org/maven2/ai/rapids/cudf/22.02.0/cudf-22.02.0-cuda11.jar\n\nsudo wget -O /etc/apt/preferences.d/cuda-repository-pin-600 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin\nsudo wget -O ~/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb\nsudo dpkg -i ~/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb\nsudo apt-key add /var/cuda-repo-ubuntu1804-11-0-local/7fa2af80.pub\nsudo apt-get update\nsudo apt -y install cuda-toolkit-11-0\"\"\", True)"],"metadata":{},"outputs":[],"execution_count":1},{"cell_type":"code","source":["%sh\ncd ../../dbfs/databricks/init_scripts\npwd\nls -ltr\ncat init.sh"],"metadata":{},"outputs":[],"execution_count":2},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":3}],"metadata":{"name":"generate-init-script","notebookId":2645746662301564},"nbformat":4,"nbformat_minor":0}
{"cells":[{"cell_type":"code","source":["dbutils.fs.mkdirs(\"dbfs:/databricks/init_scripts/\")\n \ndbutils.fs.put(\"/databricks/init_scripts/init.sh\",\"\"\"\n#!/bin/bash\nsudo wget -O /databricks/jars/rapids-4-spark_2.12-22.04.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.04.0/rapids-4-spark_2.12-22.04.0.jar\nsudo wget -O /databricks/jars/cudf-22.04.0-cuda11.jar https://repo1.maven.org/maven2/ai/rapids/cudf/22.04.0/cudf-22.04.0-cuda11.jar\n\nsudo wget -O /etc/apt/preferences.d/cuda-repository-pin-600 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin\nsudo wget -O ~/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb\nsudo dpkg -i ~/cuda-repo-ubuntu1804-11-0-local_11.0.3-450.51.06-1_amd64.deb\nsudo apt-key add /var/cuda-repo-ubuntu1804-11-0-local/7fa2af80.pub\nsudo apt-get update\nsudo apt -y install cuda-toolkit-11-0\"\"\", True)"],"metadata":{},"outputs":[],"execution_count":1},{"cell_type":"code","source":["%sh\ncd ../../dbfs/databricks/init_scripts\npwd\nls -ltr\ncat init.sh"],"metadata":{},"outputs":[],"execution_count":2},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":3}],"metadata":{"name":"generate-init-script","notebookId":2645746662301564},"nbformat":4,"nbformat_minor":0}
2 changes: 1 addition & 1 deletion docs/demo/Databricks/generate-init-script.ipynb
@@ -1 +1 @@
{"cells":[{"cell_type":"code","source":["dbutils.fs.mkdirs(\"dbfs:/databricks/init_scripts/\")\n \ndbutils.fs.put(\"/databricks/init_scripts/init.sh\",\"\"\"\n#!/bin/bash\nsudo wget -O /databricks/jars/rapids-4-spark_2.12-22.02.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.02.0/rapids-4-spark_2.12-22.02.0.jar\nsudo wget -O /databricks/jars/cudf-22.02.0-cuda11.jar https://repo1.maven.org/maven2/ai/rapids/cudf/22.02.0/cudf-22.02.0-cuda11.jar\"\"\", True)"],"metadata":{},"outputs":[],"execution_count":1},{"cell_type":"code","source":["%sh\ncd ../../dbfs/databricks/init_scripts\npwd\nls -ltr\ncat init.sh"],"metadata":{},"outputs":[],"execution_count":2},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":3}],"metadata":{"name":"generate-init-script","notebookId":2645746662301564},"nbformat":4,"nbformat_minor":0}
{"cells":[{"cell_type":"code","source":["dbutils.fs.mkdirs(\"dbfs:/databricks/init_scripts/\")\n \ndbutils.fs.put(\"/databricks/init_scripts/init.sh\",\"\"\"\n#!/bin/bash\nsudo wget -O /databricks/jars/rapids-4-spark_2.12-22.04.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.04.0/rapids-4-spark_2.12-22.04.0.jar\nsudo wget -O /databricks/jars/cudf-22.04.0-cuda11.jar https://repo1.maven.org/maven2/ai/rapids/cudf/22.04.0/cudf-22.04.0-cuda11.jar\"\"\", True)"],"metadata":{},"outputs":[],"execution_count":1},{"cell_type":"code","source":["%sh\ncd ../../dbfs/databricks/init_scripts\npwd\nls -ltr\ncat init.sh"],"metadata":{},"outputs":[],"execution_count":2},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":3}],"metadata":{"name":"generate-init-script","notebookId":2645746662301564},"nbformat":4,"nbformat_minor":0}
57 changes: 57 additions & 0 deletions docs/download.md
@@ -18,6 +18,63 @@ cuDF jar, that is either preinstalled in the Spark classpath on all nodes or submitted with each job
that uses the RAPIDS Accelerator For Apache Spark. See the [getting-started
guide](https://nvidia.github.io/spark-rapids/Getting-Started/) for more details.

## Release v22.04.0
Hardware Requirements:

The plugin is tested on the following architectures:

GPU Models: NVIDIA V100, T4 and A2/A10/A30/A100 GPUs

Software Requirements:

OS: Ubuntu 18.04, Ubuntu 20.04 or CentOS 7, CentOS 8

CUDA & NVIDIA Drivers*: 11.x & v450.80.02+

Apache Spark 3.1.1, 3.1.2, 3.1.3, 3.2.0, 3.2.1, 3.3.0, Cloudera CDP 7.1.6, 7.1.7, Databricks 9.1 ML LTS or 10.4 ML LTS Runtime and GCP Dataproc 2.0
Collaborator: What are we doing with CDP here? We don't have the code in our shim to actually work
with those. @sameerz Should we just point to their release info?

Collaborator (author): Should we completely remove CDP versions from the download page like we did
for EMR before? How about we add the CDP versions in the FAQ with the 2 links below?
https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/yarn-allocate-resources/topics/yarn-using-gpu-scheduling.html
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/yarn-allocate-resources/topics/yarn-using-gpu-scheduling.html

Collaborator (author): After some discussion, I added a section in the FAQ guide for CDP/CDS and
also removed it from the download page. Please take a look again to see if it looks good to you.


Python 3.6+, Scala 2.12, Java 8

*Some hardware may have a minimum driver version greater than v450.80.02. Check the GPU spec sheet
for your hardware's minimum driver version.

### Download v22.04.0
* Download the [RAPIDS
Accelerator for Apache Spark 22.04.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.04.0/rapids-4-spark_2.12-22.04.0.jar)
* Download the [RAPIDS cuDF 22.04.0 jar](https://repo1.maven.org/maven2/ai/rapids/cudf/22.04.0/cudf-22.04.0-cuda11.jar)

This package is built against CUDA 11.5 and has [CUDA forward
compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/index.html) enabled. It is tested
on V100, T4, A2, A10, A30 and A100 GPUs with CUDA 11.0-11.5. For those using other types of GPUs which
do not have CUDA forward compatibility (for example, GeForce), CUDA 11.5 is required. Users will
need to ensure the minimum driver (450.80.02) and CUDA toolkit are installed on each Spark node.

### Verify signature
* Download the [RAPIDS Accelerator for Apache Spark 22.04.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.04.0/rapids-4-spark_2.12-22.04.0.jar)
and [RAPIDS Accelerator for Apache Spark 22.04.0 jars.asc](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.04.0/rapids-4-spark_2.12-22.04.0.jar.asc)
* Download the [PUB_KEY](https://keys.openpgp.org/search?q=sw-spark@nvidia.com).
* Import the public key: `gpg --import PUB_KEY`
* Verify the signature: `gpg --verify rapids-4-spark_2.12-22.04.0.jar.asc rapids-4-spark_2.12-22.04.0.jar`

The expected output if the signature verifies:

gpg: Good signature from "NVIDIA Spark (For the signature of spark-rapids release jars) <sw-spark@nvidia.com>"
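
The steps above can also be run as a single sequence; `PUB_KEY` here stands for whatever filename
you saved the downloaded public key under:

```bash
# Import the NVIDIA Spark public key, then verify the jar against its detached signature.
gpg --import PUB_KEY
gpg --verify rapids-4-spark_2.12-22.04.0.jar.asc rapids-4-spark_2.12-22.04.0.jar
```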

### Release Notes
New functionality and performance improvements for this release include:
* ExistenceJoin support
* ArrayExists support
* GetArrayStructFields support
* Function str_to_map support
* Function percent_rank support
* Add regular expression support for function split on string
* Support function approx_percentile in reduction context
* Support function element_at with non-literal index
* Spark cuSpatial UDF
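
As an illustration of the `split` improvement above, a regular-expression pattern such as a digit
class can now be used as the delimiter. The query below is a hypothetical smoke test, not an
official example; any Spark SQL entry point with the plugin enabled works:

```bash
# Hypothetical check: run a split-with-regex query through the spark-sql CLI.
$SPARK_HOME/bin/spark-sql \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  -e "SELECT split('a1b2c3', '[0-9]')"
```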

For a detailed list of changes, please refer to the
[CHANGELOG](https://github.com/NVIDIA/spark-rapids/blob/main/CHANGELOG.md).

## Release v22.02.0
Hardware Requirements:

28 changes: 14 additions & 14 deletions docs/get-started/getting-started-databricks.md
@@ -11,9 +11,9 @@ At the end of this guide, the reader will be able to run a sample Apache Spark application
on NVIDIA GPUs on Databricks.

## Prerequisites
* Apache Spark 3.x running in Databricks Runtime 7.3 ML or 9.1 ML with GPU
* AWS: 7.3 LTS ML (GPU, Scala 2.12, Spark 3.0.1) or 9.1 LTS ML (GPU, Scala 2.12, Spark 3.1.2)
* Azure: 7.3 LTS ML (GPU, Scala 2.12, Spark 3.0.1) or 9.1 LTS ML (GPU, Scala 2.12, Spark 3.1.2)
* Apache Spark 3.x running in Databricks Runtime 9.1 ML or 10.4 ML with GPU
* AWS: 9.1 LTS ML (GPU, Scala 2.12, Spark 3.1.2) or 10.4 LTS ML (GPU, Scala 2.12, Spark 3.2.1)
* Azure: 9.1 LTS ML (GPU, Scala 2.12, Spark 3.1.2) or 10.4 LTS ML (GPU, Scala 2.12, Spark 3.2.1)

Databricks may do [maintenance
releases](https://docs.databricks.com/release-notes/runtime/maintenance-updates.html) for their
@@ -77,19 +77,19 @@ cluster.
your workspace. See [Managing
Notebooks](https://docs.databricks.com/notebooks/notebooks-manage.html#id2) for instructions on
how to import a notebook.
Select the initialization script based on the Databricks runtime
Select the Spark RAPIDS accelerator version based on the Databricks runtime
version:
- [Databricks 7.3 LTS
ML](https://docs.databricks.com/release-notes/runtime/7.3ml.html#system-environment) runs CUDA 10.1
Update 2. Users wishing to try 21.06.1 or later on Databricks 7.3 LTS ML will need to install the
CUDA 11.0 toolkit on the cluster. This can be done with the [generate-init-script-cuda11.ipynb
](../demo/Databricks/generate-init-script-cuda11.ipynb) init script, which installs both the RAPIDS
Spark plugin and the CUDA 11 toolkit.
- [Databricks 9.1 LTS
ML](https://docs.databricks.com/release-notes/runtime/9.1ml.html#system-environment) has CUDA 11
installed. Users will need to use 21.12.0 or later on Databricks 9.1 LTS ML. In this case use
[generate-init-script.ipynb](../demo/Databricks/generate-init-script.ipynb) which will install
the RAPIDS Spark plugin.
installed. Users will need to use 21.12.0 or later on Databricks 9.1 LTS ML.
- [Databricks 10.4 LTS
ML](https://docs.databricks.com/release-notes/runtime/10.4ml.html#system-environment) has CUDA 11
installed. Users will need to use 22.04.0 or later on Databricks 10.4 LTS ML.

In both cases use
[generate-init-script.ipynb](../demo/Databricks/generate-init-script.ipynb) which will install
the RAPIDS Spark plugin.

2. Once you are in the notebook, click the “Run All” button.
3. Ensure that the newly created init.sh script is present in the output from cell 2 and that the
contents of the script are correct.
@@ -143,7 +143,7 @@ Spark plugin and the CUDA 11 toolkit.
```bash
spark.rapids.sql.python.gpu.enabled true
spark.python.daemon.module rapids.daemon_databricks
spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-22.02.0.jar:/databricks/spark/python
spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-22.04.0.jar:/databricks/spark/python
```

7. Once you’ve added the Spark config, click “Confirm and Restart”.
6 changes: 3 additions & 3 deletions docs/get-started/getting-started-on-prem.md
@@ -301,7 +301,7 @@ $SPARK_HOME/bin/spark-shell \

### MIG GPU on YARN
Using [MIG](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#introduction)
enabled GPUs on YARN requires enabling YARN GPU scheduling, cgroups, and using
enabled GPUs on YARN requires enabling YARN GPU scheduling and using
NVIDIA Docker runtime v2. The way to set this up depends on the version of YARN and the version
of Spark. It is important to note that CUDA 11 only supports enumeration of a single MIG instance.
This means that using any MIG device on YARN means only 1 GPU per container is allowed. See the
@@ -311,7 +311,7 @@ are using.
#### YARN version 3.3.0+
YARN version 3.3.0 and newer support a pluggable device framework which allows adding support for
MIG devices via a plugin. See
[NVIDIA GPU Plugin for YARN with MIG support for YARN 3.3.0+](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.02/examples/MIG-Support/device-plugins/gpu-mig).
[NVIDIA GPU Plugin for YARN with MIG support for YARN 3.3.0+](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/MIG-Support/device-plugins/gpu-mig).
If you are using that plugin with a Spark version older than 3.2.1 and/or specifying the resource
as `nvidia/miggpu` you will also need to specify the config:

Expand All @@ -328,7 +328,7 @@ required.
If you are using a YARN version from 3.1.2 up until 3.3.0, it requires making modifications to YARN
and deploying a version that adds support for MIG to the built-in YARN GPU resource plugin.

See [NVIDIA Support for GPU for YARN with MIG support for YARN 3.1.2 until YARN 3.3.0](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.02/examples/MIG-Support/resource-types/gpu-mig)
See [NVIDIA Support for GPU for YARN with MIG support for YARN 3.1.2 until YARN 3.3.0](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/MIG-Support/resource-types/gpu-mig)
for details.

## Running on Kubernetes
2 changes: 1 addition & 1 deletion docs/spark-profiling-tool.md
@@ -31,7 +31,7 @@ more information.
The Profiling tool requires the Spark 3.x jars in order to run but does not need an Apache Spark runtime.
If you do not already have Spark 3.x installed,
you can download the Spark distribution to any machine and include the jars in the classpath.
- Download the jar file from [Maven repository](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.02.0/)
- Download the jar file from [Maven repository](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.04.0/)
- [Download Apache Spark 3.x](http://spark.apache.org/downloads.html) - Spark 3.1.1 for Apache Hadoop is recommended
If you want to compile the jars, please refer to the instructions [here](./spark-qualification-tool.md#How-to-compile-the-tools-jar).
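
A minimal sketch of invoking the tool, assuming the tools jar sits in the current directory,
`$SPARK_HOME` points at a Spark 3.x distribution, and `/path/to/eventlog` is a placeholder for
your event log; the main class name follows the tool's published usage:

```bash
# Run the Profiling tool against a Spark event log, with the Spark 3.x jars on the classpath.
java -cp rapids-4-spark-tools_2.12-22.04.0.jar:$SPARK_HOME/jars/* \
  com.nvidia.spark.rapids.tool.profiling.ProfileMain /path/to/eventlog
```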

4 changes: 2 additions & 2 deletions docs/spark-qualification-tool.md
@@ -41,7 +41,7 @@ more information.
The Qualification tools require the Spark 3.x jars in order to run but do not need an Apache Spark runtime.
If you do not already have Spark 3.x installed, you can download the Spark distribution to
any machine and include the jars in the classpath.
- Download the jar file from [Maven repository](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.02.0/)
- Download the jar file from [Maven repository](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.04.0/)
- [Download Apache Spark 3.x](http://spark.apache.org/downloads.html) - Spark 3.1.1 for Apache Hadoop is recommended
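
A minimal sketch of invoking the Qualification tool, under the same assumptions as the Profiling
tool example above (placeholder paths, main class per the tool's published usage):

```bash
# Run the Qualification tool against a directory of Spark event logs.
java -cp rapids-4-spark-tools_2.12-22.04.0.jar:$SPARK_HOME/jars/* \
  com.nvidia.spark.rapids.tool.qualification.QualificationMain /path/to/eventlog_dir
```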

### Step 2 Run the Qualification tool
@@ -236,7 +236,7 @@ below for the description of output fields.
- Java 8 or above, Spark 3.0.1+

### Download the tools jar
- Download the jar file from [Maven repository](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.02.0/)
- Download the jar file from [Maven repository](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/22.04.0/)

### Modify your application code to call the APIs
