Correct 21.10 docs such as PCBS related FAQ [skip ci] (#3815)
* Correct some doc for 21.10

  Signed-off-by: Hao Zhu <hazhu@nvidia.com>

* Add 21.10 release notes

  Signed-off-by: Hao Zhu <hazhu@nvidia.com>

* Add more release notes for 21.10

  Signed-off-by: Hao Zhu <hazhu@nvidia.com>

* Update docs/download.md

  Co-authored-by: Sameer Raheja <sameerz@users.noreply.github.com>

* Update docs/download.md

  Co-authored-by: Nghia Truong <ttnghia@users.noreply.github.com>

Co-authored-by: Sameer Raheja <sameerz@users.noreply.github.com>
Co-authored-by: Nghia Truong <ttnghia@users.noreply.github.com>
3 people authored Oct 15, 2021
1 parent 145a72c commit 1203e6a
Showing 3 changed files with 70 additions and 13 deletions.
12 changes: 7 additions & 5 deletions docs/FAQ.md
@@ -10,7 +10,7 @@ nav_order: 11

### What versions of Apache Spark does the RAPIDS Accelerator for Apache Spark support?

The RAPIDS Accelerator for Apache Spark requires version 3.0.1, 3.0.2, 3.0.3, 3.1.1, or 3.1.2 of
The RAPIDS Accelerator for Apache Spark requires version 3.0.1, 3.0.2, 3.0.3, 3.1.1, 3.1.2, or 3.2.0 of
Apache Spark. Because the plugin replaces parts of the physical plan that Apache Spark considers to
be internal, the code for those plans can change even between bug-fix releases. As part of our
process, we try to stay on top of these changes and release updates as quickly as possible.
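
For reference, a hedged launch sketch against one of these supported versions (the jar names match the 21.10.0 artifacts in the download section, and `com.nvidia.spark.SQLPlugin` is the plugin's entry point; adjust paths and versions to your deployment):

```
spark-shell --jars rapids-4-spark_2.12-21.10.0.jar,cudf-21.10.0-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin
```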
@@ -287,13 +287,15 @@ AdaptiveSparkPlan isFinalPlan=false

### Are cache and persist supported?

Yes cache and persist are supported, but they are not GPU accelerated yet. We are working with
the Spark community on changes that would allow us to accelerate compression when caching data.
Yes, cache and persist are supported. The cache is GPU accelerated,
but the cached data is still stored in host memory.
Please refer to [RAPIDS Cache Serializer](./additional-functionality/cache-serializer.md)
for more details.
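
As a minimal spark-shell sketch (assuming the plugin is enabled and the serializer described in the linked page is configured; the data here is illustrative):

```scala
// Assumes spark-shell was launched with the RAPIDS plugin and
// spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer.
val df = spark.range(0, 1000000L).selectExpr("id", "id % 10 AS key")
df.cache()                        // lazy: nothing is cached yet
df.count()                        // first action compresses the batches (on the GPU)
                                  // and keeps them in host memory
df.groupBy("key").count().show()  // answered from the cache
```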

### Can I cache data into GPU memory?

No, that is not currently supported. It would require much larger changes to Apache Spark to be able
to support this.
No, that is not currently supported.
It would require much larger changes to Apache Spark to be able to support this.

### Is PySpark supported?

13 changes: 5 additions & 8 deletions docs/additional-functionality/cache-serializer.md
@@ -29,21 +29,18 @@ nav_order: 2
`spark.sql.inMemoryColumnarStorage.enableVectorizedReader` will not be honored, as the GPU
data is always read in as columnar. If `spark.rapids.sql.enabled` is set to false,
the cached objects will still be compressed on the CPU as part of the caching process.

Please note that ParquetCachedBatchSerializer doesn't support negative decimal scale, so if
`spark.sql.legacy.allowNegativeScaleOfDecimal` is set to true ParquetCachedBatchSerializer
should not be used. Using the serializer with negative decimal scales will generate
an error at runtime.

To use this serializer, please run Spark with the following conf:

```
spark-shell --conf spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer"
spark-shell --conf spark.sql.cache.serializer=com.nvidia.spark.ParquetCachedBatchSerializer
```
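
Once the shell is up, a quick sanity check (a sketch; `spark.sql.cache.serializer` is a static SQL conf, so it has to be set at launch rather than changed at runtime):

```scala
// Read the static conf back to confirm the serializer is active.
spark.conf.get("spark.sql.cache.serializer")
// expected value: com.nvidia.spark.ParquetCachedBatchSerializer
```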


## Supported Types

All types are supported on the CPU, on the GPU, ArrayType, MapType and BinaryType are not
supported. If an unsupported type is encountered the Rapids Accelerator for Apache Spark will fall
All types are supported on the CPU.
On the GPU, MapType and BinaryType are not supported.
If an unsupported type is encountered, the RAPIDS Accelerator for Apache Spark will fall
back to using the CPU for caching.

58 changes: 58 additions & 0 deletions docs/download.md
@@ -18,6 +18,64 @@
cuDF jar, that is either preinstalled in the Spark classpath on all nodes or submitted with each job
that uses the RAPIDS Accelerator For Apache Spark. See the [getting-started
guide](https://nvidia.github.io/spark-rapids/Getting-Started/) for more details.

## Release v21.10.0
Hardware Requirements:

The plugin is tested on the following architectures:

GPU Architecture: NVIDIA V100, T4, and A10/A30/A100 GPUs

Software Requirements:

OS: Ubuntu 18.04, Ubuntu 20.04, CentOS 7, or CentOS 8

CUDA & NVIDIA Drivers*: 11.0-11.4 & v450.80.02+

Apache Spark 3.0.1, 3.0.2, 3.0.3, 3.1.1, 3.1.2, 3.2.0, Cloudera CDP 7.1.6, 7.1.7, Databricks 7.3 ML LTS or 8.2 ML Runtime, and GCP Dataproc 2.0

Apache Hadoop 2.10+ or 3.1.1+ (3.1.1 for nvidia-docker version 2)

Python 3.6+, Scala 2.12, Java 8

*Some hardware may have a minimum driver version greater than v450.80.02. Check the GPU spec sheet
for your hardware's minimum driver version.

### Download v21.10.0
* Download the [RAPIDS
Accelerator for Apache Spark 21.10.0 jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/21.10.0/rapids-4-spark_2.12-21.10.0.jar)
* Download the [RAPIDS cuDF 21.10.0 jar](https://repo1.maven.org/maven2/ai/rapids/cudf/21.10.0/cudf-21.10.0-cuda11.jar)

This package is built against CUDA 11.2 and has [CUDA forward
compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/index.html) enabled. It is tested
on V100, T4, A30, and A100 GPUs with CUDA 11.0-11.4. For those using other types of GPUs which
do not have CUDA forward compatibility (for example, GeForce), CUDA 11.2 is required. Users will
need to ensure the minimum driver (450.80.02) and CUDA toolkit are installed on each Spark node.
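
To verify a node meets the driver floor, one quick check (a sketch; assumes `nvidia-smi` from the driver install is on the PATH):

```
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```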

### Release Notes
New functionality and performance improvements for this release include:
* Support collect_list and collect_set in group-by aggregation
* Support stddev and percentile_approx in group-by aggregation (see the sketch after this list)
* RunningWindow operations on map
* HashAggregate on struct and nested struct
* Sorting on nested structs
* Explode on map, array, struct
* Union-all on map, array and struct of maps
* Parquet writing of map
* ORC reader supports reading map/struct columns
* ORC reader supports decimal64
* Spark Qualification Tool
  * Add conjunction and disjunction filters
  * Filtering specific configuration values
  * Filtering user name
  * Reporting nested data types
  * Reporting write data formats
* Spark Profiling Tool
  * Generating structured output format
  * Improved profiling tool performance
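
A hedged spark-shell sketch of the newly supported group-by aggregations (assumes the 21.10.0 plugin is enabled on Spark 3.1+; the data and column names are illustrative):

```scala
import org.apache.spark.sql.functions._

// Illustrative data: ten keys, each with a column of doubles.
val df = spark.range(0, 1000L).selectExpr("id % 10 AS key", "CAST(id AS DOUBLE) AS v")

df.groupBy("key")
  .agg(
    collect_list("v"),                  // group-by collect_list
    collect_set("v"),                   // group-by collect_set
    stddev("v"),                        // group-by stddev
    expr("percentile_approx(v, 0.5)"))  // approximate median
  .show()
```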

For a detailed list of changes, please refer to the
[CHANGELOG](https://github.com/NVIDIA/spark-rapids/blob/main/CHANGELOG.md).

## Release v21.08.0
Hardware Requirements:

