
upmerge
andygrove committed May 26, 2022
2 parents c8a1fce + 781607b commit 9a2b9d3
Showing 51 changed files with 563 additions and 322 deletions.
2 changes: 1 addition & 1 deletion docs/FAQ.md
@@ -376,7 +376,7 @@ There are multiple reasons why this is a problematic configuration:

Yes, but it requires support from the underlying cluster manager to isolate the MIG GPU instance
for each executor (e.g.: by setting `CUDA_VISIBLE_DEVICES`,
[YARN with docker isolation](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/MIG-Support)
[YARN with docker isolation](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.06/examples/MIG-Support)
or other means).

Note that MIG is not recommended for use with the RAPIDS Accelerator since it significantly
2 changes: 1 addition & 1 deletion docs/additional-functionality/ml-integration.md
@@ -40,7 +40,7 @@ access to any of the memory that RMM is holding.
## Spark ML Algorithms Supported by RAPIDS Accelerator

The [spark-rapids-examples repository](https://github.com/NVIDIA/spark-rapids-examples) provides a
[working example](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/Spark-cuML/pca)
[working example](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.06/examples/ML+DL-Examples/Spark-cuML/pca)
of accelerating the `transform` API for
[Principal Component Analysis (PCA)](https://spark.apache.org/docs/latest/mllib-dimensionality-reduction#principal-component-analysis-pca).
The example leverages the [RAPIDS accelerated UDF interface](rapids-udfs.md) to provide a native
2 changes: 1 addition & 1 deletion docs/additional-functionality/rapids-udfs.md
@@ -135,7 +135,7 @@ type `DECIMAL64(scale=-2)`.
## RAPIDS Accelerated UDF Examples

<!-- Note: should update the branch name to tag when releasing-->
Source code for examples of RAPIDS accelerated UDFs is provided in the [udf-examples](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/RAPIDS-accelerated-UDFs) project.
Source code for examples of RAPIDS accelerated UDFs is provided in the [udf-examples](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.06/examples/UDF-Examples/RAPIDS-accelerated-UDFs) project.

## GPU Support for Pandas UDF

23 changes: 11 additions & 12 deletions docs/compatibility.md
@@ -63,8 +63,8 @@ conditions within the computation itself the result may not be the same each time it is
run. This is inherent in how the plugin speeds up the calculations and cannot be "fixed." If a query
joins on a floating point value, which is not wise to do anyways, and the value is the result of a
floating point aggregation then the join may fail to work properly with the plugin but would have
worked with plain Spark. Because of this most floating point aggregations are off by default but can
be enabled with the config
worked with plain Spark. As of 22.06 this behavior is enabled by default but can be disabled with
the config
[`spark.rapids.sql.variableFloatAgg.enabled`](configs.md#sql.variableFloatAgg.enabled).
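
The root cause is independent of Spark: IEEE 754 addition is not associative, so the order in which partial sums are combined changes the rounded result. A minimal Python sketch (not plugin code):

```python
# Floating point addition is not associative: grouping the same three
# values differently produces two different doubles, which is why an
# aggregation whose combine order varies between runs can give
# different answers for the same input.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)
print(left_first)                 # 0.6000000000000001
print(right_first)                # 0.6
print(left_first == right_first)  # False
```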

Additionally, some aggregations on floating point columns that contain `NaN` can produce results
@@ -792,13 +792,12 @@ disabled on the GPU by default and require configuration options to be specified

The GPU will use a different strategy from Java's BigDecimal to handle/store decimal values, which
leads to restrictions:
* It is only available when `ansiMode` is on.
* Float values cannot be larger than `1e18` or smaller than `-1e18` after conversion.
* The results produced by GPU slightly differ from the default results of Spark.

To enable this operation on the GPU, set
[`spark.rapids.sql.castFloatToDecimal.enabled`](configs.md#sql.castFloatToDecimal.enabled) to `true`
and set `spark.sql.ansi.enabled` to `true`.
As of 22.06 this conf is enabled by default. To disable this operation on the GPU when using Spark
3.1.0 or later, set
[`spark.rapids.sql.castFloatToDecimal.enabled`](configs.md#sql.castFloatToDecimal.enabled) to `false`.
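
The representation gap behind these slight differences is easy to see with Python's `decimal` module; a sketch, with a scale of 18 chosen arbitrarily as a stand-in for the target decimal type (this is not the plugin's code):

```python
from decimal import Decimal

# A binary float has no exact decimal expansion for most values, so the
# "true" value a cast sees differs from the literal the user wrote.
f = 0.1
exact = Decimal(f)  # the float's actual binary value, expanded in decimal
print(exact)        # 0.1000000000000000055511151231257827021181583404541015625

# Quantizing to a fixed scale (18 here, an arbitrary stand-in) is where
# different conversion strategies can round the last digit differently.
print(round(exact, 18))  # 0.100000000000000006
```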

### Float to Integral Types

@@ -808,9 +807,9 @@ Spark 3.1.0 the MIN and MAX values were floating-point values such as `Int.MaxValue.toFloat` but
starting with 3.1.0 these are now integral types such as `Int.MaxValue` so this has slightly
affected the valid range of values and now differs slightly from the behavior on GPU in some cases.

To enable this operation on the GPU when using Spark 3.1.0 or later, set
As of 22.06 this conf is enabled by default. To disable this operation on the GPU when using Spark 3.1.0 or later, set
[`spark.rapids.sql.castFloatToIntegralTypes.enabled`](configs.md#sql.castFloatToIntegralTypes.enabled)
to `true`.
to `false`.

This configuration setting is ignored when using Spark versions prior to 3.1.0.
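
The reason the bound convention matters is that the integral limits are not exactly representable in 32-bit floats; a hypothetical Python illustration, using `ctypes.c_float` to stand in for a float32 column value (not the plugin's code):

```python
import ctypes

INT_MAX = 2**31 - 1  # Int.MaxValue

# The nearest float32 to Int.MaxValue is 2**31, i.e. Int.MaxValue + 1,
# so clamping against the float rendition of the bound and clamping
# against the integral bound disagree exactly at this edge.
as_f32 = ctypes.c_float(INT_MAX).value
print(as_f32)             # 2147483648.0
print(as_f32 <= INT_MAX)  # False: passes a float-bound check, fails this one
```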

@@ -820,8 +819,8 @@ The GPU will use different precision than Java's toString method when converting
types to strings. The GPU uses a lowercase `e` prefix for an exponent while Spark uses uppercase
`E`. As a result the computed string can differ from the default behavior in Spark.

To enable this operation on the GPU, set
[`spark.rapids.sql.castFloatToString.enabled`](configs.md#sql.castFloatToString.enabled) to `true`.
As of 22.06 this conf is enabled by default. To disable this operation on the GPU, set
[`spark.rapids.sql.castFloatToString.enabled`](configs.md#sql.castFloatToString.enabled) to `false`.
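
Python's default float formatting happens to use a lowercase exponent marker, which makes the difference easy to see (an illustration only, not the plugin's formatter):

```python
x = 1.0e20

# Lowercase exponent marker, the style the GPU output uses.
print(repr(x))  # 1e+20

# Java's Double.toString renders an uppercase 'E' (e.g. "1.0E20");
# Python's :E format spec is used here only to show the uppercase style.
print(f"{x:E}")  # 1.000000E+20
```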

### String to Float

@@ -834,8 +833,8 @@ default behavior in Apache Spark is to return `+Infinity` and `-Infinity`, respectively.

Also, the GPU does not support casting from strings containing hex values.

To enable this operation on the GPU, set
[`spark.rapids.sql.castStringToFloat.enabled`](configs.md#sql.castStringToFloat.enabled) to `true`.
As of 22.06 this conf is enabled by default. To disable this operation on the GPU, set
[`spark.rapids.sql.castStringToFloat.enabled`](configs.md#sql.castStringToFloat.enabled) to `false`.
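
The hex limitation can be illustrated with plain Python, whose general-purpose `float()` parser likewise rejects hex strings (a sketch, not the plugin's parser):

```python
# Ordinary decimal text parses fine.
print(float("3.14"))  # 3.14

# Hex strings are rejected by the general-purpose parser, similar to the
# GPU limitation described above; a dedicated API is needed instead.
try:
    float("0x1.8p1")
except ValueError:
    print("hex strings not supported")

# 0x1.8p1 means 1.5 * 2**1.
print(float.fromhex("0x1.8p1"))  # 3.0
```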

### String to Date

