
upmerge
andygrove committed May 26, 2022
2 parents c8a1fce + 781607b commit 9a2b9d3
Showing 51 changed files with 563 additions and 322 deletions.
2 changes: 1 addition & 1 deletion docs/FAQ.md
@@ -376,7 +376,7 @@ There are multiple reasons why this is a problematic configuration:

Yes, but it requires support from the underlying cluster manager to isolate the MIG GPU instance
for each executor (e.g.: by setting `CUDA_VISIBLE_DEVICES`,
[YARN with docker isolation](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/MIG-Support)
[YARN with docker isolation](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.06/examples/MIG-Support)
or other means).

Note that MIG is not recommended for use with the RAPIDS Accelerator since it significantly
2 changes: 1 addition & 1 deletion docs/additional-functionality/ml-integration.md
@@ -40,7 +40,7 @@ access to any of the memory that RMM is holding.
## Spark ML Algorithms Supported by RAPIDS Accelerator

The [spark-rapids-examples repository](https://github.com/NVIDIA/spark-rapids-examples) provides a
[working example](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/Spark-cuML/pca)
[working example](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.06/examples/ML+DL-Examples/Spark-cuML/pca)
of accelerating the `transform` API for
[Principal Component Analysis (PCA)](https://spark.apache.org/docs/latest/mllib-dimensionality-reduction#principal-component-analysis-pca).
The example leverages the [RAPIDS accelerated UDF interface](rapids-udfs.md) to provide a native
2 changes: 1 addition & 1 deletion docs/additional-functionality/rapids-udfs.md
@@ -135,7 +135,7 @@ type `DECIMAL64(scale=-2)`.
## RAPIDS Accelerated UDF Examples

<!-- Note: should update the branch name to tag when releasing-->
Source code for examples of RAPIDS accelerated UDFs is provided in the [udf-examples](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.04/examples/RAPIDS-accelerated-UDFs) project.
Source code for examples of RAPIDS accelerated UDFs is provided in the [udf-examples](https://github.com/NVIDIA/spark-rapids-examples/tree/branch-22.06/examples/UDF-Examples/RAPIDS-accelerated-UDFs) project.

## GPU Support for Pandas UDF

23 changes: 11 additions & 12 deletions docs/compatibility.md
@@ -63,8 +63,8 @@ conditions within the computation itself the result may not be the same each time it is
run. This is inherent in how the plugin speeds up the calculations and cannot be "fixed." If a query
joins on a floating point value, which is not wise to do anyways, and the value is the result of a
floating point aggregation then the join may fail to work properly with the plugin but would have
worked with plain Spark. Because of this most floating point aggregations are off by default but can
be enabled with the config
worked with plain Spark. As of 22.06 this behavior is enabled by default but can be disabled with
the config
[`spark.rapids.sql.variableFloatAgg.enabled`](configs.md#sql.variableFloatAgg.enabled).
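
The root cause is independent of Spark: IEEE 754 addition is not associative, so the order in which partial sums are combined changes the rounded result. A minimal Python sketch (not plugin code):

```python
# Floating point addition is not associative: grouping the same three
# values differently produces two different doubles, which is why an
# aggregation whose combine order varies between runs can give
# different answers for the same input.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)
print(left_first)                 # 0.6000000000000001
print(right_first)                # 0.6
print(left_first == right_first)  # False
```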

Additionally, some aggregations on floating point columns that contain `NaN` can produce results
@@ -792,13 +792,12 @@ disabled on the GPU by default and require configuration options to be specified

The GPU will use a different strategy from Java's BigDecimal to handle/store decimal values, which
leads to restrictions:
* It is only available when `ansiMode` is on.
* Float values cannot be larger than `1e18` or smaller than `-1e18` after conversion.
* The results produced by GPU slightly differ from the default results of Spark.

To enable this operation on the GPU, set
[`spark.rapids.sql.castFloatToDecimal.enabled`](configs.md#sql.castFloatToDecimal.enabled) to `true`
and set `spark.sql.ansi.enabled` to `true`.
As of 22.06 this conf is enabled by default. To disable this operation on the GPU when using Spark
3.1.0 or later, set
[`spark.rapids.sql.castFloatToDecimal.enabled`](configs.md#sql.castFloatToDecimal.enabled) to `false`.
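
The representation gap behind these slight differences is easy to see with Python's `decimal` module; a sketch, with a scale of 18 chosen arbitrarily as a stand-in for the target decimal type (this is not the plugin's code):

```python
from decimal import Decimal

# A binary float has no exact decimal expansion for most values, so the
# "true" value a cast sees differs from the literal the user wrote.
f = 0.1
exact = Decimal(f)  # the float's actual binary value, expanded in decimal
print(exact)        # 0.1000000000000000055511151231257827021181583404541015625

# Quantizing to a fixed scale (18 here, an arbitrary stand-in) is where
# different conversion strategies can round the last digit differently.
print(round(exact, 18))  # 0.100000000000000006
```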

### Float to Integral Types

@@ -808,9 +807,9 @@ Spark 3.1.0 the MIN and MAX values were floating-point values such as `Int.MaxValue.toFloat` but
starting with 3.1.0 these are now integral types such as `Int.MaxValue` so this has slightly
affected the valid range of values and now differs slightly from the behavior on GPU in some cases.

To enable this operation on the GPU when using Spark 3.1.0 or later, set
As of 22.06 this conf is enabled by default. To disable this operation on the GPU when using Spark 3.1.0 or later, set
[`spark.rapids.sql.castFloatToIntegralTypes.enabled`](configs.md#sql.castFloatToIntegralTypes.enabled)
to `true`.
to `false`.

This configuration setting is ignored when using Spark versions prior to 3.1.0.
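
The reason the bound convention matters is that the integral limits are not exactly representable in 32-bit floats; a hypothetical Python illustration, using `ctypes.c_float` to stand in for a float32 column value (not the plugin's code):

```python
import ctypes

INT_MAX = 2**31 - 1  # Int.MaxValue

# The nearest float32 to Int.MaxValue is 2**31, i.e. Int.MaxValue + 1,
# so clamping against the float rendition of the bound and clamping
# against the integral bound disagree exactly at this edge.
as_f32 = ctypes.c_float(INT_MAX).value
print(as_f32)             # 2147483648.0
print(as_f32 <= INT_MAX)  # False: passes a float-bound check, fails this one
```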

@@ -820,8 +819,8 @@ The GPU will use different precision than Java's toString method when converting
types to strings. The GPU uses a lowercase `e` prefix for an exponent while Spark uses uppercase
`E`. As a result the computed string can differ from the default behavior in Spark.

To enable this operation on the GPU, set
[`spark.rapids.sql.castFloatToString.enabled`](configs.md#sql.castFloatToString.enabled) to `true`.
As of 22.06 this conf is enabled by default. To disable this operation on the GPU, set
[`spark.rapids.sql.castFloatToString.enabled`](configs.md#sql.castFloatToString.enabled) to `false`.
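
Python's default float formatting happens to use a lowercase exponent marker, which makes the difference easy to see (an illustration only, not the plugin's formatter):

```python
x = 1.0e20

# Lowercase exponent marker, the style the GPU output uses.
print(repr(x))  # 1e+20

# Java's Double.toString renders an uppercase 'E' (e.g. "1.0E20");
# Python's :E format spec is used here only to show the uppercase style.
print(f"{x:E}")  # 1.000000E+20
```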

### String to Float

@@ -834,8 +833,8 @@ default behavior in Apache Spark is to return `+Infinity` and `-Infinity`, respectively.

Also, the GPU does not support casting from strings containing hex values.

To enable this operation on the GPU, set
[`spark.rapids.sql.castStringToFloat.enabled`](configs.md#sql.castStringToFloat.enabled) to `true`.
As of 22.06 this conf is enabled by default. To disable this operation on the GPU, set
[`spark.rapids.sql.castStringToFloat.enabled`](configs.md#sql.castStringToFloat.enabled) to `false`.
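
The hex limitation can be illustrated with plain Python, whose general-purpose `float()` parser likewise rejects hex strings (a sketch, not the plugin's parser):

```python
# Ordinary decimal text parses fine.
print(float("3.14"))  # 3.14

# Hex strings are rejected by the general-purpose parser, similar to the
# GPU limitation described above; a dedicated API is needed instead.
try:
    float("0x1.8p1")
except ValueError:
    print("hex strings not supported")

# 0x1.8p1 means 1.5 * 2**1.
print(float.fromhex("0x1.8p1"))  # 3.0
```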

### String to Date

