Change log

Generated on 2021-05-03

Release 0.5

Features


#938	[FEA] Have hashed shuffle match spark
#1604	[FEA] Support casting structs to strings
#1920	[FEA] Support murmur3 hashing of structs
#2018	[FEA] A way for user to find out the plugin version and cudf version in REPL
#77	[FEA] Support ArrayContains
#1721	[FEA] build cudf jars with NVTX enabled
#1782	[FEA] Shim layers to support spark versions
#1625	[FEA] Support Decimal Casts to String and String to Decimal
#166	[FEA] Support get_json_object
#1698	[FEA] Support casting structs to string
#1912	[FEA] Let `Scalar Pandas UDF` support array of struct type.
#1136	[FEA] Audit: Script to list commits between different Spark versions/tags
#1921	[FEA] cudf version check should be lenient on later patch version
#19	[FEA] Out of core sorts

Performance


#2090	[FEA] Make row count estimates available to the cost-based optimizer
#1341	Optimize unnecessary columnar->row->columnar transitions with AQE
#1558	[FEA] Initialize UCX early
#1633	[FEA] Implement a cost-based optimizer
#1727	[FEA] Put RangePartitioner data path on the GPU

Bugs Fixed


#2279	[BUG] Hash Partitioning can fail for very small batches
#2314	[BUG] v0.5.0 pre-release pytests join_test.py::test_hash_join_array FAILED on SPARK-EGX Yarn Cluster
#2317	[BUG] GpuColumnarToRowIterator can stop after receiving an empty batch
#2244	[BUG] Executors hanging when running NDS benchmarks
#2278	[BUG] FullOuter join can produce too many results
#2220	[BUG] csv_test.py::test_csv_fallback FAILED on the EMR Cluster
#2225	[BUG] GpuSort fails on tables containing arrays.
#2232	[BUG] hash_aggregate_test.py::test_hash_grpby_pivot FAILED on the Databricks Cluster
#2231	[BUG] string_test.py::test_re_replace FAILED on the Dataproc Cluster
#2042	[BUG] NDS q14a fails with "GpuColumnarToRow does not implement doExecuteBroadcast"
#2203	[BUG] Spark nightly cache tests fail with --master flag
#2230	[BUG] qa_nightly_select_test.py::test_select FAILED on the Dataproc Cluster
#1711	[BUG] find a way to stop allocating from RMM on the shuffle-client thread
#2109	[BUG] Fix high priority violations detected by code analysis tools
#2217	[BUG] qa_nightly_select_test failure in test_select
#2127	[BUG] Parsing with two-digit year should fall back to CPU
#2078	[BUG] java.lang.ArithmeticException: divide by zero when spark.sql.ansi.enabled=true
#2048	[BUG] split function + repartition results in "ai.rapids.cudf.CudaException: device-side assert triggered"
#2036	[BUG] Stackoverflow when writing wide parquet files.
#1973	[BUG] generate_expr_test FAILED on Dataproc Cluster
#2079	[BUG] koalas.sql fails with java.lang.ArrayIndexOutOfBoundsException
#217	[BUG] CudaUtil should be removed
#1550	[BUG] The ORC output data of a query is not readable
#2074	[BUG] Intermittent NPE in RapidsBufferCatalog when running test suite
#2027	[BUG] udf_cudf_test.py integration tests fail
#1899	[BUG] Some queries fail when cost-based optimizations are enabled
#1914	[BUG] Add in float, double, timestamp, and date support to murmur3
#2014	[BUG] earlyStart option added in 0.5 can cause errors when starting UCX
#1984	[BUG] NDS q58 Decimal scale (59) cannot be greater than precision (38).
#2001	[BUG] RapidsShuffleManager didn't pass `dirs` to `getBlockData` from a wrapped `ShuffleBlockResolver`
#1797	[BUG] occasional crashes in CI
#1861	Encountered column data outside the range of input buffer
#1905	[BUG] Large concat task time in GpuShuffleCoalesce with pinned memory pool
#1638	[BUG] Tests `test_window_aggs_for_rows_collect_list` fails when there are null values in columns.
#1864	[BUG] HostColumnarToGPU inefficient when only doing count()
#1862	[BUG] spark 3.2.0-snapshot integration test failed due to conf change
#1844	[BUG] branch-0.5 nightly IT FAILED on the mortgage ETL test "Could not read footer for file: file:/xxx/xxx.snappy.parquet"
#1627	[BUG] GDS exception when restoring spilled buffer
#1802	[BUG] Many decimal integration test failures for 0.5

PRs


#2316	Update doc to note that single quoted json strings are not ok
#2319	Disable hash partitioning on arrays
#2318	Fix ColumnarToRowIterator handling of empty batches
#2304	Update CHANGELOG.md
#2301	Update doc to reflect nanosleep problem with 460.32.03
#2298	Update changelog for v0.5.0 release [skip ci]
#2293	update cudf version to 0.19.2
#2289	Update docs to warn against 450.80.02 driver with 10.x toolkit
#2285	Require single batch for full outer join streaming
#2281	Remove download section for unreleased 0.4.2
#2264	Add spark312 and spark320 versions of cache serializer
#2254	updated gcp docs with custom dataproc image instructions
#2247	Allow specifying a superclass for non-GPU execs
#2235	Fix distributed cache to read requested schema
#2261	Make CBO row count test more robust
#2237	update cudf version to 0.19.1
#2240	Get the correct 'PIPESTATUS' in bash [skip ci]
#2242	Add shuffle doc section on the periodicGC configuration
#2251	Fix issue when out of core sorting nested data types
#2204	Run nightly tests for ParquetCachedBatchSerializer
#2245	Fix pivot bug for decimalType
#2093	Initial implementation of row count estimates in cost-based optimizer
#2188	Support GPU broadcast exchange reuse to feed CPU BHJ when AQE is enabled
#2227	ParquetCachedBatchSerializer broadcast AllConfs instead of SQLConf to fix distributed mode
#2223	Adds subquery aggregate tests from SPARK-31620
#2222	Remove groupId already specified in parent pom
#2209	Fixed a few issues with out of core sort
#2218	Fix incorrect RegExpReplace children handling on Spark 3.1+
#2207	fix batch size default values in the tuning guide
#2208	Revert "add nightly cache tests (#2083)"
#2206	Fix shim301db build
#2192	Fix index-based access to the head elements
#2210	Avoid redundant collection conversions
#2190	JNI fixes for StringWordCount native UDF example
#2086	Updating documentation for data format support
#2172	Remove easy unused symbols
#2089	Update PandasUDF doc
#2195	fix cudf 0.19.0 download link [skip ci]
#2175	Branch 0.5 doc update
#2168	Simplify GpuExpressions w/ withResourceIfAllowed
#2055	Support PivotFirst
#2183	GpuParquetScan#readBufferToTable remove dead code
#2129	Fall back to CPU when parsing two-digit years
#2083	add nightly cache tests
#2151	add corresponding close call for HostMemoryOutputStream
#2169	Work around bug in Spark for integration test
#2130	Fix divide-by-zero in GpuAverage with ansi mode
#2149	Auto generate the supported types for the file formats
#2072	Disable CSV parsing by default and update tests to better show what is left
#2157	fix merge conflict for 0.4.2 [skip ci]
#2144	Allow array and struct types to pass thru when doing join
#2145	Avoid GPU shuffle for round-robin of unsortable types
#2021	Add in support for murmur3 hashing of structs
#2128	Add in Partition type check support
#2116	Add dynamic Spark configuration for Databricks
#2132	Log plugin and cudf versions on startup
#2135	Disable Spark 3.2 shim by default
#2125	enable auto-merge from 0.5 to 0.6 [skip ci]
#2120	Materialize Stream before serialization
#2119	Add more comprehensive documentation on supported date formats
#1717	Decimal32 support
#2114	Modified the Download page for 0.4.1 and updated doc to point to K8s guide
#2106	Fix some buffer leaks
#2097	fix the bound row project empty issue in row frame
#2099	Remove verbose log prints to make the build/test log clean
#2105	Cleanup prior Spark sessions in tests consistently
#2104	Clone apache spark source code to parse the git commit IDs
#2095	fix refcount when materializing device buffer from GDS
#2100	[BUG] add wget for fetching conda [skip ci]
#2096	Adjust images for integration tests
#2094	Changed name of parquet files for Mortgage ETL Integration test
#2035	Accelerate data transfer for map Pandas UDF plan
#2050	stream shuffle buffers from GDS to UCX
#2084	Enable ORC write by default
#2088	Upgrade ScalaTest plugin to respect JAVA_HOME
#1932	Create a getting started on K8s page
#2080	Improve error message after failed RMM shutdown
#2064	Optimize unnecessary columnar->row->columnar transitions with AQE
#2025	Update the doc for pandas udf on databricks
#2059	Add the flag 'TEST_TYPE' to avoid integration tests silently skipping some test cases
#2075	Remove debug println from CBO test
#2046	support casting Decimal to String
#1812	allow spilled buffers to be unspilled
#2061	Run the pandas udf using cudf on Databricks
#1893	Plug-in support for get_json_object
#2044	Use partition for GPU hash partitioning
#1954	Fix CBO bug where incompatible plans were produced with AQE on
#2049	Remove incompatible int overflow checking
#2056	Remove Spark 3.2 from premerge and nightly CI run
#1814	Struct to string casting functionality
#2037	Fix warnings from use of deprecated cudf methods
#2033	Bump up pre-merge OS from ubuntu 16 to ubuntu 18 [skip ci]
#1883	Enable sort for single-level nesting struct columns on GPU
#2016	Refactor logic for parallel testing
#2022	Update order by to not load native libraries when sorting
#2017	Add in murmur3 support for float, double, date and timestamp
#1981	Fix GpuSize
#1999	support casting string to decimal
#2006	Enable windowed `collect_list` by default
#2000	Use Spark's HybridRowQueue to avoid MemoryConsumer API shim
#2015	Fix bug where rkey buffer is getting advanced after the first handshake
#2007	Fix unknown column name error when filtering ORC file with no names
#2005	Update to new is_before_spark_311 function name
#1944	Support running scalar pandas UDF with array type.
#1991	Fixes creation of invalid DecimalType in GpuDivide.tagExprForGpu
#1958	Support legacy behavior of parameterless count
#1919	Add support for Structs for UnionExec
#2002	Pass dirs to getBlockData for a wrapped shuffle resolver
#1983	document building against different CUDA Toolkit versions
#1994	Merge 0.4 to 0.5 [skip ci]
#1982	Update ORC pushdown filter building to latest Spark logic
#1978	Add audit script to list commits from Spark
#1976	Temp fix for parquet write changes
#1970	add maven profiles for supported CUDA versions
#1951	Branch 0.5 doc remove numpartitions
#1967	Update FAQ for Dataset API and format supported versions
#1972	support GpuSize
#1966	add xml report for codecov
#1955	Fix typo in Arrow optimization config
#1956	Fix NPE in plugin shutdown
#1930	Relax cudf version check for patch-level versions
#1787	support distributed file path in cloud environment
#1961	change premerge GPU_TYPE from secret to global env [skip ci]
#1957	Update Spark 3.1.2 shim for float upcast behavior
#1889	Decimal DIV changes
#1947	Move doc of Pandas UDF to additional-functionality
#1938	Add spark.executor.resource.gpu.amount=1 to YARN and K8s docs
#1937	Fix merge conflict with branch-0.4
#1878	spillable cache for GpuCartesianRDD
#1843	Refactor GpuGenerateExec and Explode
#1933	Split DB scripts to make them common for the build and IT pipeline
#1935	Update Alias SQL quoting and float-to-timestamp casting to match Spark 3.2
#1926	Consolidate RAT settings in parent pom
#1918	Minor code cleanup in datetimeExpressions
#1906	Remove get call on timeZoneId
#1908	Remove the Scala version of Mortgage ETL tests from nightly test
#1894	Modified Download Page to re-order the items and change the format of download links
#1909	Avoid pinned memory for shuffle host buffers
#1891	Connect UCX endpoints early during app startup
#1877	remove docker build in pre-merge [skip ci]
#1830	Enable the tests for collect over window.
#1882	GpuArrowColumnarBatchBuilder retains the references of ArrowBuf until HostToGpuCoalesceIterator put them into device
#1868	Increase row limit when doing count() for HostColumnarToGpu
#1855	Expose row count statistics in GpuShuffleExchangeExec
#1875	Fix merge conflict with branch-0.4
#1841	Add in support for DateAddInterval
#1869	Fix tests for Spark 3.2.0 shim
#1858	fix shuffle manager doc on ucx library path
#1836	Add shim for Spark 3.1.2
#1852	Fix Part Suite Tests
#1616	Cost-based optimizer
#1834	Add shim for Spark 3.0.3
#1839	Refactor join code to reduce duplicated code
#1848	Fix merge conflict with branch-0.4
#1796	Have most of range partitioning run on the GPU
#1845	Fix fails on the mortgage ETL test
#1829	Cleanup unused Jenkins files and scripts
#1704	Create a shim for Spark 3.2.0 development
#1838	Make databricks build.sh more convenient for dev
#1835	Fix merge conflict with branch-0.4
#1808	Update mortgage tests to support reading multiple dataset formats
#1822	Fix conflict 0.4 to 0.5
#1807	Fix merge conflict between branch-0.4 and branch-0.5
#1788	Spill metrics everywhere
#1719	Add in out of core sort
#1728	Skip RAPIDS accelerated Java UDF tests if UDF fails to load
#1689	Update docs for plugin 0.5.0-SNAPSHOT and cudf 0.19-SNAPSHOT
#1682	init CI/CD dependencies branch-0.5

Release 0.4.1

Bugs Fixed


#1985	[BUG] broadcast exchange can fail on 0.4

PRs


#1995	update changelog 0.4.1 [skip ci]
#1990	Prepare for v0.4.1 release
#1988	broadcast exchange can fail when job group set

Release 0.4

Features


#1773	[FEA] Spark 3.0.2 release support
#80	[FEA] Support the struct SQL function
#76	[FEA] Support CreateArray
#1635	[FEA] RAPIDS accelerated Java UDF
#1333	[FEA] Support window operations on Decimal
#1419	[FEA] Support GPU accelerated UDF alternative for higher order function "aggregate" over window
#1580	[FEA] Support Decimal for ParquetCachedBatchSerializer
#1600	[FEA] Support ScalarSubquery
#1072	[FEA] Support for a custom DataSource V2 which supplies Arrow data
#906	[FEA] Clarify query explanation to directly state what will run on GPU
#1335	[FEA] Support CollectLimitExec for decimal
#1485	[FEA] Decimal Support for Parquet Write
#1329	[FEA] Decimal support for multiply int div, add, subtract and null safe equals
#1351	[FEA] Execute UDFs that provide a RAPIDS execution path
#1330	[FEA] Support Decimal Casts
#1353	[FEA] Example of RAPIDS UDF using custom GPU code
#1487	[FEA] Change spark 3.1.0 to 3.1.1
#1334	[FEA] Add support for count aggregate on decimal
#1325	[FEA] Add in join support for decimal
#1326	[FEA] Add in Broadcast support for decimal values
#37	[FEA] round and bround SQL functions
#78	[FEA] Support CreateNamedStruct function
#1331	[FEA] UnionExec and ExpandExec support for decimal
#1332	[FEA] Support CaseWhen, Coalesce and IfElse for decimal
#937	[FEA] have murmur3 hash function that matches exactly with spark
#1324	[FEA] Support Parquet Read of Decimal FIXED_LENGTH_BYTE_ARRAY
#1428	[FEA] Add support for unary decimal operations abs, floor, ceil, unary - and unary +
#1375	[FEA] Add log statement for what the concurrentGpuTasks tasks is set to on executor startup
#1352	[FEA] Example of RAPIDS UDF using cudf Java APIs
#1328	[FEA] Support sorting and shuffle of decimal
#1316	[FEA] Support simple DECIMAL aggregates

Performance


#1435	[FEA] Improve the file reading by using local file caching
#1738	[FEA] Reduce regex usage in CAST string to date/timestamp
#987	[FEA] Optimize CAST from string to temporal types by using cuDF is_timestamp function
#1594	[FEA] RAPIDS accelerated ScalaUDF
#103	[FEA] GPU version of TakeOrderedAndProject
#1024	Cleanup RAPIDS transport calls to `receive`
#1366	Seeing performance differences of multi-threaded/coalesce/perfile Parquet reader type for a single file
#1200	[FEA] Accelerate the scan speed for coalescing parquet reader when reading files from multiple partitioned folders

Bugs Fixed


#1885	[BUG] natural join on string key results in a data frame with spurious NULLs
#1785	[BUG] Rapids pytest integration tests FAILED on Yarn cluster with unrecognized arguments: `--std_input_path=src/test/resources/`
#999	[BUG] test_multi_types_window_aggs_for_rows_lead_lag fails against Spark 3.1.0
#1818	[BUG] unmoored doc comment warnings in GpuCast
#1817	[BUG] Developer build with local modifications fails during verify phase
#1644	[BUG] test_window_aggregate_udf_array_from_python fails on databricks
#1771	[BUG] Databricks AWS CI/CD failing to create cluster
#1157	[BUG] Fix regression supporting to_date on GPU with Spark 3.1.0
#716	[BUG] Cast String to TimeStamp issues
#1117	[BUG] CAST string to date returns wrong values for dates with out-of-range values
#1670	[BUG] Some TPC-DS queries fail with AQE when decimal types enabled
#1730	[BUG] Range Partitioning can crash when processing is in the order-by
#1726	[BUG] java url decode test failing on databricks, emr, and dataproc
#1651	[BUG] GDS exception when writing shuffle file
#1702	[BUG] check all tests marked xfail for Spark 3.1.1
#575	[BUG] Spark 3.1 FAILED join_test.py::test_broadcast_join_mixed[FullOuter][IGNORE_ORDER] failed
#577	[BUG] Spark 3.1 log arithmetic functions fail
#1541	[BUG] Tests fail in integration in distributed mode after allowing nested types through in sort and shuffle
#1626	[BUG] TPC-DS-like query 77 at scale=3TB fails with maxResultSize exceeded error
#1576	[BUG] loading SPARK-32639 example parquet file triggers a JVM crash
#1643	[BUG] TPC-DS-Like q10, q35, and q69 - slow or hanging at leftSemiJoin
#1650	[BUG] BenchmarkRunner does not include query name in JSON summary filename when running multiple queries
#1654	[BUG] TPC-DS-like query 59 at scale=3TB with AQE fails with join mismatch
#1274	[BUG] OutOfMemoryError - Maximum pool size exceeded while running 24 day criteo ETL Transform stage
#1497	[BUG] Spark-rapids v0.3.0 pytest integration tests with UCX on FAILED on Yarn cluster
#1534	[BUG] Spark 3.1.1 test failure in writing due to removal of InMemoryFileIndex.shouldFilterOut
#1155	[BUG] on shutdown don't print `Socket closed` exception when shutting down UCX.scala
#1510	[BUG] IllegalArgumentException during shuffle
#1513	[BUG] executor not fully initialized may get calls from Spark, in the process setting the `catalog` incorrectly
#1466	[BUG] Databricks build must run before the rapids nightly
#1456	[BUG] Databricks 0.4 parquet integration tests fail
#1400	[BUG] Regressions in spark-shell usage of benchmark utilities
#1119	[BUG] inner join fails with Column size cannot be negative
#1079	[BUG] The Scala UDF function cannot invoke the UDF compiler when it's passed to "explode"
#1298	TPCxBB query16 failed at UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary
#1271	[BUG] CastOpSuite and AnsiCastOpSuite failing with ArithmeticException on Spark 3.1
#84	[BUG] sort does not match spark for -0.0 and 0.0
#578	[BUG] Spark 3.1 qa_nightly_select_test.py Full join test failures
#586	[BUG] Spark3.1 tpch failures
#837	[BUG] Distinct count of floating point values differs with regular spark
#953	[BUG] 3.1.0 pos_explode tests are failing
#127	[BUG] String CSV parsing does not respect nullValues
#1203	[BUG] tpcds query 51 fails with join error on Spark 3.1.0
#750	[BUG] udf_cudf_test::test_with_column fails with IPC error
#1348	[BUG] Host columnar decimal conversions are failing
#1270	[BUG] Benchmark runner fails to produce report if benchmark fails due to an invalid query plan
#1179	[BUG] SerializeConcatHostBuffersDeserializeBatch may have thread issues
#1115	[BUG] Unchecked type warning in SparkQueryCompareTestSuite

PRs


#1963	Update changelog 0.4 [skip ci]
#1960	Replace sonatype staging link with maven central link
#1945	Update changelog 0.4 [skip ci]
#1910	Make hash partitioning match CPU
#1927	Change cuDF dependency to 0.18.1
#1934	Update documentation to use cudf version 0.18.1
#1871	Disable coalesce batch spilling to avoid cudf contiguous_split bug
#1849	Update changelog for 0.4
#1744	Fix NullPointerException on null partition insert
#1842	Update to note support for 3.0.2
#1832	Spark 3.1.1 shim no longer a snapshot shim
#1831	Spark 3.0.2 shim no longer a snapshot shim
#1826	Remove benchmarks
#1828	Update cudf dependency to 0.18
#1813	Fix LEAD/LAG failures in Spark 3.1.1
#1819	Fix scaladoc warning in GpuCast
#1820	[BUG] make modified check pre-merge only
#1780	Remove SNAPSHOT from test and integration_test READMEs
#1809	check if modified files after update_config/supported
#1804	Update UCX documentation for RX_QUEUE_LEN and Docker
#1810	Pandas UDF: Sort the data before computing the sum.
#1751	Exclude foldable expressions from GPU if constant folding is disabled
#1798	Add documentation about explain not on GPU when AQE is on
#1766	Branch 0.4 release docs
#1794	Build python output schema from udf expressions
#1783	Fix the collect_list over window tests failures on db
#1781	Better float/double cases for casting tests
#1790	Record row counts in benchmark runs that call collect
#1779	Add support of DateType and TimestampType for GetTimestamp expression
#1768	Updating getting started Databricks docs
#1742	Fix regression supporting to_date with Spark-3.1
#1775	Fix ambiguous ordering for some tests
#1760	Update GpuDataSourceScanExec and GpuBroadcastExchangeExec to fix audit issues
#1750	Detect task failures in benchmarks
#1767	Consistent Spark version for test and production
#1741	Reduce regex use in CAST
#1756	Skip RAPIDS accelerated Java UDF tests if UDF fails to load
#1716	Update RapidsShuffleManager documentation for branch 0.4
#1740	Disable ORC writes until bug can be fixed
#1747	Fix resource leaks in unit tests
#1725	Branch 0.4 FAQ reorg
#1718	CAST string to temporal type now calls isTimestamp
#1734	Disable range partitioning if computation is needed
#1723	Removed StructTypes support for ParquetCachedBatchSerializer as cudf doesn't support it yet
#1714	Add support for RAPIDS accelerated Java UDFs
#1713	Call GpuDeviceManager.shutdown when the executor plugin is shutting down
#1596	Added in Decimal support to ParquetCachedBatchSerializer
#1706	cleanup unused is_before_spark_310
#1685	Fix CustomShuffleReader replacement when decimal types enabled
#1699	Add docs about Spark 3.1 in standalone modes not needing extra class path
#1701	remove xfail for orc test_input_meta for spark 3.1.0
#1703	Remove xfail for spark 3.1.0 test_broadcast_join_mixed FullOuter
#1676	BenchmarkRunner option to generate query plan diagrams in DOT format
#1695	support alternate jar paths
#1694	increase mem and limit parallelism for pre-merge
#1691	add validate_execs_in_gpu_plan to pytest.ini
#1692	Add the integration test resources to the test tarball
#1677	When PTDS is enabled, print warning if the allocator is not ARENA
#1683	update changelog to verify auto-merge 0.5 setup [skip ci]
#1673	support auto-merge for branch 0.5 [skip ci]
#1681	Xfail the collect_list tests for databricks
#1678	Fix array/struct checks in Sort and HashAggregate and sorting tests in distributed mode
#1671	Allow metrics to be configurable by level
#1675	add run_pyspark_from_build.sh to the pytest distribution tarball
#1548	Support executing collect_list on GPU with windowing.
#1593	Avoid unnecessary Table instances after contiguous split
#1592	Add in support for Decimal divide
#1668	Implement way for python integration tests to validate Exec is in GPU plan
#1669	Add FAQ entries for executor-per-GPU questions
#1661	Enable Parquet test for file containing map struct key
#1664	Filter nulls for left semi and left anti join to work around cudf
#1665	Add better automated tests for Arrow columnar copy in HostColumnarToGpu
#1614	add alluxio getting start document
#1639	support GpuScalarSubquery
#1656	Move UDF to Catalyst Expressions to its own document
#1663	BenchmarkRunner - Include query name in JSON summary filename
#1655	Fix extraneous shuffles added by AQE
#1652	Fix typo in arrow optimized config name - spark.rapids.arrowCopyOptimizationEnabled
#1645	Run Databricks IT with python-xdist parallel, includes test fixes and xfail
#1649	Move building from source docs to contributing guide
#1637	Fail DivModLike on zero divisor in ANSI mode
#1646	Update links in rapids-udfs.md after moving to subfolder
#1641	Xfail struct and array order by tests on Dataproc
#1565	Add GPU accelerated array_contains operator
#1617	Enable nightly test checks for Apache Spark
#1636	RAPIDS accelerated Spark Scala UDF support
#1634	Fix databricks build since Arrow code added
#1599	Add division by zero tests for Spark 3.1 behavior
#1619	Update GpuFileSourceScanExec to be in sync with DataSourceScanExec
#1631	Explicitly add maven-jar-plugin version to improve incremental build time.
#1624	Update explain format to show what will and will not run on the GPU
#1622	Support faster copy for a custom DataSource V2 which supplies Arrow data
#1621	Additional functionality docs
#1618	update blossom-ci for security updates [skip ci]
#1562	add alluxio support
#1597	Documentation for Parquet serializer
#1611	Add in flag for integration tests to not skip required tests
#1609	Disable float round/bround by default
#1615	Add in window support for average
#1610	Limit length of spark app name in BenchmarkRunner
#1579	Support TakeOrderedAndProject
#1581	Support Decimal type for CollectLimitExec
#1591	Add support for running multiple queries in BenchmarkRunner
#1595	Fix Github documentation issue template
#1577	rename directory from spark310 to spark311
#1578	Test to track RAPIDS-side issues re SPARK-32639
#1583	fix request-action issue [skip ci]
#1555	Enable ANSI mode for CAST string to timestamp
#1531	Decimal Support for writing Parquet
#1545	Support comparing ORC data
#1570	Branch 0.4 doc cleanup
#1569	Add shim method shouldIgnorePath
#1564	Add in support for Decimal Multiply and DIV
#1561	Decimal support for add and subtract
#1560	support sum in window aggregation for decimal
#1546	Cleanup shutdown logging for UCX shuffle
#1551	RAPIDS-accelerated Hive UDFs support all types
#1543	Shuffle/transport enabled by default
#1552	Disable blackduck signature check
#1540	Handle ShuffleManager api calls when plugin is not fully initialized
#1547	Cleanup shuffle transport receive calls
#1512	Support window operations on Decimal
#1532	Support casting from decimal to decimal
#1542	Change the number of partitions to zero when a range is empty
#1506	Add --use-decimals flag to TPC-DS ConvertFiles
#1511	Remove unused Jenkinsfiles [skip ci]
#1505	Add least, greatest and eqNullSafe support for DecimalType
#1484	add doc for nsight systems bundled with cuda toolkit
#1478	Documentation for RAPIDS-accelerated Hive UDFs
#1477	Allow structs and arrays to pass through for Shuffle and Sort
#1489	Adds in some support for the array sql function
#1438	Cast from numeric types to decimal type
#1493	Moved ParquetRecordMaterializer to the shim package to follow convention
#1495	Fix merge conflict, merge branch 0.3 to branch 0.4 [skip ci]
#1472	Add an example RAPIDS-accelerated Hive UDF using native code
#1488	Rename Spark 3.1.0 shim to Spark 3.1.1 to match community
#1474	Fix link
#1476	DecimalType support for Aggregate Count
#1475	Join support for DecimalType
#1244	Support round and bround SQL functions
#1458	Add in support for struct and named_struct
#1465	DecimalType support for UnionExec and ExpandExec
#1450	Add dynamic configs for the spark-rapids IT pipelines
#1207	Spark SQL hash function using murmur3
#1457	Support reading decimal columns from parquet files on Databricks
#1455	Upgrade Scala Maven Plugin to 4.3.0
#1453	DecimalType support for IfElse and Coalesce
#1452	Support DecimalType for CaseWhen
#1444	Improve UX when running benchmarks from Spark shell
#1294	Support reading decimal columns from parquet files
#1153	Scala UDF will compile children expressions in Project
#1416	Optimize mvn dependency download scripts
#1430	Add project for testing code that requires Spark 3.1.0 or later
#1425	Add in Decimal support for abs, floor, ceil, unary - and unary +
#1427	Revert "Make the multi-threaded parquet reader the default"
#1420	Add udf jar to nightly integration tests
#1422	Log the number of concurrent gpu tasks allowed on Executor startup
#1401	Accelerate the coalescing parquet reader when reading files from multiple partitioned folders
#1413	Add config for cast float to integral types
#1313	Support spilling to disk directly via cuFile/GDS
#1411	Add udf-examples jar to databricks build
#1412	Fix a lot of tests marked with xfail for Spark 3.1.0 that no longer fail
#1414	Build merged code of HEAD and BASE branch for pre-merge [skip ci]
#1409	Add option to use decimals in tpc-ds csv to parquet conversion
#1410	Add Decimal support for In, InSet, AtLeastNNonNulls, GetArrayItem, GetStructField, and GenerateExec
#1408	Support RAPIDS-accelerated HiveGenericUDF
#1407	Update docs and tests for null CSV support
#1393	Support RAPIDS-accelerated HiveSimpleUDF
#1392	Turn on hash partitioning for decimal support
#1402	Better GPU Cast type checks
#1404	Fix branch 0.4 merge conflict
#1323	More advanced type checking and documentation
#1391	Remove extra null join filtering because cudf is fast for this now.
#1395	Fix branch-0.3 -> branch-0.4 automerge
#1382	Handle "MM[/-]dd" and "dd[/-]MM" datetime formats in UnixTimeExprMeta
#1390	Accelerated columnar to row/row to columnar for decimal
#1380	Adds in basic support for decimal sort, sum, and some shuffle
#1367	Reuse gpu expression conversion rules when checking sort order
#1349	Add canonicalization tests
#1368	Move to cudf 0.18-SNAPSHOT
#1361	Use the correct precision when reading spark columnar data.
#1273	Update docs and scripts to 0.4.0-SNAPSHOT
#1321	Refactor to stop inheriting from HashJoin
#1311	ParquetCachedBatchSerializer code cleanup
#1303	Add explicit outputOrdering for BHJ and SHJ in spark310 shim
#1299	Benchmark runner improved error handling

Release 0.3

Features


#1002	[FEA] RapidsHostColumnVectorCore should verify cudf data with respect to the expected spark type
#444	[FEA] Pluggable Cache
#1158	[FEA] Better documentation on type support
#57	[FEA] Support INT96 for parquet reads and writes
#1003	[FEA] Reduce overlap between RapidsHostColumnVector and RapidsHostColumnVectorCore
#913	[FEA] In Pluggable Cache Support CalendarInterval while creating CachedBatches
#1092	[FEA] In Pluggable Cache handle nested types having CalendarIntervalType and NullType
#670	[FEA] Support NullType
#50	[FEA] support `spark.sql.legacy.timeParserPolicy`
#1144	[FEA] Remove Databricks 3.0.0 shim layer
#1096	[FEA] Implement parquet CreateDataSourceTableAsSelectCommand
#688	[FEA] udf compiler should be auto-appended to `spark.sql.extensions`
#502	[FEA] Support Databricks 7.3 LTS Runtime
#764	[FEA] Sanity checks for cudf jar mismatch
#1018	[FEA] Log details related to GPU memory fragmentation on GPU OOM
#619	[FEA] log whether libcudf and libcudfjni were built for PTDS
#905	[FEA] create AWS EMR 3.0.1 shim
#838	[FEA] Support window count for a column
#864	[FEA] config option to enable RMM arena memory resource
#430	[FEA] Audit: Parquet Writer support for TIMESTAMP_MILLIS
#818	[FEA] Create shim layer for AWS EMR
#608	[FEA] Parquet small file optimization improve handle merge schema

Performance


#446	[FEA] Test jucx in 1.9.x branch
#1038	[FEA] Accelerate the data transfer for plan `WindowInPandasExec`
#533	[FEA] Improve PTDS performance
#849	[FEA] Have GpuColumnarBatchSerializer return GpuColumnVectorFromBuffer instances
#784	[FEA] Allow Host Spilling to be more dynamic
#627	[FEA] Further parquet reading small file improvements
#5	[FEA] Support Adaptive Execution

Bugs Fixed


#1423	[BUG] Mortgage ETL sample failed with spark.sql.adaptive enabled on AWS EMR 6.2
#1369	[BUG] TPC-DS Query Failing on EMR 6.2 with AQE
#1344	[BUG] Spark-rapids Pytests failed on Databricks cluster spark standalone mode
#1279	[BUG] TPC-DS query 2 failing with NPE
#1280	[BUG] TPC-DS query 93 failing with UnsupportedOperationException
#1308	[BUG] TPC-DS query 14a runs much slower on 0.3
#1284	[BUG] TPC-DS query 77 at scale=1TB fails with maxResultSize exceeded error
#1061	[BUG] orc_test.py is failing
#1197	[BUG] java.lang.NullPointerException when exporting delta table
#685	[BUG] In ParquetCachedBatchSerializer, serializing parquet buffers might blow up in certain cases
#1269	[BUG] GpuSubstring is not expected to be a part of a SortOrder
#1246	[BUG] Many TPC-DS benchmarks fail when writing to Parquet
#961	[BUG] ORC predicate pushdown should work with case-insensitive analysis
#962	[BUG] Loading columns from an ORC file without column names returns no data
#1245	[BUG] Code adding buffers to the spillable store should synchronize
#570	[BUG] Continue debugging OOM after ensuring device store is empty
#972	[BUG] total time metric is redundant with scan time
#1039	[BUG] UNBOUNDED window ranges on null timestamp columns produces incorrect results.
#1195	[BUG] AcceleratedColumnarToRowIterator queue empty
#1177	[BUG] leaks possible in the rapids shuffle if batches are received after the task completes
#1216	[BUG] Failure to recognize ORC file format when loaded via Hive
#898	[BUG] count reductions are failing on databricks because of lack of Complete support
#1184	[BUG] test_window_aggregate_udf_array_from_python fails on databricks 3.0.1
#1151	[BUG] Add databricks 3.0.1 shim layer for GpuWindowInPandasExec.
#1199	[BUG] No data size in Input column in Stages page from Spark UI when using Parquet as file source
#1031	[BUG] dependency info properties file contains error messages
#1149	[BUG] Scaladoc warnings in GpuDataSource
#1185	[BUG] test_hash_multiple_mode_query failing
#724	[BUG] PySpark test_broadcast_nested_loop_join_special_case intermittent failure
#1164	[BUG] ansi_cast tests are failing in 3.1.0
#1110	[BUG] Special date "now" has wrong value on GPU
#1139	[BUG] Host columnar to GPU can be very slow
#1094	[BUG] unix_timestamp on GPU returns invalid data for special dates
#1098	[BUG] unix_timestamp on GPU returns invalid data for bad input
#1082	[BUG] string to timestamp conversion fails with split
#1140	[BUG] ConcurrentModificationException error after scala test suite completes
#1073	[BUG] java.lang.RuntimeException: BinaryExpressions must override either eval or nullSafeEval
#975	[BUG] BroadcastExchangeExec fails to fall back to CPU on driver node on GCP Dataproc
#773	[BUG] Investigate high task deserialization
#1035	[BUG] TPC-DS query 90 with AQE enabled fails with doExecuteBroadcast exception
#825	[BUG] test_window_aggs_for_ranges intermittently fails
#1008	[BUG] limit function is producing inconsistent result when type is Byte, Long, Boolean and Timestamp
#996	[BUG] TPC-DS benchmark via spark-submit does not provide option to disable appending .dat to path
#1006	[BUG] Spark3.1.0 changed BasicWriteTaskStats breaks BasicColumnarWriteTaskStatsTracker
#985	[BUG] missing metric `dataSize`
#881	[BUG] cannot disable Sort by itself
#812	[BUG] Test failures for 0.2 when run with multiple executors
#925	[BUG] Range window-functions with non-timestamp order-by expressions not falling back to CPU
#852	[BUG] BenchUtils.compareResults cannot compare partitioned files when ignoreOrdering=false
#868	[BUG] Rounding error when casting timestamp to string for timestamps before 1970
#880	[BUG] doing a window operation with an orderby for a single constant crashes
#776	[BUG] Integration test fails on spark 3.1.0-SNAPSHOT
#874	[BUG] `RapidsConf.scala` has some inconsistency for `spark.rapids.sql.format.parquet.multiThreadedRead`
#860	[BUG] we need to mark columns from received shuffle buffers as `GpuColumnVectorFromBuffer`
#122	[BUG] CSV Timestamp parsing is broken for TS < 1902 and TS > 2038
#810	[BUG] UDF Integration tests fail if pandas is not installed
#746	[BUG] cudf_udf_test.py is flaky
#811	[BUG] 0.3 nightly is timing out
#574	[BUG] Fix GpuTimeSub for Spark 3.1.0

PRs


#1496	Update changelog for v0.3.0 release [skip ci]
#1473	Update documentation for 0.3 release
#1371	Start Guide for RAPIDS on AWS EMR 6.2
#1446	Update changelog for 0.3.0 release [skip ci]
#1439	when AQE enabled we fail to fix up exchanges properly and EMR
#1433	fix pandas 1.2 compatibility issue
#1424	Make the multi-threaded parquet reader the default since coalescing doesn't handle partitioned files well
#1389	Update project version to 0.3.0
#1387	Update cudf version to 0.17
#1370	[REVIEW] init changelog 0.3 [skip ci]
#1376	MetaUtils.getBatchFromMeta should return batches with GpuColumnVectorFromBuffer
#1358	auto-merge: instant merge after creation [skip ci]
#1359	Use SortOrder from shims.
#1343	Do not run UDFs when the partition is empty.
#1342	Fix and edit docs for standalone mode
#1350	fix GpuRangePartitioning canonicalization
#1281	Documentation added for testing
#1336	Fix missing post-shuffle coalesce with AQE
#1318	Fix copying GpuFileSourceScanExec node
#1337	Use UTC instead of GMT
#1307	Fallback to cpu when reading Delta log files for stats
#1310	Fix canonicalization of GpuFileSourceScanExec, GpuShuffleCoalesceExec
#1302	Add GpuSubstring handling to SortOrder canonicalization
#1265	Chunking input before writing a ParquetCachedBatch
#1278	Add a config to disable decimal types by default
#1272	Add Alias to shims
#1268	Adds in support docs for 0.3 release
#1235	Trigger reading and handling control data.
#1266	Updating Databricks getting started for 0.3 release
#1291	Increase pre-merge resource requests [skip ci]
#1275	Temporarily disable more CAST tests for Spark 3.1.0
#1264	Fix race condition in batch creation
#1260	Update UCX license info in NOTICE-binary for 1.9 and RAPIDS plugin copyright dates
#1247	Ensure column names are valid when writing benchmark query results to file
#1240	Fix loading from ORC file with no column names
#1242	Remove compatibility documentation about unsupported INT96
#1192	[REVIEW] Support GpuFilter and GpuCoalesceBatches for decimal data
#1170	Add nested type support to MetaUtils
#1194	Drop redundant total time metric from scan
#1248	At BatchedTableCompressor.finish synchronize to allow for "right-size…
#1169	Use CUDF's "UNBOUNDED" window boundaries for time-range queries.
#1204	Avoid empty batches on columnar to row conversion
#1133	Refactor batch coalesce to be based solely on batch data size
#1237	In transport, limit pending transfer requests to fit within a bounce
#1232	Move SortOrder creation to shims
#1068	Write int96 to parquet
#1193	Verify shuffle of decimal columns
#1180	Remove batches if they are received after the iterator detects that t…
#1173	Support relational operators for decimal type
#1220	Support replacing ORC format when Hive is configured
#1219	Upgrade to jucx 1.9.0
#1081	Add option to upload benchmark summary JSON file
#1217	Aggregate reductions in Complete mode should use updateExpressions
#1218	Remove obsolete HiveStringType usage
#1214	changelog update 2020-11-30. Trigger automerge check [skip ci]
#1210	Support auto-merge for branch-0.4 [skip ci]
#1202	Fix a bug with the support for java.lang.StringBuilder.append.
#1213	Skip casting StringType to TimestampType for Spark 310
#1201	Replace only window expressions on databricks.
#1208	[BUG] Fix GHSL2020-239 [skip ci]
#1205	Fix missing input bytes read metric for Parquet
#1206	Update Spark 3.1 shim for ShuffleOrigin shuffle parameter
#1196	Rename ShuffleCoalesceExec to GpuShuffleCoalesceExec
#1191	Skip window array tests for databricks.
#1183	Support for CalendarIntervalType and NullType
#1150	udf spec
#1188	Add in tests for parquet nested pruning support
#1189	Enable NullType for First and Last in 3.0.1+
#1181	Fix resource leaks in unit tests
#1186	Fix compilation and scaladoc warnings
#1187	Updated documentation for distinct count compatibility
#1182	Close buffer catalog on device manager shutdown
#1137	Let GpuWindowInPandas declare ArrayType supported.
#1176	Add in support for null type
#1174	Fix race condition in SerializeConcatHostBuffersDeserializeBatch
#1175	Fix leaks seen in shuffle tests
#1138	[REVIEW] Support decimal type for GpuProjectExec
#1162	Set job descriptions in benchmark runner
#1172	Revert "Fix race condition (#1165)"
#1060	Show partition metrics for custom shuffler reader
#1152	Add spark301db shim layer for WindowInPandas.
#1167	Nulls out the dataframe if --gc-between-runs is set
#1165	Fix race condition in SerializeConcatHostBuffersDeserializeBatch
#1163	Add in support for GetStructField
#1166	Fix the cast tests for 3.1.0+
#1159	fix bug where 'now' had same value as 'today' for timestamps
#1161	Fix nightly build pipeline failure.
#1160	Fix some performance problems with columnar to columnar conversion
#1105	[REVIEW] Change ColumnViewAccess usage to work with ColumnView
#1148	Add in tests for Maps and extend map support where possible
#1154	Mark test as xfail until we can get a fix in
#1113	Support unix_timestamp on GPU for subset of formats
#1156	Fix warning introduced in iterator suite
#1095	Dependency info
#1145	Remove support for databricks 7.0 runtime - shim spark300db
#1147	Change the assert to require for handling TIMESTAMP_MILLIS in isDateTimeRebaseNeeded
#1132	Add in basic support to read structs from parquet
#1121	Shuffle/better error handling
#1134	Support saveAsTable for writing orc and parquet
#1124	Add shim layers for GpuWindowInPandasExec.
#1131	Add in some basic support for Structs
#1127	Add in basic support for reading lists from parquet
#1129	Fix resource leaks with new shuffle optimization
#1116	Optimize normal shuffle by coalescing smaller batches on host
#1102	Auto-register UDF extension when main plugin is set
#1108	Remove integration test pipelines on NGCC
#1123	Mark Pandas udf over window tests as xfail on databricks until they can be fixed
#1120	Add in support for filtering ArrayType
#1080	Support for CalendarIntervalType and NullType for ParquetCachedSerializer
#994	Packs bounce buffers for highly partitioned shuffles
#1112	Remove bad config from pytest setup
#1107	closeOnExcept -> withResources in MetaUtils
#1104	Support lists to/from the GPU
#1106	Improve mechanism for expected exceptions in tests
#1069	Accelerate the data transfer between JVM and Python for the plan 'GpuWindowInPandasExec'
#1099	Update how we deal with type checking
#1077	Improve AQE transitions for shuffle and coalesce batches
#1097	Cleanup some instances of excess closure serialization
#1090	Fix the integration build
#1086	Speed up test performance using pytest-xdist
#1084	Avoid issues where more scalars than expected show up in an expression
#1076	[FEA] Support Databricks 7.3 LTS Runtime
#1083	Revert "Get cudf/spark dependency from the correct .m2 dir"
#1062	Get cudf/spark dependency from the correct .m2 dir
#1078	Another round of fixes for mapping of DataType to DType
#1066	More fixes for conversion to ColumnarBatch
#1029	BenchmarkRunner should produce JSON summary file even when queries fail
#1055	Fix build warnings
#1064	Use array instead of List for from(Table, DataType)
#1057	Fix empty table broadcast requiring a GPU on driver node
#1047	Sanity checks for cudf jar mismatch
#1044	Accelerated row to columnar and columnar to row transitions
#1056	Add query number to Spark app name when running benchmarks
#1054	Log total RMM allocated on GPU OOM
#1053	Remove isGpuBroadcastNestedLoopJoin from shims
#1052	Allow for GPUCoalesceBatch to deal with Map
#1051	Add simple retry for URM dependencies [skip ci]
#1046	Fix broken links
#1017	Log whether PTDS is enabled
#1040	Update to cudf 0.17-SNAPSHOT and fix tests
#1042	Fix inconsistencies in AQE support for broadcast joins
#1037	Add in support for the SQL functions Least and Greatest
#1036	Increase number of retries when waiting for databricks cluster
#1034	[BUG] To honor spark.rapids.memory.gpu.pool=NONE
#854	Arbitrary function call in UDF
#1028	Update to cudf-0.16
#1023	Add --gc-between-run flag for TPC* benchmarks.
#1001	ColumnarBatch to CachedBatch and back
#990	Parquet coalesce file reader for local filesystems
#1014	Add --append-dat flag for TPC-DS benchmark
#991	Updated GCP Dataproc Mortgage-ETL-GPU.ipynb
#886	Spark BinaryType and cast to BinaryType
#1016	Change Hash Aggregate to allow pass-through on MapType
#984	Add support for MapType in selected operators
#1012	Update for new position parameter in Spark 3.1.0 RegExpReplace
#995	Add shim for EMR 3.0.1 and EMR 3.0.1-SNAPSHOT
#998	Update benchmark automation script
#1000	Always use RAPIDS shuffle when running TPCH and Mortgage tests
#981	Change databricks build to dynamically create a cluster
#986	Fix missing dataSize metric when using RAPIDS shuffle
#914	Write InternalRow to CachedBatch
#934	Iterator to make it easier to work with a window of blocks in the RAPIDS shuffle
#992	Skip post-clean if aborted before the image build stage in pre-merge [skip ci]
#988	Change in Spark caused the 3.1.0 CI to fail
#983	clean jenkins file for premerge on NGCC
#964	Refactor TPC benchmarks to reduce duplicate code
#978	Enable scalastyle checks for udf-compiler module
#949	Fix GpuWindowExec to work with a CPU SortExec
#973	Stop reporting totalTime metric for GpuShuffleExchangeExec
#968	XFail pos_explode tests until final fix can be put in
#970	Add legacy config to clear active Spark 3.1.0 session in tests
#918	Benchmark runner script
#915	Add option to control number of partitions when converting from CSV to Parquet
#944	Fix some issues with non-determinism
#935	Add in support/tests for a window count on a column
#940	Fix closeOnExcept suppressed exception handling
#942	fix github action env setup [skip ci]
#933	Update first/last tests to avoid non-determinism and ordering differences
#931	Fix checking for nullable columns in window range query
#924	Benchmark guide update for command-line interface / spark-submit
#926	Move pandas_udf functions into the tests functions
#929	Pick a default tableId to use that is non 0 so that flatbuffers allow…
#928	Fix RapidsBufferStore NPE when no spillable buffers are available
#820	Benchmarking guide
#859	Compare partitioned files in order
#916	create new sparkContext explicitly in CPU notebook
#917	create new SparkContext in GPU notebook explicitly.
#919	Add label benchmark to performance subsection in changelog
#850	Add in basic support for lead/lag
#843	[REVIEW] Cache plugin to handle reading CachedBatch to an InternalRow
#904	Add command-line argument for benchmark result filename
#909	GCP preview version image name update
#903	update getting-started-gcp.md with new component list
#900	Turn off CollectLimitExec replacement by default
#907	remove configs from databricks that shouldn't be used by default
#893	Fix rounding error when casting timestamp to string for timestamps before 1970
#899	Mark reduction corner case tests as xfail on databricks until they can be fixed
#894	Replace whole-buffer slicing with direct refcounting
#891	Add config to dump heap on GPU OOM
#890	Clean up CoalesceBatch to use withResource
#892	Only manifest the current batch in cached block shuffle read iterator
#871	Add support for using the arena allocator
#889	Fix crash on scalar only orderby
#879	Update SpillableColumnarBatch to remove buffer from catalog on close
#888	Shrink detect scope to compile only [skip ci]
#885	[BUG] fix IT dockerfile arguments [skip ci]
#883	[BUG] fix IT dockerfile args ordering [skip ci]
#875	fix the inconsistency for `spark.rapids.sql.format.parquet.multiThreadedRead` in RapidsConf.scala
#862	Migrate nightly&integration pipelines to blossom [skip ci]
#872	Ensure that receive-side batches use GpuColumnVectorFromBuffer to avoid
#833	Add nvcomp LZ4 codec support
#870	Cleaned up tests and documentation for csv timestamp parsing
#823	Add command-line interface for TPC-* for use with spark-submit
#856	Move GpuWindowInPandasExec in shims layers
#756	Add stream-time metric
#832	Skip pandas tests if pandas cannot be found
#841	Fix a hanging issue when processing empty data.
#840	[REVIEW] Fixed failing cache tests
#848	Update task memory and disk spill metrics when buffer store spills
#851	Use contiguous table when deserializing columnar batch
#857	fix pvc scheduling issue
#853	Remove nodeAffinity from premerge pipeline
#796	Record spark plan SQL metrics to JSON when running benchmarks
#781	Add AQE unit tests
#824	Skip cudf_udf test by default
#839	First/Last reduction and cleanup of agg APIs
#827	Add Spark 3.0 EMR Shim layer
#816	[BUG] fix nightly is timing out
#782	Benchmark utility to perform diff of output from benchmark runs, allowing for precision differences
#813	Revert "Enable tests in udf_cudf_test.py"
#788	[FEA] Persist workspace data on PVC for premerge
#805	[FEA] nightly build trigger both IT on spark 300 and 301
#797	Allow host spill store to fit a buffer larger than configured max size
#807	Deploy integration-tests javadoc and sources
#777	Enable tests in udf_cudf_test.py
#790	CI: Update cudf python to 0.16 nightly
#772	Add support for empty array construction.
#783	Improved GpuArrowEvalPythonExec
#771	Various improvements to benchmarks
#763	[REVIEW] Allow CoalesceBatch to spill data that is not in active use
#727	Update cudf dependency to 0.16-SNAPSHOT
#726	parquet writer support for TIMESTAMP_MILLIS
#674	Unit test for GPU exchange re-use with AQE
#723	Update code coverage to find source files in new places
#766	Update the integration Dockerfile to reduce the image size
#762	Fixing conflicts in branch-0.3
#738	[auto-merge] branch-0.2 to branch-0.3 - resolve conflict
#722	Initial code changes to support spilling outside of shuffle
#693	Update jenkins files for 0.3
#692	Merge shims dependency to spark-3.0.1 into branch-0.3
#690	Update the version to 0.3.0-SNAPSHOT

Release 0.2

Features


#696	[FEA] run integration tests against SPARK-3.0.1
#455	[FEA] Support UCX shuffle with optimized AQE
#510	[FEA] Investigate libcudf features needed to support struct schema pruning during loads
#541	[FEA] Scala UDF: Support for null value operands
#542	[FEA] Scala UDF: Support for Date and Time
#499	[FEA] disable any kind of warnings about ExecutedCommandExec not being on the GPU
#540	[FEA] Scala UDF: Support for String replaceFirst()
#340	[FEA] widen the rendered Jekyll pages
#602	[FEA] don't release with any -SNAPSHOT dependencies
#579	[FEA] Auto-merge between branches
#515	[FEA] Write tests for AQE skewed join optimization
#452	[FEA] Update HashSortOptimizerSuite to work with AQE
#454	[FEA] Update GpuCoalesceBatchesSuite to work with AQE enabled
#354	[FEA] Spark 3.1 FileSourceScanExec adds parameter optionalNumCoalescedBuckets
#566	[FEA] Add support for StringSplit with an array index.
#524	[FEA] Add GPU specific metrics to GpuFileSourceScanExec
#494	[FEA] Add some AQE-specific tests to the PySpark test suite
#146	[FEA] Python tests should support running with Adaptive Query Execution enabled
#465	[FEA] Audit: Update script to audit multiple versions of Spark
#488	[FEA] Ability to limit total GPU memory used
#70	[FEA] Support StringSplit
#403	[FEA] Add in support for GetArrayItem
#493	[FEA] Implement shuffle optimization when AQE is enabled
#500	[FEA] Add maven profiles for testing with AQE on or off
#471	[FEA] create a formal process for updating the github-pages branch
#233	[FEA] Audit DataWritingCommandExec
#240	[FEA] Audit Api validation script follow on - Optimize StringToTypeTag
#388	[FEA] Audit WindowExec
#425	[FEA] Add tests for configs in BatchScan Readers
#453	[FEA] Update HashAggregatesSuite to work with AQE
#184	[FEA] Enable NoScalaDoc scalastyle rule
#438	[FEA] Enable StringLPad
#232	[FEA] Audit SortExec
#236	[FEA] Audit ShuffleExchangeExec
#355	[FEA] Support Multiple Spark versions in the same jar
#385	[FEA] Support RangeExec on the GPU
#317	[FEA] Write test wrapper to run SQL queries via pyspark
#235	[FEA] Audit BroadcastExchangeExec
#234	[FEA] Audit BatchScanExec
#238	[FEA] Audit ShuffledHashJoinExec
#237	[FEA] Audit BroadcastHashJoinExec
#316	[FEA] Add some basic Dataframe tests for CoalesceExec
#145	[FEA] Scala tests should support running with Adaptive Query Execution enabled
#231	[FEA] Audit ProjectExec
#229	[FEA] Audit FileSourceScanExec

Performance


#326	[DISCUSS] Shuffle read-side error handling
#601	[FEA] Optimize unnecessary sorts when replacing SortAggregate
#333	[FEA] Better handling of reading lots of small Parquet files
#511	[FEA] Connect shuffle table compression to shuffle exec metrics
#15	[FEA] Multiple threads sharing the same GPU
#272	[DOC] Getting started guide for UCX shuffle

Bugs Fixed


#780	[BUG] Inner Join dropping data with bucketed Table input
#569	[BUG] left_semi_join operation is abnormal and seriously time-consuming
#744	[BUG] TPC-DS query 6 now produces incorrect results.
#718	[BUG] GpuBroadcastHashJoinExec ArrayIndexOutOfBoundsException
#698	[BUG] batch coalesce can fail to appear between columnar shuffle and subsequent columnar operation
#658	[BUG] GpuCoalesceBatches collectTime metric can be underreported
#59	[BUG] enable tests for string literals in a select
#486	[BUG] GpuWindowExec does not implement requiredChildOrdering
#631	[BUG] Rows are dropped when AQE is enabled in some cases
#671	[BUG] Databricks hash_aggregate_test fails trying to canonicalize a WrappedAggFunction
#218	[BUG] Window function COUNT(x) includes null-values, when it shouldn't
#153	[BUG] Incorrect output from partial-only hash aggregates with multiple distincts and non-distinct functions
#656	[BUG] integration tests produce hive metadata files
#607	[BUG] Fix misleading "cannot run on GPU" warnings when AQE is enabled
#630	[BUG] GpuCustomShuffleReader metrics always show zero rows/batches output
#643	[BUG] race condition while registering a buffer and spilling at the same time
#606	[BUG] Multiple scans for same data source with TPC-DS query59 with delta format
#626	[BUG] parquet_test showing leaked memory buffer
#155	[BUG] Incorrect output from averages with filters in partial only mode
#277	[BUG] HashAggregateSuite failure when AQE is enabled
#276	[BUG] GpuCoalesceBatchSuite failure when AQE is enabled
#598	[BUG] Non-deterministic output from MapOutputTracker.getStatistics() with AQE on GPU
#192	[BUG] test_read_merge_schema fails on Databricks
#341	[BUG] Document compression formats for readers/writers
#587	[BUG] Spark3.1 changed FileScan which means our GpuScans need to be added to shim layer
#362	[BUG] Implement getReaderForRange in the RapidsShuffleManager
#528	[BUG] HashAggregateSuite "Avg Distinct with filter" no longer valid when testing against Spark 3.1.0
#416	[BUG] Fix Spark 3.1.0 integration tests
#556	[BUG] NPE when removing shuffle
#553	[BUG] GpuColumnVector build warnings from raw type access
#492	[BUG] Re-enable AQE integration tests
#275	[BUG] TpchLike query 2 fails when AQE is enabled
#508	[BUG] GpuUnion publishes metrics on the UI that are all 0
#269	Needed to add `--conf spark.driver.extraClassPath=`
#473	[BUG] PartMerge:countDistinct:sum fails sporadically
#531	[BUG] Temporary RMM workaround needs to be removed
#532	[BUG] NPE when enabling shuffle manager
#525	[BUG] GpuFilterExec reports incorrect nullability of output in some cases
#483	[BUG] Multiple scans for the same parquet data source
#382	[BUG] Spark3.1 StringFallbackSuite regexp_replace null cpu fall back test fails.
#489	[FEA] Fix Spark 3.1 GpuHashJoin since it now requires CodegenSupport
#441	[BUG] test_broadcast_nested_loop_join_special_case fails on databricks
#347	[BUG] Failed to read Parquet file generated by GPU-enabled Spark.
#433	`InSet` operator produces an error for Strings
#144	[BUG] spark.sql.legacy.parquet.datetimeRebaseModeInWrite is ignored
#323	[BUG] GpuBroadcastNestedLoopJoinExec can fail if there are no columns
#356	[BUG] Integration cache test for BroadcastNestedLoopJoin failure
#280	[BUG] Full Outer Join does not work on nullable keys
#149	[BUG] Spark driver fails to load native libs when running on node without CUDA

PRs


#826	Fix link to cudf-0.15-cuda11.jar
#815	Update documentation for Scala UDFs in 0.2 since you need two things
#802	Update 0.2 CHANGELOG
#793	Update Jenkins scripts for release
#798	Fix shims provider override config not being seen by executors
#785	Make shuffle run on CPU if we do a join where we read from bucketed table
#765	Add config to override shims provider class
#759	Add CHANGELOG for release 0.2
#758	Skip the udf test that fails periodically.
#752	Fix snapshot plugin jar version in docs
#751	Correct the channel for cudf installation
#754	Filter nulls from joins where possible to improve performance
#732	Add a timeout for RapidsShuffleIterator to prevent jobs to hang infin…
#637	Documentation changes for 0.2 release
#747	Disable udf tests that fail periodically
#745	Revert Null Join Filter
#741	Fix issue with parquet partitioned reads
#733	Remove GPU Types from github
#720	Stop removing GpuCoalesceBatches from non-AQE queries when AQE is enabled
#729	Fix collect time metric in CoalesceBatches
#640	Support running Pandas UDFs on GPUs in Python processes.
#721	Add some more checks to databricks build scripts
#714	Move spark 3.0.1-shims out of snapshot-shims
#711	fix blossom checkout repo
#709	[BUG] fix unexpected indentation issue in blossom yml
#642	Init workflow for blossom-ci
#705	Enable configuration check for cast string to timestamp
#702	Update slack channel for Jenkins builds
#701	fix checkout-ref for automerge
#695	Fix spark-3.0.1 shim to be released
#668	refactor automerge to support merge for protected branch
#687	Include the UDF compiler in the dist jar
#689	Change shims dependency to spark-3.0.1
#677	Use multi-threaded parquet read with small files
#638	Add Parquet-based cache serializer
#613	Enable UCX + AQE
#684	Enable test for literal string values in a select
#686	Remove sorts when replacing sort aggregate if possible
#675	Added TimeAdd
#645	[window] Add GpuWindowExec requiredChildOrdering
#676	fixUpJoinConsistency rule now works when AQE is enabled
#683	Fix issues with canonicalization of WrappedAggFunction
#682	Fix path to start-slave.sh script in docs
#673	Increase build timeouts on nightly and premerge builds
#648	add signoff-check use github actions
#593	Add support for isNaN and datetime related instructions in UDF compiler
#666	[window] Disable GPU for COUNT(exp) queries
#655	Implement AQE unit test for InsertAdaptiveSparkPlan
#614	Fix for aggregation with multiple distinct and non distinct functions
#657	Fix verify build after integration tests are run
#660	Add in neverReplaceExec and several rules for it
#639	BooleanType test shouldn't xfail
#652	Mark UVM config as internal until supported
#653	Move to the cudf-0.15 release
#647	Improve warnings about AQE nodes not supported on GPU
#646	Stop reporting zero metrics for GpuCustomShuffleReader
#644	Small fix for race in catalog where a buffer could get spilled while …
#623	Fix issues with canonicalization
#599	[FEA] changelog generator
#563	cudf and spark version info in artifacts
#633	Fix leak if RebaseHelper throws during Parquet read
#632	Copy function isSearchableType from Spark because signature changed in 3.0.1
#583	Add udf compiler unit tests
#617	Documentation updates for branch 0.2
#616	Add config to reserve GPU memory
#612	[REVIEW] Fix incorrect output from averages with filters in partial only mode
#609	fix minor issues with instructions for building ucx
#611	Added in profile to enable shims for SNAPSHOT releases
#595	Parquet small file reading optimization
#582	fix #579 Auto-merge between branches
#536	Add test for skewed join optimization when AQE is enabled
#603	Fix data size metric always 0 when using RAPIDS shuffle
#600	Fix calculation of string data for compressed batches
#597	Remove the xfail for parquet test_read_merge_schema on Databricks
#591	Add ucx license in NOTICE-binary
#596	Add Spark 3.0.2 to Shim layer
#594	Filter nulls from joins where possible to improve performance.
#590	Move GpuParquetScan/GpuOrcScan into Shim
#588	xfail the tpch spark 3.1.0 tests that fail
#572	Update buffer store to return compressed batches directly, add compression NVTX ranges
#558	Fix unit tests when AQE is enabled
#580	xfail the Spark 3.1.0 integration tests that fail
#565	Minor improvements to TPC-DS benchmarking code
#567	Explicitly disable AQE in one test
#571	Fix Databricks shim layer for GpuFileSourceScanExec and GpuBroadcastExchangeExec
#564	Add GPU decode time metric to scans
#562	getCatalog can be called from the driver, and can return null
#555	Fix build warnings for ColumnViewAccess
#560	Fix databricks build for AQE support
#557	Fix tests failing on Spark 3.1
#547	Add GPU metrics to GpuFileSourceScanExec
#462	Implement optimized AQE support so that exchanges run on GPU where possible
#550	Document Parquet and ORC compression support
#539	Update script to audit multiple Spark versions
#543	Add metrics to GpuUnion operator
#549	Move spark shim properties to top level pom
#497	Add UDF compiler implementations
#487	Add framework for batch compression of shuffle partitions
#544	Add in driverExtraClassPath for standalone mode docs
#546	Fix Spark 3.1.0 shim build error in GpuHashJoin
#537	Use fresh SparkSession when capturing to avoid late capture of previous query
#538	Revert "Temporary workaround for RMM initial pool size bug (#530)"
#517	Add config to limit maximum RMM pool size
#527	Add support for split and getArrayIndex
#534	Fixes bugs around GpuShuffleEnv initialization
#529	[BUG] Degenerate table metas were not getting copied to the heap
#530	Temporary workaround for RMM initial pool size bug
#526	Fix bug with nullability reporting in GpuFilterExec
#521	Fix typo with databricks shim classname SparkShimServiceProvider
#522	Use SQLConf instead of SparkConf when looking up SQL configs
#518	Fix init order issue in GpuShuffleEnv when RAPIDS shuffle configured
#514	Added clarification of RegExpReplace, DateDiff, made descriptive text consistent
#506	Add in basic support for running tpcds like queries
#504	Add ability to ignore tests depending on spark shim version
#503	Remove unused async buffer spill support
#501	disable codegen in 3.1 shim for hash join
#466	Optimize and fix Api validation script
#481	Codeowners
#439	Check a PR has been committed using git signoff
#319	Update partitioning logic in ShuffledBatchRDD
#491	Temporarily ignore AQE integration tests
#490	Fix Spark 3.1.0 build for HashJoin changes
#482	Prevent bad practice in python tests
#485	Show plan in assertion message if test fails
#480	Fix link from README to getting-started.md
#448	Preliminary support for keeping broadcast exchanges on GPU when AQE is enabled
#478	Fall back to CPU for binary as string in parquet
#477	Fix special case joins in broadcast nested loop join
#469	Update HashAggregateSuite to work with AQE
#475	Udf compiler pom followup
#434	Add UDF compiler skeleton
#474	Re-enable noscaladoc check
#461	Fix comments style to pass scala style check
#468	fix broken link
#456	Add closeOnExcept to clean up code that closes resources only on exceptions
#464	Turn off noscaladoc rule until codebase is fixed
#449	Enforce NoScalaDoc rule in scalastyle checks
#450	Enable scalastyle for shuffle plugin
#451	Databricks remove unneeded files and fix build to not fail on rm when file missing
#442	Shim layer support for Spark 3.0.0 Databricks
#447	Add scalastyle plugin to shim module
#426	Update BufferMeta to support multiple codec buffers per table
#440	Run mortgage test both with AQE on and off
#445	Added in StringRPad and StringLPad
#422	Documentation updates
#437	Fix bug with InSet and Strings
#435	Add in checks for Parquet LEGACY date/time rebase
#432	Fix batch use-after-close in partitioning, shuffle env init
#423	Fix duplicates includes in assembly jar
#418	CI Add unit tests running for Spark 3.0.1
#421	Make it easier to run TPCxBB benchmarks from spark shell
#413	Fix download link
#414	Shim Layer to support multiple Spark versions
#406	Update cast handling to deal with new libcudf casting limitations
#405	Change slave->worker
#395	Databricks doc updates
#401	Extended the FAQ
#398	Add tests for GpuPartition
#352	Change spark tgz package name
#397	Fix small bug in ShuffleBufferCatalog.hasActiveShuffle
#286	[REVIEW] Updated join tests for cache
#393	Contributor license agreement
#389	Added in support for RangeExec
#390	Ucx getting started
#391	Hide slack channel in Jenkins scripts
#387	Remove the term whitelist
#365	[REVIEW] Timesub tests
#383	Test utility to compare SQL query results between CPU and GPU
#380	Fix databricks notebook link
#378	Added in FAQ and fixed spelling
#377	Update heading in configs.md
#373	Modifying branch name to conform with rapidsai branch name change
#376	Add our session extension correctly if there are other extensions configured
#374	Fix rat issue for notebooks
#364	Update Databricks patch for changes to GpuSortMergeJoin
#371	fix typo and use regional bucket per GCP's update
#359	Karthik changes
#353	Fix broadcast nested loop join for the no column case
#313	Additional tests for broadcast hash join
#342	Implement build-side rules for shuffle hash join
#349	Updated join code to treat null equality properly
#335	Integration tests on spark 3.0.1-SNAPSHOT & 3.1.0-SNAPSHOT
#346	Update the Title Header for Fine Tuning
#344	Fix small typo in readme
#331	Adds iterator and client unit tests, and prepares for more fetch failure handling
#337	Fix Scala compile phase to allow Java classes referencing Scala classes
#332	Match GPU overwritten functions with SQL functions from FunctionRegistry
#339	Fix databricks build
#338	Move GpuPartitioning to a separate file
#310	Update release Jenkinsfile for Databricks
#330	Hide private info in Jenkins scripts
#324	Add in basic support for GpuCartesianProductExec
#328	Enable slack notification for Databricks build
#321	update databricks patch for GpuBroadcastNestedLoopJoinExec
#322	Add oss.sonatype.org to download the cudf jar
#320	Don't mount passwd/group to the container
#258	Enable running TPCH tests with AQE enabled
#318	Build docker image with Dockerfile
#309	Update databricks patch to latest changes
#312	Trigger branch-0.2 integration test
#307	[Jenkins] Update the release script and Jenkinsfile
#304	[DOC][Minor] Fix typo in spark config name.
#303	Update compatibility doc for -0.0 issues
#301	Add info about branches in README.md
#296	Added in basic support for broadcast nested loop join
#297	Databricks CI improvements and support runtime env parameter to xfail certain tests
#292	Move artifacts version in version-def.sh
#254	Cleanup QA tests
#289	Clean up GpuCollectLimitMeta and add in metrics
#287	Add in support for right join and fix issues build right
#273	Added releases to the README.md
#285	modify run_pyspark_from_build.sh to be bash 3 friendly
#281	Add in support for Full Outer Join on non-null keys
#274	Add RapidsDiskStore tests
#259	Add RapidsHostMemoryStore tests
#282	Update Databricks patch for 0.2 branch
#261	Add conditional xfail test for DISTINCT aggregates with NaN
#263	More time ops
#256	Remove special cases for contains, startsWith, and endsWith
#253	Remove GpuAttributeReference and GpuSortOrder
#271	Update the versions for 0.2.0 properly for the databricks build
#162	Integration tests for corner cases in window functions.
#264	Add a local mvn repo for nightly pipeline
#262	Refer to branch-0.2
#255	Revert change to make dependencies of shaded jar optional
#257	Fix link to RAPIDS cudf in index.md
#252	Update to 0.2.0-SNAPSHOT and cudf-0.15-SNAPSHOT

Release 0.1

Features


#74	[FEA] Support ToUnixTimestamp
#21	[FEA] NormalizeNansAndZeros
#105	[FEA] integration tests for equi-joins

Bugs Fixed


#116	[BUG] calling replace with a NULL throws an exception
#168	[BUG] GpuUnitTests Date tests leak column vectors
#209	[BUG] Developers section in pom need to be updated
#204	[BUG] Code coverage docs are out of date
#154	[BUG] Incorrect output from partial-only averages with nulls
#61	[BUG] Cannot disable Parquet, ORC, CSV reading when using FileSourceScanExec

PRs


#249	Compatability -> Compatibility
#247	Add index.md for default doc page, fix table formatting for configs
#241	Let default branch to master per the release rule
#177	Fixed leaks in unit test and use ColumnarBatch for testing
#243	Jenkins file for Databricks release
#225	Make internal project dependencies optional for shaded artifact
#242	Add site pages
#221	Databricks Build Support
#215	Remove CudfColumnVector
#213	Add RapidsDeviceMemoryStore tests
#214	[REVIEW] Test failure to pass Attribute as GpuAttribute
#211	Add project leads to pom developer list
#210	Updated coverage docs
#195	Support public release for plugin jar
#208	Remove unneeded comment from pom.xml
#191	WindowExec handle different spark distributions
#181	Remove INCOMPAT for NormalizeNanAndZero, KnownFloatingPointNormalized
#196	Update Spark dependency to the released 3.0.0 artifacts
#206	Change groupID to 'com.nvidia' in IT scripts
#202	Fixed issue for contains when searching for an empty string
#201	Fix name of scan
#200	Fix issue with GpuAttributeReference not overriding references
#197	Fix metrics for writes
#186	Fixed issue with nullability on concat
#193	Add RapidsBufferCatalog tests
#188	rebrand to com.nvidia instead of ai.rapids
#189	Handle AggregateExpression having resultIds parameter instead of a single resultId
#190	FileSourceScanExec can have logicalRelation parameter on some distributions
#185	Update type of parameter of GpuExpandExec to make it consistent
#172	Merge qa test to integration test
#180	Add MetaUtils unit tests
#171	Cleanup scaladoc warnings about missing links
#176	Updated join tests to cover more data.
#169	Remove dependency on shaded Spark artifact
#174	Added in fallback tests
#165	Move input metadata tests to pyspark
#173	Fix setting local mode for tests
#160	Integration tests for normalizing NaN/zeroes.
#163	Ignore the order locally for repartition tests
#157	Add partial and final only hash aggregate tests and fix nulls corner case for Average
#159	Add integration tests for joins
#158	Orc merge schema fallback and FileScan format configs
#164	Fix compiler warnings
#152	Moved cudf to 0.14 for CI
#151	Switch CICD pipelines to Github
