Releases: NVIDIA/spark-rapids-tools

v24.02.0

24 Feb 20:42

Packages

Changes

User Tools

  • Fix missing config file for Dataproc GKE (#778)
  • [FEA] Qualification user_tools runs AutoTuner by default (#771)
  • [BUG] Fix databricks-aws user profiling tool error with --gpu_cluster argument (#707)

Core

  • [FEA] Qualification tool should mark WriteIntoDeltaCommand as supported (#801)
  • Qualification tool should mark SubqueryExec as IgnoreNoPerf (#798)
  • Generate cluster information from event logs in Qualification tool (#789)
  • Sync up supported ops for 24.02 plugin release (#796)
  • Qualification should mark empty2null as supported (#791)
  • Incorrect parsing of aggregates in DB queries (#790)
  • Qualification should mark WriteFiles as supported (#784)
  • Introduce GpuDevice abstraction and refactor AutoTuner (#740)
  • Consolidate unsupportedOperators into a single view (#766)
  • Speedup generator script fails after adding runtime_properties (#776)
  • Tools fail on DB10.4 clusters with IllegalArgException (#768)
  • Fix SparkPlanGraphCluster constructor for DB Platforms (#765)
  • Amendment to PR-763 (#764)
  • Fix SQLPlanMetric constructor for DB Platforms (#763)
  • Fix node constructor for DB platforms (#761)
  • Add penalty for stages with UDFs (#757)
  • Add support for appendDataExecV1 and overwriteByExprExecV1 (#756)
  • Qualification fails to detect sortMergeJoin with arguments (#754)
  • Fix Qualification crash during aggregation of stats (#753)
  • [FEA] Extend the list of operators to be ignored in Qualification (#745)
  • Remove ReusedSubquery from SparkPlanGraph construction (#741)
  • Update unsupported operator csv file's app duration column (#748)
  • [FEA] Qualification tool triggers the AutoTuner module (#739)
  • Disable support of GetJsonObject in Qualification tool (#737)
  • [FEA] AutoTuner warns that non-utf8 may not support some GPU expressions (#736)
  • [FEA] AutoTuner should not skip non-gpu eventlogs (#728)

Miscellaneous

  • Add auto-copyright for precommits (#732)

v23.12.3

12 Jan 21:20

Packages

Changes

Core

  • Add support for HiveTableScan and InsertIntoHive text-format (#723)
  • Fix compilation error with JDK11 (#720)
  • Generate an output file with runtime and build information (#705)
  • AutoTuner should poll maven-meta to retrieve the latest jar version (#711)
  • Profiling tool throws NPE when appInfo is null and unchecked (#640)
  • Add support for parse_url host and protocol (#708)
  • [FEA] Profiling tool auto-tuner should consider spark.databricks.adaptive.autoOptimizeShuffle.enabled (#710)
  • [FEA] Profiler autotuner should only specify standard Spark versions for shuffle manager setting (#662)

Miscellaneous

  • [FEA] Enable AQE related recommendations in Profiler Auto-tuner (#688)

v23.12.2

27 Dec 23:34

Packages

Changes

User Tools

  • Polling maven-metadata.xml to pull the latest tools jar (#703)

Core

  • Update pom to fail on warnings (#701)

v23.12.1

23 Dec 19:24

v23.12.0

20 Dec 18:55

Packages

Changes

User Tools

  • Fix user qualification tool runtime error in get_platform_name for onprem platform (#684)
  • [FEA] User tool should pass --platform option/argument to Profiling tool (#679)
  • Fix incorrect processing of short flags for user tools cli (#677)
  • Updating new CLI name from ascli to spark_rapids (#673)
  • Bump pyarrow version (#664)
  • Improve new CLI testing ensuring complete coverage of arguments cases (#652)

Core

  • Qualification tool: Add more information for unsupported operators (#680)
  • Sync Execs and Expressions from spark-rapids resources (#691)
  • Support parsing of inprogress eventlogs (#686)
  • Enable features via config that are off by default in the profiler AutoTuner (#668)
  • Fix platform names as string constants and reduce redundancy in unit tests (#667)
  • Unified platform handling and fetching of operator score files (#661)
  • Qualification tool: Ignore some of the unsupported Execs from output (#665)

Miscellaneous

  • Add markdown link checker (#672)

v23.10.1

16 Nov 16:49

Packages

Changes

User Tools

  • Updating tools docs to remove dead links and profiling docs to not require cluster/worker info (#651)
  • Updating autotuner to always generate recommendations, even without cluster info (#650)
  • Updating dataproc container cost to be multiplied by number of cores (#648)
  • [BUG] Support autoscaling clusters for user qualification tool on Databricks platforms (#647)
  • Support extra arguments in new user tools CLI (#646)
  • Improve logs with user tools and jar version details (#642)

Core

  • Profiling tool: Add support for driver log as input to generate unsupported operators report (#654)
  • Updating tools docs to remove dead links and profiling docs to not require cluster/worker info (#651)
  • Updating autotuner to always generate recommendations, even without cluster info (#650)
  • Qualification tool: Enhance mapping of Execs to stages (#634)

v23.10.0

30 Oct 18:54

Packages

Changes

User Tools

  • Fix system command processing during logging in user tools (#633)
  • Fix spinner animation blocking user input in diagnostic tool (#631)
  • Enable Dynamic 'Zone' Configuration for Dataproc User Tools (#629)

Core

  • Profiling tool: Update readSchema string parser (#635)
  • [FEA] Fix empty softwareProperties field in worker_info.yaml file for profiling tool (#623)

v23.08.2

19 Oct 17:13

Packages

Changes

User Tools

  • Add unit tests for Dataproc GKE with mock GKE cluster (#618)
  • Add support in user tools for running qualification on Dataproc GKE (#612)
  • [BUG] Update user tools to use latest Databricks CLI version 0.200+ (#614)
  • Add argprocessor unit test for checking error messages for onprem with no eventlogs (#605)
  • Updating docs for custom speedup factors for scale factor (#604)
  • [FEA] Add qualification user tool options to support external pricing (#595)
  • [DOC] Add documentation for qualification user tool pricing discount options (#596)
  • [FEA] Add user qualification tool options for specifying pricing discounts for CPU or GPU cluster, or both (#583)
  • Add diagnostic capabilities for Databricks (AWS/Azure) environments (#533)
  • Add verbose option to the CLI (#550)
  • [FEA] Remove URLs from pydantic error messages (#560)
  • Rename and change pyrapids to spark_rapids_tools (#570)
  • Fix sdk_monitor exception thrown by abfs protocol (#569)

Core

  • Generating speedup factors for Dataproc GKE L4 GPU instances (#617)
  • Qualification tool: Add penalty for row conversions (#471)
  • Add support in core tools for running qualification on Dataproc GKE (#613)
  • Sync up remaining updated execs and exprs from rapids-plugin (#602)
  • Adding speedup factors for Dataproc Serverless and docs fix (#603)
  • Add xxhash64 function as supported in qualification tools (#597)
  • Fix ProjectExecParser to include digits in expression names (#592)
  • [FEA] Add json_tuple function as supported in qualification tool (#589)
  • [FEA] Add flatten function as supported in qualification tool (#587)
  • [FEA] Sync up conv function with rapids-plugin resources (#573)

Miscellaneous

  • Bump urllib3 from 1.26.17 to 1.26.18 in /data_validation (#622)
  • Bump urllib3 from 1.26.14 to 1.26.17 in /data_validation (#606)
  • Ignore pylint errors to fix python tests (#611)

v23.08.1

12 Sep 19:57

Packages

Changes

User Tools

  • [DOC] Fix help command in documentation (#540)
  • Implement a cross-CSP storage driver (#485)
  • Build tools package as single artifact for restricted environments (#516)

Core

  • Remove memoryOverhead recommendations for Standalone Spark (#557)
  • [FEA] Add support for TIMESTAMP functions (#549)
  • Fix handling of current_database and ArrayBuffer (#556)
  • Add translate as supported expression in qualification tools (#546)
  • Adding TakeOrderedAndProject and BroadcastNestedLoopJoin, removing Project from speedup generation (#548)
  • Qualification should treat promote_precision as supported (#545)
  • Improve tool error message for files with text extensions (#544)
  • Improve parsing of aggregate expressions (#535)
  • Bump default build to use Spark-333 (#537)
  • Improve AutoTuner plugin recommendation for Fat mode (#543)
  • Updating speedup generation for more execs from NDS + validation script (#530)
  • [FEA] Reset speedup factors for qualification tool in EMR 6.12 environments (#529)
  • Add min, median and max columns to AccumProfileResults (#522)
  • [FEA] Reset speedup factors for qualification tool in Databricks 12.2 environments (#524)
  • Filter parser should check ignored-functions (#520)
  • Update speedup factors for qualification tool in Dataproc 2.1 environments (#509)

Miscellaneous

  • Changing max_value to total based on profiler core changes (#555)
  • Add platform encoding to plugins defined in pom (#526)

v23.08.0

25 Aug 17:08

Packages

Changes

User Tools

  • Support offline execution of user tools in restricted environments (#497)
  • Handle deprecation errors in python packaging (#513)
  • Add profiling support for EMR in user tools (#500)

Core

  • Fix unit-tests for Spark-340 and Add spark-versions to gh-workflow (#503)

Miscellaneous

  • Fix gh-workflow for Python unit-tests (#505)
  • Refactoring the speedup factor generation to support WholeStageCodegen parsing and environment defaults (#493)
  • Try to fix push issue in release action [skip ci] (#495)
  • Revert "Push to protected branch using third-party action (#492)" (#494)
  • Push to protected branch using third-party action (#492)
  • Add secrets in the release.yml (#491)
  • Add sign-off and token in release workflow (#490)