Refactor ProfileResult classes to implement new interface design and add CSV output to Qual Tool #1043

Merged

parthosa
Collaborator

Fixes #1041.

#1000 introduced a new interface design for calculating ProfileResult types. This enabled the Qualification tool to generate these results as part of the raw_metrics folder.

This PR refactors the calculation of the remaining ProfileResult types to follow the same design pattern. These files are needed by the estimation model. This is a step toward using only the qualification tool for the estimation model.

Files generated by Qual Tool

  • application_information.csv
  • application_log_path_mapping.csv
  • data_source_information.csv
  • Properties files:
    • spark_properties.csv
    • system_properties.csv
    • spark_rapids_parameters_set_explicitly.csv

Changes:

Java/Core

  • Refactored CollectInformation: Moved methods into the new framework ViewableTrait[R <: ProfileResult].
  • Created traits and objects for each ProfileResult case.
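
For readers unfamiliar with the pattern, here is a minimal sketch of what "one trait and one object per ProfileResult case" could look like. Only `ViewableTrait[R <: ProfileResult]` and `ProfileResult` are names from this PR; the members (`label`, `viewForApp`, `viewForApps`, `sortView`), the `App` model, the result case classes, and `CsvRenderer` are hypothetical stand-ins and may not match the actual code in the repository.

```scala
// Hypothetical, simplified stand-ins. The real ProfileResult and ViewableTrait
// in the tools repo may have different members and signatures.
trait ProfileResult {
  def outputHeaders: Seq[String]
  def toCsvRow: Seq[String]
}

// Toy application model used as the input of every view (assumption).
case class App(id: String, name: String, sparkProps: Map[String, String])

case class AppInfoResult(appId: String, appName: String) extends ProfileResult {
  override def outputHeaders: Seq[String] = Seq("appId", "appName")
  override def toCsvRow: Seq[String] = Seq(appId, appName)
}

case class PropertyResult(key: String, value: String) extends ProfileResult {
  override def outputHeaders: Seq[String] = Seq("key", "value")
  override def toCsvRow: Seq[String] = Seq(key, value)
}

// One view per result type: how to build the rows and how to order them.
trait ViewableTrait[R <: ProfileResult] {
  def label: String
  def viewForApp(app: App): Seq[R]
  def sortView(rows: Seq[R]): Seq[R] = rows
  def viewForApps(apps: Seq[App]): Seq[R] = sortView(apps.flatMap(a => viewForApp(a)))
}

object AppInformationView extends ViewableTrait[AppInfoResult] {
  override val label: String = "application_information"
  override def viewForApp(app: App): Seq[AppInfoResult] =
    Seq(AppInfoResult(app.id, app.name))
  override def sortView(rows: Seq[AppInfoResult]): Seq[AppInfoResult] =
    rows.sortBy(_.appId)
}

object SparkPropertiesView extends ViewableTrait[PropertyResult] {
  override val label: String = "spark_properties"
  override def viewForApp(app: App): Seq[PropertyResult] =
    app.sparkProps.toSeq.map { case (k, v) => PropertyResult(k, v) }
  override def sortView(rows: Seq[PropertyResult]): Seq[PropertyResult] =
    rows.sortBy(_.key)
}

// Any caller (Profiler or Qual Tool) can render a view without knowing its type.
// Note: values are joined naively, with no CSV quoting or escaping.
object CsvRenderer {
  def render[R <: ProfileResult](rows: Seq[R]): String = rows.headOption match {
    case None => ""
    case Some(first) =>
      (first.outputHeaders +: rows.map(_.toCsvRow)).map(_.mkString(",")).mkString("\n")
  }
}
```

The point of the type bound is that the logic for each result type lives in one object, so a second tool can reuse it by calling the view rather than duplicating the collection code.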

Testing:

  • Manually tested JAR on event logs to verify Qual Tool generates these files.
  • Profiling unit tests are unaffected as this is an internal refactor.
  • We do not have unit tests for the raw_metrics generated by Qual Tool. We can create a follow-up issue for this.
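
Until such tests exist, the manual check could be scripted along these lines. This is only a sketch: the file names come from the list in this description, but `RawMetricsCheck`, `missingFiles`, and the idea that all six files sit directly in one per-application directory are assumptions, not the tool's documented layout.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of the tools repo.
object RawMetricsCheck {
  // File names taken from the "Files generated by Qual Tool" list.
  val expectedFiles: Seq[String] = Seq(
    "application_information.csv",
    "application_log_path_mapping.csv",
    "data_source_information.csv",
    "spark_properties.csv",
    "system_properties.csv",
    "spark_rapids_parameters_set_explicitly.csv")

  /** Returns the expected file names that are not regular files under `appDir`. */
  def missingFiles(appDir: Path): Seq[String] =
    expectedFiles.filterNot(name => Files.isRegularFile(appDir.resolve(name)))
}
```

An empty result from `missingFiles` for a given directory means every expected CSV is present; it says nothing about whether the contents are correct.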

… writers

Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@parthosa parthosa self-assigned this May 29, 2024
@parthosa parthosa added the core_tools Scope the core module (scala) label May 29, 2024
@parthosa parthosa marked this pull request as ready for review May 29, 2024 14:52
Collaborator

@amahussein amahussein left a comment

Thanks @parthosa !
LGTM

@amahussein amahussein merged commit 4255330 into NVIDIA:dev May 29, 2024
16 checks passed
@amahussein amahussein deleted the spark-rapids-tools-1041-add-missing-csvs branch May 29, 2024 19:28
@amahussein amahussein restored the spark-rapids-tools-1041-add-missing-csvs branch May 29, 2024 19:28
@parthosa parthosa deleted the spark-rapids-tools-1041-add-missing-csvs branch May 30, 2024 23:21
Labels
core_tools Scope the core module (scala)
Development

Successfully merging this pull request may close these issues.

[FEA] Generate all CSVs from Profiler in Qualification
2 participants