Refactor core to converge Qualification and Profiling Tools implementation #980

Closed
7 tasks done
amahussein opened this issue Apr 30, 2024 · 1 comment
Labels: core_tools (Scope the core module (scala)), feature request (New feature or request)

Comments

Collaborator

amahussein commented Apr 30, 2024

Describe the bug

Currently, the Qualification and Profiling tools use different data structures and produce different reporting output.
This issue tracks the refactoring work to make the two implementations share common code.

This is an important step toward adding heuristics as a postProcess phase when analyzing the event logs.
Doing so has several benefits:

  • We won't need to run both tools to feed the estimation_model. The input to the estimation_model should be common between the two tools.
  • Better code maintainability. Fixing or adding a feature/improvement will be done once instead of twice.
  • Code reusability
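
As a rough illustration of the shared-core idea above, the sketch below shows one way both tools could consume a single analysis result and differ only in how they report it. Every name here (`CommonAppInfo`, `StageSummary`, `AppAnalysis`, `ToolReport`, and the two report objects) is hypothetical and not taken from the spark-rapids-tools code base; the fields and CSV layouts are assumptions made only for the example.

```scala
// Hypothetical sketch: none of these names or fields come from spark-rapids-tools.

// Application-level information that both tools would need for each event log.
final case class CommonAppInfo(
    appId: String,
    appName: String,
    startTimeMs: Long,
    endTimeMs: Option[Long]) {
  // Duration is only defined when the application has an end time.
  def durationMs: Option[Long] = endTimeMs.map(_ - startTimeMs)
}

// Per-stage metrics, aggregated once by a shared analysis step.
final case class StageSummary(stageId: Int, numTasks: Int, executorRunTimeMs: Long)

// The single result object that both tools read from.
final case class AppAnalysis(info: CommonAppInfo, stages: Seq[StageSummary]) {
  def totalRunTimeMs: Long = stages.map(_.executorRunTimeMs).sum
}

// Each tool becomes a different view over the same analysis result.
trait ToolReport {
  def render(analysis: AppAnalysis): Seq[String]
}

// Assumed layout: one summary row per application.
object QualificationReport extends ToolReport {
  def render(analysis: AppAnalysis): Seq[String] =
    Seq(s"${analysis.info.appId},${analysis.info.appName},${analysis.totalRunTimeMs}")
}

// Assumed layout: one row per stage.
object ProfilingReport extends ToolReport {
  def render(analysis: AppAnalysis): Seq[String] =
    analysis.stages.map { s =>
      s"${analysis.info.appId},${s.stageId},${s.numTasks},${s.executorRunTimeMs}"
    }
}

object SharedCoreSketch {
  def main(args: Array[String]): Unit = {
    val analysis = AppAnalysis(
      CommonAppInfo("app-1", "etl", startTimeMs = 0L, endTimeMs = Some(100L)),
      Seq(StageSummary(0, 4, 30L), StageSummary(1, 2, 20L)))
    // The event log is analyzed once; both reports reuse the same result.
    QualificationReport.render(analysis).foreach(println)
    ProfilingReport.render(analysis).foreach(println)
  }
}
```

With a layout like this, a fix to the aggregation logic in `AppAnalysis` would apply to both reports at once, and a downstream consumer such as the estimation_model could take `AppAnalysis` as its only input.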

Tasks

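  1. Labels: bug, core_tools. Assignee: amahussein
  2. Labels: core_tools, feature request. Assignee: amahussein
  3. Labels: bug, core_tools. Assignee: amahussein
  4. Labels: core_tools. Assignee: amahussein
  5. Labels: bug, core_tools. Assignee: amahussein
  6. Labels: core_tools, feature request. Assignee: parthosa. Subtasks: 2 of 2
  7. Labels: core_tools, feature request. Assignee: parthosa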
amahussein added the feature request and core_tools labels on Apr 30, 2024
amahussein self-assigned this on Apr 30, 2024
@parthosa
Collaborator

This is great. This refactoring was much needed to improve maintainability.

amahussein added a commit to amahussein/spark-rapids-tools that referenced this issue May 1, 2024
Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

Contributes to NVIDIA#980

- this code change aims at using common logic to create and update the application
  info inside Tools.
amahussein added a commit that referenced this issue May 2, 2024
* Refactor AppBase to use common AppMetaData between Q/P tools

Contributes to #980

---------

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>
amahussein added a commit to amahussein/spark-rapids-tools that referenced this issue May 7, 2024
Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

Contributes to NVIDIA#980
amahussein added a commit that referenced this issue May 16, 2024
* Refactor TaskEnd to be accessible by Q/P tools

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

Contributes to #980

* Add Stage Accumulables to the accumulable objects for Q tool

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* Refactor the code to allow Qual tool to generate same CSV files as Prof

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* Cleaning up some naming conventions

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* Fix typo in job/stage qual agg metrics

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* Remove redundant sort function in skewshuffle analyzer

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

* Fix typos and remove unused classes from ProfileClassWarehouse

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

---------

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>
3 participants