Fix calculation of unsupported operators stage duration percentage #1006
Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
user_tools/src/spark_rapids_tools/tools/unsupported_ops_stage_duration.py
This reverts commit c262009.
# Conflicts: # user_tools/src/spark_rapids_pytools/resources/qualification-conf.yaml
Thanks @parthosa
I was not expecting a change on the Scala side.
Let's sync offline to better understand the recent changes.
core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/QualOutputWriter.scala
@parthosa Just double checking. |
# Conflicts: # core/src/test/resources/QualificationExpectations/jdbc_expectation.csv # core/src/test/resources/QualificationExpectations/read_dsv1_expectation.csv # core/src/test/resources/QualificationExpectations/read_dsv2_expectation.csv # core/src/test/resources/QualificationExpectations/write_format_expectation.csv
@amahussein Renamed the column to |
I guess it turned out to be more painful than originally thought :)
Thanks @parthosa !
LGTME.
Fixes #1003. This PR fixes the unsupported stage duration calculation by using `SQL Stage Durations Sum` as denominator instead of `App Duration`. Additionally, handle the case when event logs do not have `App Name`.

Changes:
- `SQL Stage Durations Sum` in `rapids_4_spark_qualification_output.csv`.
- `unsupported stage duration / sql stage durations sum`.

Design
- `Total Stage Duration` in the main `rapids_4_spark_qualification_output.csv` instead of `rapids_4_spark_qualification_output_unsupportedOperators.csv` since the second file might not have entries for all apps (e.g. apps without any unsupported operators). To handle this we would have to add more `fillna()`, `division by NaN` etc.

Test
CMD (need to specify dev JAR):
Output
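
For readers following the design note, here is a rough pandas sketch of the percentage calculation described in this PR: the unsupported stage duration is divided by the SQL stage durations sum taken from the main output, so apps with no unsupported operators still get a row. This is not the code from `unsupported_ops_stage_duration.py`. The function name, the `App ID` and `Stage Duration` columns, and the output column name are assumptions made for illustration; only `SQL Stage Durations Sum` comes from the description above.

```python
import pandas as pd


def unsupported_stage_duration_percent(apps: pd.DataFrame,
                                       unsupported_ops: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: percentage of SQL stage time spent in unsupported stages.

    `apps` stands in for the main qualification output (one row per app) and
    `unsupported_ops` for the unsupported operators output (zero or more rows per app).
    Column names other than "SQL Stage Durations Sum" are assumed.
    """
    # Total unsupported stage duration per app.
    per_app = (
        unsupported_ops.groupby("App ID", as_index=False)["Stage Duration"]
        .sum()
        .rename(columns={"Stage Duration": "Unsupported Stage Duration"})
    )
    # Left join on the main output keeps apps without any unsupported operators.
    merged = apps.merge(per_app, on="App ID", how="left")
    merged["Unsupported Stage Duration"] = merged["Unsupported Stage Duration"].fillna(0)

    # Mask non-positive denominators so the division yields NaN rather than inf,
    # then report those apps as 0 percent.
    denom = merged["SQL Stage Durations Sum"]
    merged["Unsupported Stage Duration Percent"] = (
        merged["Unsupported Stage Duration"] * 100.0 / denom.where(denom > 0)
    ).fillna(0.0)
    return merged
```

Using the main output as the left side of the join is what removes most of the extra `fillna()` handling mentioned in the design note: only the joined duration column and the zero-denominator case need a default.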