-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix potential problems and AQE updates in Qual tool #1021
Conversation
Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me> Fixes NVIDIA#1019
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/QualSQLPlanAnalyzer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/rapids/tool/AppBase.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/rapids/tool/ToolUtils.scala
Outdated
Show resolved
Hide resolved
@@ -196,7 +196,7 @@ class ApplicationInfo( | |||
processEvents() | |||
|
|||
// Process SQL Plan Metrics after all events are processed | |||
val planMetricProcessor: AppSQLPlanAnalyzer = AppSQLPlanAnalyzer.processSQLPlan(this) | |||
val planMetricProcessor: AppSQLPlanAnalyzer = AppSQLPlanAnalyzer(this) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is appIndex not passed in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the code to pass appIndex to make it more readable.
The reason it was not passing appIndex before is that it is handled by AppSQLPlanAnalyzer.apply()
.
val sqlAnalyzer = app match {
case qApp: QualificationAppInfo =>
new QualSQLPlanAnalyzer(qApp, appIndex)
case pApp: ApplicationInfo =>
new AppSQLPlanAnalyzer(pApp, pApp.index)
}
core/src/main/scala/org/apache/spark/sql/rapids/tool/qualification/QualificationAppInfo.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tgravescs
I addressed the comments.
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/QualSQLPlanAnalyzer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/rapids/tool/ToolUtils.scala
Outdated
Show resolved
Hide resolved
@@ -196,7 +196,7 @@ class ApplicationInfo( | |||
processEvents() | |||
|
|||
// Process SQL Plan Metrics after all events are processed | |||
val planMetricProcessor: AppSQLPlanAnalyzer = AppSQLPlanAnalyzer.processSQLPlan(this) | |||
val planMetricProcessor: AppSQLPlanAnalyzer = AppSQLPlanAnalyzer(this) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the code to pass appIndex to make it more readable.
The reason it was not passing appIndex before is that it is handled by AppSQLPlanAnalyzer.apply()
.
val sqlAnalyzer = app match {
case qApp: QualificationAppInfo =>
new QualSQLPlanAnalyzer(qApp, appIndex)
case pApp: ApplicationInfo =>
new AppSQLPlanAnalyzer(pApp, pApp.index)
}
core/src/main/scala/org/apache/spark/sql/rapids/tool/qualification/QualificationAppInfo.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/rapids/tool/AppBase.scala
Outdated
Show resolved
Hide resolved
private val sqlPlanNodeIdToStageIds: mutable.HashMap[(Long, Long), Set[Int]] = | ||
mutable.HashMap.empty[(Long, Long), Set[Int]] | ||
// A map between (SQL ID, Node ID) and the set of stage IDs | ||
// TODO: The Qualification should use this map instead of building a new set for each exec. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have any performance impact and do we have a tracking issue or need one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is indeed a memory overhead because of allocating more objects in memory. This overhead was introduced in #1000; but that was the price to combine the 2 tools.
Depending on priorities, we will address the redundant information stored in QualificationAppInfo
- We do have [FEA] Improve performance of core module #367 as an umbrella for performance issues.
- There is a plan to set the dataBase storage to be a summer project for interns.
- there is Refactor core to converge Qualification and Profiling Tools implementation #980 which is still open and we can add subtask as we think is necessary to consider that done.
- The incremental refactor changes the code frequently. I find that filing issues for each possible improvement will be cumbersome and create of flood of overlapping issues. So, unless there is a bug, I mark possible improvements as TODO that we can later revisit depending on priorities.
* | ||
* It has the following effect on the visitor object: | ||
* 1- It updates the sqlIsDsOrRDD argument to True when the visited node is an RDD or Dataset. | ||
* 2- If the SLID is an RDD, the potentialProblems is cleared because once SQL is marked as RDD, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo Should this be 'SQLID'?
Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>
Signed-off-by: Ahmed Hussein (amahussein) a@ahussein.me
Fixes #1019
What this PR fixes:
Potential Problems
of filesql_duration_and_executor_cpu_time_percent.csv
generated by CSV file would be empty.SQLExecutionStart
" andAQEUpdate
. this bug was leaving some duplicate records in the SQL-to-problematicChanges:
QualificationAppInfo.processSQLPlan
since the logic is the same asAppSQLPlanAnalyzer.processSQLPlanMetrics()
RunningQualificationApp
to make sure thatQualificationAppInfo.processSQLPlan
is called beforeaggregateStats()
. Otherwise, the RunningQualificationApp would have empty dataSources/problematics/writeDataFormatsAppSQLPlanAnalyzer.processSQLPlanMetrics()
intovisitNode()
that is separate from the main loop.QualSQLPlanAnalyzer
that extendsAppSQLPlanAnalyzer
overriding thevisitNode()
to be able to update the WriteDataFromats