forked from apache/spark
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-50372][CONNECT][SQL] Make all DF execution path collect observed metrics

### What changes were proposed in this pull request?

This PR fixes an issue where some DataFrame execution paths did not process `ObservedMetrics`. The fix injects lazy processing logic into the result iterator.

The following private execution APIs are affected by this issue:
- `SparkSession.execute(proto.Relation.Builder)`
- `SparkSession.execute(proto.Command)`
- `SparkSession.execute(proto.Plan)`

The following user-facing API is affected by this issue:
- `DataFrame.write.format("...").mode("...").save()`

This PR also fixes a server-side issue in which two observed metrics can be assigned to the same Plan ID when they are in the same plan (e.g., one observation is used as the input of another). The fix is to traverse the plan and find all observations with correct IDs.

Another bug was discovered as a byproduct of introducing a new test case. Copying the PR comment here from SparkConnectPlanner.scala:

> This fixes a bug where the input of a `CollectMetrics` can be processed two times, once in Line 1190 and once here/below.
>
> When the `input` contains another `CollectMetrics`, transforming it twice will cause two `Observation` objects (in the input) to be initialised and registered two times to the system. Since only one of them will be fulfilled when the query finishes, the one we'll be looking at may not have any data.
>
> This issue is highlighted in the test case `Observation.get is blocked until the query is finished ...`, where we specifically execute `observedObservedDf`, which is a `CollectMetrics` that has another `CollectMetrics` as its input.

### Why are the changes needed?

To fix a bug.

### Does this PR introduce _any_ user-facing change?

Yes, this bug is user-facing.

### How was this patch tested?

New tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#48920 from xupefei/observation-notify-fix.

Authored-by: Paddy Xu <xupaddy@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
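
The commit message says the fix injects lazy processing logic into the result iterator. As a rough, language-neutral illustration of that idea (not the actual Spark Connect code, which is written in Scala), the sketch below wraps a response stream so that any observed metrics are handled as each response is consumed. This means every caller that drains the iterator gets the metrics processed, whichever execution path created it. The names `MetricsCollectingIterator`, `fake_responses`, and the dictionary-shaped responses are all hypothetical and chosen only for this example.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

# Hypothetical stand-in for a server response; real Spark Connect responses
# are protobuf messages, not dictionaries.
Response = Dict[str, Any]


class MetricsCollectingIterator:
    """Wraps a response stream and processes observed metrics lazily.

    Nothing is processed at construction time; metrics are handed to
    `on_metrics` only when the response carrying them is actually consumed.
    """

    def __init__(
        self,
        responses: Iterable[Response],
        on_metrics: Callable[[str, Any], None],
    ) -> None:
        self._responses: Iterator[Response] = iter(responses)
        self._on_metrics = on_metrics

    def __iter__(self) -> "MetricsCollectingIterator":
        return self

    def __next__(self) -> Response:
        # StopIteration from the underlying stream propagates unchanged.
        response = next(self._responses)
        for name, values in response.get("observed_metrics", {}).items():
            self._on_metrics(name, values)
        return response


def fake_responses() -> Iterator[Response]:
    """Hypothetical stream: data first, metrics attached to the last response."""
    yield {"rows": [1, 2, 3]}
    yield {"rows": [4], "observed_metrics": {"my_observation": {"count": 4}}}


collected: Dict[str, Any] = {}
results = MetricsCollectingIterator(fake_responses(), collected.__setitem__)

# Draining the iterator is what triggers the metrics callback.
rows = [row for response in results for row in response["rows"]]
```

Because the processing lives inside the iterator rather than in one particular caller, a write-style path that only drains the results and discards the rows would still trigger the callback in this sketch.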
1 parent 6c84f15, commit d0e2c06
Showing 8 changed files with 137 additions and 64 deletions.