-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: only add default value for final aggregation to fix the aggregate push down (partition) union case #35443
Changes from 5 commits
86998d1
7d68e32
7d57765
b87ea6c
ff92581
e799be1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -438,6 +438,16 @@ func (a *aggregationPushDownSolver) tryAggPushDownForUnion(union *LogicalUnionAl | |
if pushedAgg == nil { | ||
return nil | ||
} | ||
|
||
// Update the agg mode for the pushed down aggregation. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you explain why this update happens? I don't get the logic to change 'CompleteMode' -> 'Partial1Mode'... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The behaviour between partition / final is different.
In partial mode, the input is original data and turned into "xx 1" where xx is If you have this data as input:
In partial mode, the output is
In final mode, the output is
Back to the topic, we have push down agg to partition unions.
We get wrong result:
The correct result should be:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
for _, aggFunc := range pushedAgg.AggFuncs { | ||
if aggFunc.Mode == aggregation.CompleteMode { | ||
aggFunc.Mode = aggregation.Partial1Mode | ||
} else if aggFunc.Mode == aggregation.FinalMode { | ||
aggFunc.Mode = aggregation.Partial2Mode | ||
} | ||
} | ||
|
||
newChildren := make([]LogicalPlan, 0, len(union.Children())) | ||
for _, child := range union.Children() { | ||
newChild, err := a.pushAggCrossUnion(pushedAgg, union.Schema(), child) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1337,7 +1337,15 @@ func BuildFinalModeAggregation( | |
|
||
finalAggFunc.OrderByItems = byItems | ||
finalAggFunc.HasDistinct = aggFunc.HasDistinct | ||
finalAggFunc.Mode = aggregation.CompleteMode | ||
// In logical optimize phase, the Agg->PartitionUnion->TableReader may become | ||
// Agg1->PartitionUnion->Agg2->TableReader, and the Agg2 is a partial aggregation. | ||
// So in the push down here, we need to add a new if-condition check: | ||
// If the original agg mode is partial already, the finalAggFunc's mode become Partial2. | ||
if aggFunc.Mode == aggregation.CompleteMode { | ||
finalAggFunc.Mode = aggregation.CompleteMode | ||
} else if aggFunc.Mode == aggregation.Partial1Mode || aggFunc.Mode == aggregation.Partial2Mode { | ||
finalAggFunc.Mode = aggregation.Partial2Mode | ||
} | ||
} else { | ||
if aggFunc.Name == ast.AggFuncGroupConcat && len(aggFunc.OrderByItems) > 0 { | ||
// group_concat can only run in one phase if it has order by items but without distinct property | ||
|
@@ -1417,7 +1425,15 @@ func BuildFinalModeAggregation( | |
} | ||
} | ||
|
||
finalAggFunc.Mode = aggregation.FinalMode | ||
// In logical optimize phase, the Agg->PartitionUnion->TableReader may become | ||
// Agg1->PartitionUnion->Agg2->TableReader, and the Agg2 is a partial aggregation. | ||
// So in the push down here, we need to add a new if-condition check: | ||
// If the original agg mode is partial already, the finalAggFunc's mode become Partial2. | ||
if aggFunc.Mode == aggregation.CompleteMode { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do not understand what does this code block mean There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the old assumption, before the coprocessor push down, the agg mode is either Complete or Final. So after the push down, the parent agg become Final mode. But in the new assumption, before the coprocessor push down, the agg mode can be partial. So here the parent agg mode is set to different value accordingly. For final before, we set it to final. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you mean that when we are in the partition union mode. The Agg->PartitionUnion->TableReader may become Agg->PartitionUnion->Agg->TableReader, then we do the push down. So we need to add a new if-condition check. You can add some comments here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agg1->PartitionUnion->Agg2->TableReader In coprocessor push down, Agg2->TableReader become Agg3(root)->TableReader->Agg4(cop)->TableScan(cop), and the final plan become Agg1->PartitionUnion->Agg3(root)->TableReader->Agg4(cop)->TableScan(cop) In the past, Agg2 is always Complete or Final, but in this PR here Agg2 could be Partial There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I add some comment in the code so the reviewer know what happen. |
||
finalAggFunc.Mode = aggregation.FinalMode | ||
} else if aggFunc.Mode == aggregation.Partial1Mode || aggFunc.Mode == aggregation.Partial2Mode { | ||
finalAggFunc.Mode = aggregation.Partial2Mode | ||
} | ||
} | ||
|
||
finalAggFunc.Args = args | ||
|
@@ -1483,7 +1499,7 @@ func (p *basePhysicalAgg) convertAvgForMPP() *PhysicalProjection { | |
} | ||
// no avgs | ||
// for final agg, always add project due to in-compatibility between TiDB and TiFlash | ||
if len(p.schema.Columns) == len(newSchema.Columns) && !p.isFinalAgg() { | ||
if len(p.schema.Columns) == len(newSchema.Columns) && !p.IsFinalAgg() { | ||
return nil | ||
} | ||
// add remaining columns to exprs | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the previous assumption, there will be no partial mode agg before coprocessor push down...
But now I change the logical plan aggregation to partial, so in the coprocessor push down here, this line may meet partial mode agg.