Remove redundant check #23776

sopel39 · 2024-10-14T16:33:37Z

Grouping keys are always part of output symbols:

        ImmutableList.Builder<Symbol> outputs = ImmutableList.builder();
        outputs.addAll(groupingSets.getGroupingKeys());
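To illustrate why a later "outputs contain all grouping keys" check cannot fail, here is a minimal sketch. It assumes outputs are built as grouping keys followed by aggregation symbols, as in the snippet above. `OutputSymbolsSketch`, `buildOutputs`, and `outputsContainAllGroupingKeys` are hypothetical names, symbols are modeled as plain strings, and this is not the actual `AggregationNode` code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical stand-in for illustration only; not Trino's AggregationNode.
final class OutputSymbolsSketch
{
    private OutputSymbolsSketch() {}

    // Assumption: outputs start with every grouping key, then aggregation symbols follow.
    static List<String> buildOutputs(List<String> groupingKeys, List<String> aggregationSymbols)
    {
        List<String> outputs = new ArrayList<>(groupingKeys);
        outputs.addAll(aggregationSymbols);
        return outputs;
    }

    // Simplified form of the kind of check this PR calls redundant.
    static boolean outputsContainAllGroupingKeys(List<String> outputs, List<String> groupingKeys)
    {
        return outputs.containsAll(new HashSet<>(groupingKeys));
    }

    public static void main(String[] args)
    {
        List<String> groupingKeys = List.of("a", "b");
        List<String> outputs = buildOutputs(groupingKeys, List.of("sum"));
        // Holds by construction, whatever keys and aggregations are passed in.
        System.out.println(outputsContainAllGroupingKeys(outputs, groupingKeys));
    }
}
```

Under that assumption the containment check is true for every input, so dropping it does not change behavior.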

sopel39 · 2024-10-14T16:35:23Z

core/trino-main/src/main/java/io/trino/sql/planner/plan/AggregationNode.java

@@ -233,8 +232,7 @@ public boolean producesDistinctRows()
    {


This check is probably wrong, e.g., when there are multiple grouping sets.
Also, this check is probably too strict, e.g., with regard to hash symbols. cc @Praveen2112

pettyjamesm

This change is correct, but it relies on an assumption that could in theory change at some point in the future. I can't think of a way to write a unit test that would catch that change in assumptions (an assert groupingSets.getGroupingKeys().containsAll(outputs) might work, except asserts are disallowed by the style guide), but it's probably worth adding an inline comment to clarify the assumption.

Approving in advance since the requested comment change is minor. I agree with your comment though that the check should also inspect the grouping set count for correctness.
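One way the two suggestions in this review could look, as a sketch only: an explicit runtime check in place of a Java `assert`, plus a grouping set count condition in the distinctness test. `DistinctRowsSketch` and its parameters are hypothetical, symbols are plain strings, and the containment check follows the stated assumption (outputs contain every grouping key). Treating "exactly one grouping set" as the right condition is one reading of the comment above, not something the thread confirms.

```java
import java.util.List;

// Hypothetical illustration; names and signature are not from Trino.
final class DistinctRowsSketch
{
    private DistinctRowsSketch() {}

    // Explicit check used instead of the `assert` keyword.
    static void verifyOutputsCoverGroupingKeys(List<String> groupingKeys, List<String> outputs)
    {
        if (!outputs.containsAll(groupingKeys)) {
            throw new IllegalStateException("Expected outputs " + outputs + " to contain grouping keys " + groupingKeys);
        }
    }

    static boolean producesDistinctRows(List<String> groupingKeys, List<String> outputs, int aggregationCount, int groupingSetCount)
    {
        // Assumption made explicit: grouping keys are always part of output symbols.
        verifyOutputsCoverGroupingKeys(groupingKeys, outputs);
        return aggregationCount == 0
                && groupingSetCount == 1 // assumed reading of "inspect the grouping set count"
                && !groupingKeys.isEmpty()
                && outputs.size() == groupingKeys.size();
    }

    public static void main(String[] args)
    {
        List<String> keys = List.of("a", "b");
        System.out.println(producesDistinctRows(keys, keys, 0, 1));
        System.out.println(producesDistinctRows(keys, keys, 0, 2));
    }
}
```

With the sample inputs in `main`, the first call yields `true` and the second yields `false`, because the second has more than one grouping set.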

sopel39 · 2024-10-16T14:49:55Z

This change is correct, but it relies on an assumption that could in theory change at some point in the future. I can't think of a way to write a unit test that would catch that change in assumptions (an assert groupingSets.getGroupingKeys().containsAll(outputs) might work, except asserts are disallowed by the style guide), but it's probably worth adding an inline comment to clarify the assumption.

I really think we should use plan (derived) properties for this rather than explicitly querying AggregationNode.

pettyjamesm · 2024-10-16T16:53:43Z

I really think we should use plan (derived) properties for this rather than explicitly querying AggregationNode.

I see the argument for using derived properties, but property derivation is extremely slow. I don't think it's practical to use it in its current form as part of an iterative optimizer rule.

sopel39 · 2024-10-18T17:48:44Z

I see the argument for using derived properties, but property derivation is extremely slow. I don't think it's practical to use it in its current form as part of an iterative optimizer rule.

Have you investigated why that's the case? The advantage of property derivation is that it can represent properties of a subset of symbols, e.g., when the hash symbol is not present or a subset of symbols was already distinct before aggregation.

pettyjamesm · 2024-10-18T18:01:07Z

Property derivation traverses the whole subtree underneath the given node, so if you do it in an iterative optimizer you end up with exponential runtime very quickly.
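To make the cost concern concrete, here is a toy sketch that counts node visits when a derivation walks the full subtree each time it is invoked, compared with a memoized variant. Everything here (`DerivationCostSketch`, `Node`, the "subtree size" property) is hypothetical and unrelated to Trino's actual property derivation code. On a simple chain the repeated traversal grows quadratically, so this toy does not reproduce the growth rate described above; it only shows how repeated full traversals multiply the work.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; not Trino's property derivation.
final class DerivationCostSketch
{
    static final class Node
    {
        final List<Node> sources;

        Node(List<Node> sources)
        {
            this.sources = sources;
        }
    }

    static long visits;

    // Naive derivation: every call walks the entire subtree again.
    static int deriveNaive(Node node)
    {
        visits++;
        int size = 1;
        for (Node source : node.sources) {
            size += deriveNaive(source);
        }
        return size;
    }

    // Memoized derivation: each node is computed at most once per cache.
    static int deriveMemoized(Node node, Map<Node, Integer> cache)
    {
        Integer cached = cache.get(node);
        if (cached != null) {
            return cached;
        }
        visits++;
        int size = 1;
        for (Node source : node.sources) {
            size += deriveMemoized(source, cache);
        }
        cache.put(node, size);
        return size;
    }

    // Builds a linear plan with `depth` nodes.
    static Node chain(int depth)
    {
        Node node = new Node(List.of());
        for (int i = 1; i < depth; i++) {
            node = new Node(List.of(node));
        }
        return node;
    }

    static Node next(Node node)
    {
        return node.sources.isEmpty() ? null : node.sources.get(0);
    }

    // Simulates a rule that derives properties at every node of the chain.
    static long naiveVisitsForAllNodes(Node root)
    {
        visits = 0;
        for (Node node = root; node != null; node = next(node)) {
            deriveNaive(node);
        }
        return visits;
    }

    static long memoizedVisitsForAllNodes(Node root)
    {
        visits = 0;
        Map<Node, Integer> cache = new IdentityHashMap<>();
        for (Node node = root; node != null; node = next(node)) {
            deriveMemoized(node, cache);
        }
        return visits;
    }

    public static void main(String[] args)
    {
        Node root = chain(10);
        System.out.println(naiveVisitsForAllNodes(root));
        System.out.println(memoizedVisitsForAllNodes(root));
    }
}
```

For the 10-node chain in `main`, the naive variant performs 55 visits and the memoized one performs 10. A cache like this would have to be invalidated whenever a rule rewrites part of the plan, which this sketch does not model.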

sopel39 · 2024-10-21T10:37:46Z

Property derivation traverses the whole subtree underneath the given node, so if you do it in an iterative optimizer you end up with exponential runtime very quickly.

I don't think visiting the subtree is the biggest problem here. Did you check whether it's a problem with metadata fetching?

sopel39 requested review from martint, pettyjamesm and raunaqmorarka October 14, 2024 16:33

cla-bot bot added the cla-signed label Oct 14, 2024

Remove redundant check

0af76a5

Grouping keys are always part of output symbols

sopel39 force-pushed the ks/remove_redundant branch from 31cc80a to 0af76a5 Compare October 14, 2024 16:34

pettyjamesm approved these changes Oct 16, 2024

sopel39 merged commit bf536a8 into trinodb:master Oct 16, 2024
92 checks passed

sopel39 deleted the ks/remove_redundant branch October 16, 2024 15:53

github-actions bot added this to the 462 milestone Oct 16, 2024

mosabua mentioned this pull request Oct 16, 2024

Add Trino 462 release notes #23745

Merged

pettyjamesm mentioned this pull request Oct 16, 2024

Cleanup AggreggationNode#producesDistinctRows #23806

Merged
