Specialize Avg and Sum accumulators (#6842) #7358

tustvold · 2023-08-21T23:25:55Z

Which issue does this PR close?

Part of #6842

Rationale for this change

This makes it easier to see what is going on and avoids using ScalarValue arithmetic.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

tustvold · 2023-08-21T23:26:45Z

datafusion/physical-expr/src/aggregate/average.rs

-        let delta = sum_batch(values, &self.sum.get_datatype())?;
-        self.sum = self.sum.sub(&delta)?;
+        if let Some(x) = sum(values) {
+            self.sum = Some(self.sum.unwrap() - x);


This feels instinctively wrong, as it will accumulate errors over time... I'm not really sure what to do about this though...

I think this is expected for floats (otherwise we would need to keep intermediate values). Switching to decimal should allow for precise values.

FYI @ozankabak @metesynnada

We will be looking into this.

The calculation looks correct. There is indeed some error accumulation when doing incremental calculations, but it is unavoidable (and very very rarely causes issues in practice)
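The error accumulation discussed in this thread can be reproduced with a minimal sketch. The `SlidingSum` type below is hypothetical and is not the accumulator from this PR; it only assumes that retracting a value means subtracting it from a running sum. With `f64`, a small value added next to a very large one is rounded away and cannot be recovered by the later subtraction, whereas an integer-backed sum (as Arrow's Decimal128 is) stays exact.

```rust
use std::ops::{Add, Sub};

/// Hypothetical sliding-window sum (illustration only, not DataFusion's type).
#[derive(Default)]
struct SlidingSum<T> {
    sum: T,
}

impl<T: Copy + Add<Output = T> + Sub<Output = T>> SlidingSum<T> {
    /// Values entering the window are added to the running sum.
    fn update_batch(&mut self, values: &[T]) {
        for &v in values {
            self.sum = self.sum + v;
        }
    }

    /// Values leaving the window are subtracted from the running sum.
    fn retract_batch(&mut self, values: &[T]) {
        for &v in values {
            self.sum = self.sum - v;
        }
    }
}

/// Adjacent f64 values near 1e17 are 16 apart, so adding 1.0 is rounded away.
fn float_window() -> f64 {
    let mut acc = SlidingSum::<f64>::default();
    acc.update_batch(&[1e17, 1.0]);
    acc.retract_batch(&[1e17]);
    acc.sum // 0.0, although only the value 1.0 remains in the window
}

/// The same sequence with an integer-backed sum keeps the small value.
fn integer_window() -> i128 {
    let mut acc = SlidingSum::<i128>::default();
    acc.update_batch(&[100_000_000_000_000_000, 1]);
    acc.retract_batch(&[100_000_000_000_000_000]);
    acc.sum // 1
}

fn main() {
    println!("f64 window sum:  {}", float_window());
    println!("i128 window sum: {}", integer_window());
}
```

The inputs are deliberately extreme; with values of similar magnitude the drift is far smaller, which matches the observation above that it rarely matters in practice.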

tustvold · 2023-08-21T23:27:50Z

datafusion/physical-expr/src/aggregate/average.rs

+    use datafusion_expr::aggregate_function::sum_type_of_avg;
+    use datafusion_expr::type_coercion::aggregates::avg_return_type;
+
+    fn test_with_pre_cast(array: ArrayRef, expected: ScalarValue) {


This change is necessary because generic_test_op would call Avg::new which would then not have the correct arguments. This replicates the logic in build_in

tustvold · 2023-08-21T23:28:56Z

datafusion/physical-expr/src/aggregate/average.rs

+        // instantiate specialized accumulator based for the type
+        match (&self.sum_data_type, &self.rt_data_type) {
+            (Float64, Float64) => {
+                Ok(Box::new(AvgAccumulator::new(self.pre_cast_to_sum_type)))


This whole precast thing seems like a hack; IMO this should be handled by the type coercion machinery, not internally by the aggregator...

tustvold · 2023-08-22T13:05:48Z

Marking as draft as I'd like to get #7369 in first

Dandandan · 2023-08-22T15:39:09Z

datafusion/physical-expr/src/aggregate/sum_distinct.rs

+
+impl<T: ToByteSlice> std::hash::Hash for Hashable<T> {
+    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
+        self.0.to_byte_slice().hash(state)


Elsewhere I think we only do this for floats and use state.hash_one(self.0) for other types (for primitives it's faster).

This does use ahash: it overrides the BuildHasher used by the HashSet. I think we could definitely do something better here, but in the absence of benchmarks I'm keen to keep it simple, and we can always optimise it later. Regardless, this should be significantly faster than the prior approach.

Edit: If someone really cares about the performance of DistinctSum, implementing a GroupsAccumulator will likely yield far greater performance than any incremental tweaking of this Accumulator-based version.

> Edit: If someone really cares about the performance of DistinctSum, implementing a GroupsAccumulator will likely yield far greater performance than any incremental tweaking of this Accumulator-based version.

yeah true :)
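As a rough illustration of the byte-hashing wrapper discussed in this thread, here is a standard-library-only sketch. It is not the PR's code: `HashableF64` and `distinct_sum` are hypothetical names, `to_ne_bytes` stands in for Arrow's `ToByteSlice`, and the default `HashSet` hasher stands in for ahash. The point is that `f64` implements neither `Hash` nor `Eq`, so a wrapper that hashes and compares the raw bytes is one way to put floats in a `HashSet`.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper that makes an f64 usable as a HashSet key.
#[derive(Clone, Copy)]
struct HashableF64(f64);

impl Hash for HashableF64 {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the raw bytes, mirroring the to_byte_slice().hash(state) idea.
        self.0.to_ne_bytes().hash(state)
    }
}

impl PartialEq for HashableF64 {
    fn eq(&self, other: &Self) -> bool {
        // Compare bit patterns so that equality agrees with the hash above.
        self.0.to_bits() == other.0.to_bits()
    }
}

impl Eq for HashableF64 {}

/// Sum each distinct value exactly once.
fn distinct_sum(values: &[f64]) -> f64 {
    let distinct: HashSet<HashableF64> =
        values.iter().copied().map(HashableF64).collect();
    distinct.iter().map(|v| v.0).sum()
}

fn main() {
    // 1.5 appears twice but contributes to the sum only once.
    println!("{}", distinct_sum(&[1.5, 2.5, 1.5, 4.0]));
}
```

One consequence of comparing bit patterns in this sketch is that 0.0 and -0.0 count as distinct values, and NaNs with identical bits count as equal.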

Dandandan · 2023-08-22T16:16:34Z

datafusion/physical-expr/src/aggregate/sum_distinct.rs

-        Ok(Box::new(DistinctSumAccumulator::try_new(&self.data_type)?))
+        macro_rules! helper {
+            ($t:ty, $dt:expr) => {
+                Ok(Box::new(DistinctSumAccumulator::<$t>::try_new(&$dt)?))


I think we can do the same for DistinctCountAccumulator

Quite possibly, right now I'm just doing the minimum to be able to remove the ScalarValue arithmetic kernels 😅

DistinctCountAccumulator also uses ScalarValue ;)

But not arithmetic 😄

Aha 😁 let me follow up this PR then ;)

alamb

Thank you @tustvold -- these changes make sense to me.

I think we should run some basic performance tests too if we have any that cover queries that use these functions (like SUM DISTINCT and using sliding sum in window functions). Maybe @metesynnada or @ozankabak know of some benchmarks we can run that cover it

alamb · 2023-08-22T17:15:20Z

datafusion/physical-expr/src/aggregate/average.rs

+                target_scale: *target_scale,
+            })),
+            _ => not_impl_err!(
+                "AvgGroupsAccumulator for ({} --> {})",


Suggested change

-                "AvgGroupsAccumulator for ({} --> {})",
+                "AvgAccumulator for ({} --> {})",

alamb · 2023-08-22T17:21:41Z

datafusion/physical-expr/src/aggregate/sum.rs

        }
+        downcast_sum!(self, helper)


The multiple levels of macros are concise, I'll give you that, but I do find them hard to follow. Maybe that is OK, as we don't expect this code to change.

alamb · 2023-08-22T17:22:57Z

datafusion/physical-expr/src/aggregate/sum.rs

-            $FN,
-        )))
-    }};
+/// Sum only supports a subset of numeric types, instead relying on type coercion


It might help to document what this macro does (i.e. that it calls the helper macro given an s; what is s? The aggregate?).
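For readers unfamiliar with the two-level macro pattern under discussion, the following is a simplified sketch of the general shape, not the PR's actual `downcast_sum!`. All names here (`DataType`, `Accumulator`, `SumAccumulator`, `create_accumulator`) are stand-ins defined locally, and this macro takes the data type directly as its first argument, which may differ from the real one. The outer macro matches on a runtime type tag and invokes whatever helper macro the caller passes in, with the matching native Rust type.

```rust
use std::fmt::Debug;
use std::ops::AddAssign;

/// Stand-in for a runtime type tag (not arrow's DataType).
#[derive(Debug, Clone, Copy)]
enum DataType {
    Int64,
    Float64,
    Utf8,
}

/// Stand-in accumulator trait; inputs are i32 only to keep the sketch small.
trait Accumulator {
    fn update(&mut self, values: &[i32]);
    fn evaluate(&self) -> String;
}

/// Accumulator specialized on the native sum type T.
#[derive(Default)]
struct SumAccumulator<T> {
    sum: T,
}

impl<T: Copy + AddAssign + From<i32> + Debug> Accumulator for SumAccumulator<T> {
    fn update(&mut self, values: &[i32]) {
        for &v in values {
            self.sum += T::from(v);
        }
    }

    fn evaluate(&self) -> String {
        format!("{:?}", self.sum)
    }
}

/// Outer macro: map the runtime tag to a native type and call `$helper` with it.
macro_rules! downcast_sum {
    ($dt:expr, $helper:ident) => {
        match $dt {
            DataType::Int64 => $helper!(i64),
            DataType::Float64 => $helper!(f64),
            other => Err(format!("Sum not supported for {:?}", other)),
        }
    };
}

fn create_accumulator(dt: DataType) -> Result<Box<dyn Accumulator>, String> {
    // Inner macro: says what to build once the native type is known.
    macro_rules! helper {
        ($t:ty) => {
            Ok(Box::new(SumAccumulator::<$t>::default()))
        };
    }
    downcast_sum!(dt, helper)
}

fn main() {
    for dt in [DataType::Int64, DataType::Float64, DataType::Utf8] {
        match create_accumulator(dt) {
            Ok(mut acc) => {
                acc.update(&[1, 2, 3]);
                println!("{:?}: {}", dt, acc.evaluate());
            }
            Err(e) => println!("{}", e),
        }
    }
}
```

The benefit of the split is that the type-matching table is written once and reused by every caller that supplies its own helper; the cost, as noted above, is an extra level of indirection when reading the code.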

ozankabak · 2023-08-22T18:37:44Z

> I think we should run some basic performance tests too if we have any that cover queries that use these functions (like SUM DISTINCT and using sliding sum in window functions). Maybe @metesynnada or @ozankabak know of some benchmarks we can run that cover it

We will discuss this tomorrow and circle back to you

Dandandan

Looks great 👍

tustvold · 2023-08-22T19:38:03Z

Running the TPCH benchmarks

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ specialize-avg-accumulator ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  671.28ms │                   653.98ms │     no change │
│ QQuery 2     │  147.40ms │                   136.89ms │ +1.08x faster │
│ QQuery 3     │  259.99ms │                   252.70ms │     no change │
│ QQuery 4     │  150.08ms │                   145.15ms │     no change │
│ QQuery 5     │  347.39ms │                   348.76ms │     no change │
│ QQuery 6     │  152.53ms │                   147.19ms │     no change │
│ QQuery 7     │  595.27ms │                   560.30ms │ +1.06x faster │
│ QQuery 8     │  382.90ms │                   368.43ms │     no change │
│ QQuery 9     │  624.58ms │                   593.67ms │     no change │
│ QQuery 10    │  462.84ms │                   452.43ms │     no change │
│ QQuery 11    │  138.77ms │                   127.82ms │ +1.09x faster │
│ QQuery 12    │  221.17ms │                   213.70ms │     no change │
│ QQuery 13    │  389.77ms │                   375.43ms │     no change │
│ QQuery 14    │  209.97ms │                   209.11ms │     no change │
│ QQuery 15    │  149.18ms │                   144.03ms │     no change │
│ QQuery 16    │  134.44ms │                   114.72ms │ +1.17x faster │
│ QQuery 17    │  707.01ms │                   694.61ms │     no change │
│ QQuery 18    │ 1066.54ms │                  1086.68ms │     no change │
│ QQuery 19    │  384.12ms │                   367.69ms │     no change │
│ QQuery 20    │  343.48ms │                   317.19ms │ +1.08x faster │
│ QQuery 21    │  850.10ms │                   833.78ms │     no change │
│ QQuery 22    │  107.96ms │                   100.84ms │ +1.07x faster │
└──────────────┴───────────┴────────────────────────────┴───────────────┘
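The Change column can be read as a ratio of the two timings. The sketch below is an assumption inferred from the rows above rather than the benchmark tool's actual code: it treats a run as "no change" when the branch time is within 5% of the main time, and otherwise reports the ratio to two decimals. The function name `classify` and the "slower" wording are hypothetical.

```rust
/// Classify a timing pair the way the Change column appears to.
/// ASSUMPTION: the 5% band is inferred from the table, not taken from the tool.
fn classify(main_ms: f64, branch_ms: f64) -> String {
    let ratio = branch_ms / main_ms;
    if ratio > 0.95 && ratio < 1.05 {
        "no change".to_string()
    } else if ratio <= 0.95 {
        // Speedup is reported as main time divided by branch time.
        format!("+{:.2}x faster", main_ms / branch_ms)
    } else {
        format!("{:.2}x slower", ratio)
    }
}

fn main() {
    // Query 16 from the table: 134.44ms on main, 114.72ms on the branch.
    println!("{}", classify(134.44, 114.72));
    // Query 9: 624.58ms vs 593.67ms, just inside the assumed 5% band.
    println!("{}", classify(624.58, 593.67));
}
```

Under this reading, Query 9 is about 5% faster yet still lands inside the band, which is consistent with its "no change" label.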

tustvold added the api change Changes the API exposed to users of the crate label Aug 21, 2023

github-actions bot added physical-expr Physical Expressions optimizer Optimizer rules core Core DataFusion crate labels Aug 21, 2023

tustvold commented Aug 21, 2023

github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Aug 22, 2023

tustvold marked this pull request as draft August 22, 2023 13:05

github-actions bot added the logical-expr Logical plan and expressions label Aug 22, 2023

tustvold changed the title ~~Specialize AvgAccumulator (#6842)~~ Specialize Avg and Sum (#6842) Aug 22, 2023

Specialize SUM and AVG (apache#6842)

1fd74c8

tustvold force-pushed the specialize-avg-accumulator branch from 63d9971 to 1fd74c8 Compare August 22, 2023 14:24

github-actions bot removed the logical-expr Logical plan and expressions label Aug 22, 2023

Specialize Distinct Sum

870c865

tustvold marked this pull request as ready for review August 22, 2023 15:36

Dandandan reviewed Aug 22, 2023

tustvold mentioned this pull request Aug 22, 2023

Use datum arithmetic scalar value #7375

Merged

Dandandan reviewed Aug 22, 2023

alamb changed the title ~~Specialize Avg and Sum (#6842)~~ Specialize Avg and Sum accumulators (#6842) Aug 22, 2023

alamb approved these changes Aug 22, 2023

Dandandan approved these changes Aug 22, 2023

tustvold added 2 commits August 22, 2023 20:52

Review feedback

25005e8

Update sqllogictest

8be361c

tustvold merged commit 6c785d1 into apache:main Aug 23, 2023
21 checks passed
