chore: iterative scalar processing #2314

tychoish · 2023-12-27T22:05:37Z

This should address some of the concerns raised in #2304.

Pro: we iterate through arrays (when we're already going to deal with
scalar values) exactly once, and there's no intermediate data
structure. It's pretty simple from the perspective of the call sites.

Con: I had to copy-paste a bunch of DataFusion casting code in (though
I'm not worried about it), and I haven't made the tests compile yet
either.
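
For readers skimming the thread, here is a minimal sketch of the single-pass idea described above. It is not the code in this PR: `Scalar`, `Columnar`, `apply_op`, and `double` are hypothetical stand-ins for DataFusion's `ScalarValue` and `ColumnarValue` and for the helper at the call site, kept dependency-free so the shape of the loop is easy to see.

```rust
// Hypothetical, simplified stand-ins for DataFusion's ScalarValue and
// ColumnarValue; the real types have many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int(i64),
    Text(String),
}

#[derive(Debug, Clone, PartialEq)]
enum Columnar {
    Scalar(Scalar),
    Array(Vec<Scalar>),
}

// Apply a fallible per-value `op` to either a single scalar or to every
// element of an array. The array branch walks the input exactly once;
// collecting into `Result<Vec<_>, _>` stops at the first error.
fn apply_op<F>(input: &Columnar, op: F) -> Result<Columnar, String>
where
    F: Fn(&Scalar) -> Result<Scalar, String>,
{
    match input {
        Columnar::Scalar(value) => Ok(Columnar::Scalar(op(value)?)),
        Columnar::Array(values) => {
            let out = values
                .iter()
                .map(|v| op(v))
                .collect::<Result<Vec<_>, _>>()?;
            Ok(Columnar::Array(out))
        }
    }
}

// Example op (an assumption for illustration): double integers and
// reject every other kind of value.
fn double(value: &Scalar) -> Result<Scalar, String> {
    match value {
        Scalar::Int(n) => Ok(Scalar::Int(n * 2)),
        other => Err(format!("unsupported value: {other:?}")),
    }
}
```

A call site only supplies the per-value function, for example `apply_op(&input, double)`, which is what keeps it simple from the caller's perspective.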

universalmind303 · 2023-12-28T15:28:53Z

crates/sqlbuiltins/src/functions/scalars/mod.rs

-            }
+            ColumnarValue::Array(arr) => Ok(ColumnarValue::Array(scalar_iter_to_array(
+                (0..arr.len()).map(|idx| -> Result<ScalarValue, ExtensionError> {
+                    Ok(op(ScalarValue::try_from_array(arr, idx)?)?)


I'd prefer to avoid using try_from_array. There's quite a bit of work being done in there that'll slow things down in a hot loop like this.

I'd suggest that we pass in the return type to the get_nth function. Then we can try to extract the value directly, skipping all of the checks that happen in try_from_array. Similarly, scalar_iter_to_array could skip peeking at the elements if it knew the return type.
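
A rough sketch of the difference, under heavy simplification: `Array`, `Scalar`, `scalar_at`, and `as_int64` below are hypothetical stand-ins, not Arrow or DataFusion APIs. The generic path re-inspects the array's type for every element (loosely analogous to what `try_from_array` has to do), while the typed path resolves the type once and then loops over plain values.

```rust
// Hypothetical, simplified stand-ins for an Arrow array and a
// DataFusion ScalarValue.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int64(i64),
    Utf8(String),
}

#[derive(Debug)]
enum Array {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
}

// Generic path: the array's type is matched again for every element,
// and each value is wrapped in a `Scalar`.
fn scalar_at(arr: &Array, idx: usize) -> Option<Scalar> {
    match arr {
        Array::Int64(v) => v.get(idx).copied().map(Scalar::Int64),
        Array::Utf8(v) => v.get(idx).cloned().map(Scalar::Utf8),
    }
}

// Typed path: resolve the concrete type once, up front.
fn as_int64(arr: &Array) -> Option<&[i64]> {
    match arr {
        Array::Int64(v) => Some(v.as_slice()),
        _ => None,
    }
}

// Sum through the generic accessor: one type check and one wrapper
// value per element.
fn sum_generic(arr: &Array, len: usize) -> Option<i64> {
    let mut total = 0;
    for idx in 0..len {
        match scalar_at(arr, idx)? {
            Scalar::Int64(n) => total += n,
            _ => return None,
        }
    }
    Some(total)
}

// Sum through the typed slice: one type check for the whole array.
fn sum_typed(arr: &Array) -> Option<i64> {
    Some(as_int64(arr)?.iter().sum())
}
```

Both functions return the same result for an integer array; the point is only where the type dispatch happens, not a measured speedup.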

tychoish · 2023-12-28T17:07:52Z

I found this gem in the try_unary docs:

Note: LLVM is currently unable to effectively vectorize fallible operations

Which means that if we want ScalarUDFs to be fallible, we won't get much benefit from extra optimization: as long as we accept the possibility of errors (which I think we should for some of these options), we're really just optimizing things for health.
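
To make the contrast concrete, here is a simplified sketch. These two functions borrow the names `unary` and `try_unary` but operate on plain slices; they are illustrative assumptions, not the Arrow kernels. The fallible version has to check every result and may return early, which is the kind of per-element branch the quoted note refers to.

```rust
// Infallible kernel: every element maps to a value, so the loop body
// has no early exit.
fn unary(values: &[i64], op: impl Fn(i64) -> i64) -> Vec<i64> {
    values.iter().map(|&v| op(v)).collect()
}

// Fallible kernel: each element may fail, so collecting into a
// `Result` checks every result and stops at the first error.
fn try_unary<E>(
    values: &[i64],
    op: impl Fn(i64) -> Result<i64, E>,
) -> Result<Vec<i64>, E> {
    values.iter().map(|&v| op(v)).collect()
}
```

For example, `unary(&[1, 2, 3], |v| v.wrapping_mul(2))` cannot fail, whereas `try_unary(&[1, 2, 3], |v| v.checked_mul(2).ok_or("overflow"))` surfaces overflow as an error at the cost of a check per element.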

chore: iterative scalar procesing

856f14b

tychoish mentioned this pull request Dec 27, 2023

fix: scalar udf function argument handling #2304

Merged

fix formatting

2aa2277

tychoish changed the base branch from main to tycho/scalar-value-handling December 27, 2023 22:20

tychoish added 2 commits December 27, 2023 17:21

fixup

5594034

test fixes

78af9ce

tychoish requested a review from scsmithr December 28, 2023 05:55

tychoish added 4 commits December 28, 2023 01:13

beep

9cf2058

avoid passing type info

c7c01e0

fixup

108ce98

fixup cast

9c462c5

universalmind303 approved these changes Dec 28, 2023

tychoish merged commit 7435a23 into tycho/scalar-value-handling Dec 28, 2023
13 checks passed

tychoish deleted the tycho/lazy-scalar-handling branch December 28, 2023 17:08
