Function to create ArrayRef from an iterator of ScalarValues #381
Shouldn't we use the Builder API to create an array of the values from the statistics instead of |
@jorgecarleitao could you please elaborate on what you have in mind here? The reason |
Ah, sorry, I misunderstood. The idea is to generalize over the parquet min/max. Yep, makes total sense 👍; sorry for the noise.
This is great 👍 it should also be very useful for code around aggregations
This is definitely helpful while implementing #375; I was this close to trying to implement this myself :-)
let mut scalars = scalars.into_iter().peekable();

// figure out the type based on the first element
let data_type = match scalars.peek() {
Does it make sense to have an iter with a mixture of null and present values? In that case the first value can be null.
Never mind - even for a null, the data type info is present.
Yeah -- this confused me a little -- it is convenient that ScalarValue is always typed even when it is None
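To illustrate the point in this thread, here is a minimal, self-contained sketch of the "peek at the first element to pick the type" approach. It is not the code from this PR: `Scalar`, `DataType`, `Array`, and this `iter_to_array` are hypothetical stand-ins written with the standard library only, and the real `ScalarValue` covers many more types. The sketch assumes that an empty iterator and an iterator of mixed types are both reported as errors.

```rust
// Hypothetical stand-in for ScalarValue: the variant carries the type,
// so a null (None) value still knows what type it is.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int64(Option<i64>),
    Utf8(Option<String>),
}

// Hypothetical stand-in for a data type tag.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

// Hypothetical stand-in for an Arrow array with nullable slots.
#[derive(Debug, PartialEq)]
enum Array {
    Int64(Vec<Option<i64>>),
    Utf8(Vec<Option<String>>),
}

impl Scalar {
    fn data_type(&self) -> DataType {
        match self {
            Scalar::Int64(_) => DataType::Int64,
            Scalar::Utf8(_) => DataType::Utf8,
        }
    }
}

fn iter_to_array(scalars: impl IntoIterator<Item = Scalar>) -> Result<Array, String> {
    let mut scalars = scalars.into_iter().peekable();

    // Figure out the type based on the first element. This works even if
    // the first element is null, because the variant is still typed.
    let data_type = match scalars.peek() {
        None => return Err("empty iterator: no element to infer a type from".to_string()),
        Some(first) => first.data_type(),
    };

    // Collect every element, rejecting any whose type differs from the first.
    match data_type {
        DataType::Int64 => scalars
            .map(|s| match s {
                Scalar::Int64(v) => Ok(v),
                other => Err(format!("expected Int64, got {:?}", other)),
            })
            .collect::<Result<Vec<_>, String>>()
            .map(Array::Int64),
        DataType::Utf8 => scalars
            .map(|s| match s {
                Scalar::Utf8(v) => Ok(v),
                other => Err(format!("expected Utf8, got {:?}", other)),
            })
            .collect::<Result<Vec<_>, String>>()
            .map(Array::Utf8),
    }
}
```

With these toy types, `iter_to_array(vec![Scalar::Int64(None), Scalar::Int64(Some(3))])` yields an `Array::Int64` whose first slot is null, which is the case raised in the review question above.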
I plan to merge this when the CI is green.
Codecov Report
@@            Coverage Diff             @@
##           master     #381      +/-   ##
==========================================
+ Coverage   74.88%   74.90%   +0.01%
==========================================
  Files         146      146
  Lines       24368    24481     +113
==========================================
+ Hits        18249    18338      +89
- Misses       6119     6143      +24
@alamb there is still a typo in the code
Co-authored-by: Daniël Heres <danielheres@gmail.com>
e4cf55e to b2f771e
Which issue does this PR close?
This is part of #363 where I am trying to evaluate pruning predicates on record batches that represent the min/max values of statistics.
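As a rough illustration of what evaluating a pruning predicate over statistics can look like, the sketch below checks a predicate of the form `x > threshold` against a column of per-container max values. The function name `may_match_gt` and the use of a plain slice of `Option<i64>` are assumptions made for this example; they are not taken from #363 or from DataFusion.

```rust
/// For each container (for example a row group), decide whether it may
/// contain rows satisfying `x > threshold`, using only its max statistic.
/// A missing statistic (None) means the container cannot be ruled out,
/// so it is conservatively kept.
fn may_match_gt(maxes: &[Option<i64>], threshold: i64) -> Vec<bool> {
    maxes
        .iter()
        .map(|max| match max {
            // If even the largest value is not above the threshold,
            // no row in this container can match.
            Some(m) => *m > threshold,
            None => true,
        })
        .collect()
}
```

The input here plays the role of the array built from statistics: one entry per container, with nulls where no statistic is available.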
Rationale for this change
The proposed interface for providing min/max values is as ScalarValues, and I need a way to take a bunch of ScalarValues and turn them into an Array.
You can see how it is used in #380
What changes are included in this PR?
ScalarValue::iter_to_array, tests for same
Are there any user-facing changes?
ScalarValue::iter_to_array is now available