add OptionalColumn #1685


Contributor

@PSeitz PSeitz commented Nov 17, 2022

The perf regression doesn't seem too bad for the Option<T> handling on aggregations.

`OptionalColumn`

test aggregation::bucket::range::bench::bench_range_100_buckets                                                          ... bench:     281,768 ns/iter (+/- 9,287)
test aggregation::bucket::range::bench::bench_range_10_buckets                                                           ... bench:     147,220 ns/iter (+/- 3,923)
test aggregation::bucket::term_agg::bench::bench_term_buckets_1_000_000_of_1_000_000                                     ... bench:  47,098,129 ns/iter (+/- 10,184,616)
test aggregation::bucket::term_agg::bench::bench_term_buckets_1_000_000_of_50                                            ... bench:   6,354,741 ns/iter (+/- 325,868)
test aggregation::bucket::term_agg::bench::bench_term_buckets_1_000_000_of_50_000                                        ... bench:  23,413,895 ns/iter (+/- 30,229,363)
test aggregation::bucket::term_agg::bench::bench_term_buckets_500_of_1_000_000                                           ... bench:       3,408 ns/iter (+/- 25)
test aggregation::tests::bench::bench_aggregation_average_f64                                                            ... bench:   8,305,089 ns/iter (+/- 608,815)
test aggregation::tests::bench::bench_aggregation_average_u64                                                            ... bench:   7,281,907 ns/iter (+/- 1,218,260)
test aggregation::tests::bench::bench_aggregation_average_u64_and_f64                                                    ... bench:  11,812,402 ns/iter (+/- 1,074,546)
test aggregation::tests::bench::bench_aggregation_histogram_only                                                         ... bench:  17,453,856 ns/iter (+/- 2,392,068)
test aggregation::tests::bench::bench_aggregation_histogram_only_hard_bounds                                             ... bench:  12,800,159 ns/iter (+/- 785,770)
test aggregation::tests::bench::bench_aggregation_histogram_with_avg                                                     ... bench:  44,424,944 ns/iter (+/- 6,043,104)
test aggregation::tests::bench::bench_aggregation_range_only                                                             ... bench:  11,061,966 ns/iter (+/- 1,144,104)
test aggregation::tests::bench::bench_aggregation_stats_f64                                                              ... bench:   9,667,992 ns/iter (+/- 924,933)
test aggregation::tests::bench::bench_aggregation_sub_tree                                                               ... bench:  18,767,313 ns/iter (+/- 2,498,334)
test aggregation::tests::bench::bench_aggregation_terms_few                                                              ... bench:  20,472,983 ns/iter (+/- 3,305,615)
test aggregation::tests::bench::bench_aggregation_terms_many                                                             ... bench:  60,062,208 ns/iter (+/- 15,852,762)


`Column`
test aggregation::bucket::range::bench::bench_range_100_buckets                                                          ... bench:     269,527 ns/iter (+/- 1,683)
test aggregation::bucket::range::bench::bench_range_10_buckets                                                           ... bench:     143,649 ns/iter (+/- 2,813)
test aggregation::bucket::term_agg::bench::bench_term_buckets_1_000_000_of_1_000_000                                     ... bench:  42,852,966 ns/iter (+/- 951,437)
test aggregation::bucket::term_agg::bench::bench_term_buckets_1_000_000_of_50                                            ... bench:   6,404,863 ns/iter (+/- 37,177)
test aggregation::bucket::term_agg::bench::bench_term_buckets_1_000_000_of_50_000                                        ... bench:   9,021,600 ns/iter (+/- 295,172)
test aggregation::bucket::term_agg::bench::bench_term_buckets_500_of_1_000_000                                           ... bench:       3,495 ns/iter (+/- 62)
test aggregation::tests::bench::bench_aggregation_average_f64                                                            ... bench:   8,440,023 ns/iter (+/- 243,789)
test aggregation::tests::bench::bench_aggregation_average_u64                                                            ... bench:   7,188,238 ns/iter (+/- 160,140)
test aggregation::tests::bench::bench_aggregation_average_u64_and_f64                                                    ... bench:  10,033,640 ns/iter (+/- 961,496)
test aggregation::tests::bench::bench_aggregation_histogram_only                                                         ... bench:  12,011,845 ns/iter (+/- 833,812)
test aggregation::tests::bench::bench_aggregation_histogram_only_hard_bounds                                             ... bench:  12,792,696 ns/iter (+/- 1,097,020)
test aggregation::tests::bench::bench_aggregation_histogram_with_avg                                                     ... bench:  30,211,392 ns/iter (+/- 709,005)
test aggregation::tests::bench::bench_aggregation_range_only                                                             ... bench:   9,173,227 ns/iter (+/- 70,369)
test aggregation::tests::bench::bench_aggregation_stats_f64                                                              ... bench:   8,859,926 ns/iter (+/- 59,453)
test aggregation::tests::bench::bench_aggregation_sub_tree                                                               ... bench:  15,856,888 ns/iter (+/- 818,285)
test aggregation::tests::bench::bench_aggregation_terms_few                                                              ... bench:  19,676,977 ns/iter (+/- 824,691)
test aggregation::tests::bench::bench_aggregation_terms_many                                                             ... bench:  48,286,725 ns/iter (+/- 1,715,196)
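
To put numbers on the two runs above, here is a minimal sketch that computes the relative change between a `Column` baseline and an `OptionalColumn` result, and checks whether the gap is larger than the reported `+/-` spreads. The helper names `relative_change` and `exceeds_noise` are hypothetical and not part of this PR, and the noise check is a rough heuristic rather than a statistical test.

```rust
/// Relative change of `candidate_ns` versus `baseline_ns`.
/// A result of 0.10 means the candidate is 10% slower than the baseline.
pub fn relative_change(baseline_ns: f64, candidate_ns: f64) -> f64 {
    (candidate_ns - baseline_ns) / baseline_ns
}

/// Rough heuristic (an assumption of this sketch, not a statistical test):
/// the difference counts as meaningful only if the gap between the two means
/// is larger than the sum of the two reported `+/-` spreads.
/// Each tuple is `(mean_ns, spread_ns)`.
pub fn exceeds_noise(baseline: (f64, f64), candidate: (f64, f64)) -> bool {
    (candidate.0 - baseline.0).abs() > baseline.1 + candidate.1
}
```

With the figures above, `relative_change(12_011_845.0, 17_453_856.0)` for `bench_aggregation_histogram_only` comes out at roughly 0.45, and the gap exceeds the combined spread. For `bench_term_buckets_1_000_000_of_50_000` the `OptionalColumn` spread (30,229,363 ns) is larger than the gap itself, so that line is probably too noisy to draw a conclusion from.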

#1678

codecov-commenter commented Nov 18, 2022

Codecov Report

Merging #1685 (caca4cd) into main (e758080) will decrease coverage by 0.15%.
The diff coverage is 81.75%.

❗ Current head caca4cd differs from pull request most recent head 208e5fd. Consider uploading reports for the commit 208e5fd to get more accurate results

@@            Coverage Diff             @@
##             main    #1685      +/-   ##
==========================================
- Coverage   94.04%   93.89%   -0.16%     
==========================================
  Files         256      257       +1     
  Lines       49169    49391     +222     
==========================================
+ Hits        46243    46374     +131     
- Misses       2926     3017      +91     
Impacted Files Coverage Δ
fastfield_codecs/src/bitpacked.rs 81.81% <0.00%> (-16.62%) ⬇️
fastfield_codecs/src/blockwise_linear.rs 86.30% <0.00%> (-12.92%) ⬇️
fastfield_codecs/src/linear.rs 90.11% <0.00%> (-8.61%) ⬇️
fastfield_codecs/src/main.rs 0.50% <0.00%> (-0.01%) ⬇️
fastfield_codecs/src/optional_column.rs 70.00% <70.00%> (ø)
src/aggregation/bucket/range.rs 96.16% <75.00%> (-0.87%) ⬇️
src/aggregation/bucket/histogram/histogram.rs 99.14% <90.56%> (-0.47%) ⬇️
fastfield_codecs/src/lib.rs 98.91% <93.75%> (+0.02%) ⬆️
fastfield_codecs/src/compact_space/mod.rs 96.62% <100.00%> (-0.15%) ⬇️
fastfield_codecs/src/gcd.rs 100.00% <100.00%> (ø)
... and 27 more

/// For instance, the min value does not take into account possibly
/// deleted documents. All values are however guaranteed to be greater than
/// or equal to `.min_value()`.
fn min_value(&self) -> Option<T>;
Collaborator

I wonder if we should make it return T and require the column to have at least one value?
Same for max_value.

Contributor Author

Then we would need to return Option<dyn OptionalColumn>. It's a valid alternative, but the handling may be more complex.
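
The two alternatives discussed in this thread can be sketched side by side. This is only an illustration under assumptions: the trait shapes are reduced to three methods, and `NonEmptyColumn`, `VecColumn`, `NonEmptyVecColumn` and `open_non_empty` are hypothetical names that do not appear in the PR. Since a bare `dyn` trait object is unsized, the second design is written with `Option<Arc<dyn ...>>`.

```rust
use std::sync::Arc;

/// Design A (as in this PR): bounds are `Option<T>`, so a column
/// without any value is still representable.
pub trait OptionalColumn<T> {
    fn get_val(&self, idx: u32) -> Option<T>;
    fn min_value(&self) -> Option<T>;
    fn max_value(&self) -> Option<T>;
}

/// Design B (the reviewer's idea): bounds are plain `T`, and the
/// "no value at all" case moves to whoever opens the column.
pub trait NonEmptyColumn<T> {
    fn get_val(&self, idx: u32) -> Option<T>;
    fn min_value(&self) -> T;
    fn max_value(&self) -> T;
}

/// Hypothetical in-memory column for design A.
pub struct VecColumn {
    vals: Vec<Option<u64>>,
}

impl OptionalColumn<u64> for VecColumn {
    fn get_val(&self, idx: u32) -> Option<u64> {
        self.vals.get(idx as usize).copied().flatten()
    }
    fn min_value(&self) -> Option<u64> {
        self.vals.iter().flatten().copied().min()
    }
    fn max_value(&self) -> Option<u64> {
        self.vals.iter().flatten().copied().max()
    }
}

/// Hypothetical in-memory column for design B; bounds are precomputed.
pub struct NonEmptyVecColumn {
    vals: Vec<Option<u64>>,
    min: u64,
    max: u64,
}

impl NonEmptyColumn<u64> for NonEmptyVecColumn {
    fn get_val(&self, idx: u32) -> Option<u64> {
        self.vals.get(idx as usize).copied().flatten()
    }
    fn min_value(&self) -> u64 {
        self.min
    }
    fn max_value(&self) -> u64 {
        self.max
    }
}

/// Design B pushes the `Option` to the call site: callers get `None`
/// when the column holds no value, and must handle that themselves.
pub fn open_non_empty(vals: Vec<Option<u64>>) -> Option<Arc<dyn NonEmptyColumn<u64>>> {
    let min = vals.iter().flatten().copied().min()?;
    let max = vals.iter().flatten().copied().max()?;
    let column: Arc<dyn NonEmptyColumn<u64>> = Arc::new(NonEmptyVecColumn { vals, min, max });
    Some(column)
}
```

In design A every consumer of `min_value` has to deal with `None`; in design B every place that opens a column does. Which of the two is simpler likely depends on how many call sites of each kind exist.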


/// Temporary wrapper to migrate to optional column
pub(crate) struct ToOptionalColumn<T> {
column: Arc<dyn Column<T>>,
Collaborator

@fulmicoton fulmicoton Nov 18, 2022

This implies two dynamic dispatches.
I think we can:

  • make it generic
  • add a trait bound to Column: Column: Clone
  • add a to_optional method to Column
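
A minimal sketch of the generic variant suggested above, under assumptions: the traits are reduced to a single `get_val` method, and `VecColumn` and `open` are hypothetical names. One deliberate deviation from the suggestion: `to_optional` is gated on `Self: Sized` instead of adding `Clone` as a supertrait, because a `Clone` supertrait would make `dyn Column` unusable.

```rust
use std::sync::Arc;

pub trait Column<T> {
    fn get_val(&self, idx: u32) -> T;

    /// Hypothetical helper: wraps a concrete column by value.
    /// `Self: Sized` keeps `Column` usable as a trait object.
    fn to_optional(self) -> ToOptionalColumn<Self>
    where
        Self: Sized,
    {
        ToOptionalColumn { column: self }
    }
}

pub trait OptionalColumn<T> {
    fn get_val(&self, idx: u32) -> Option<T>;
}

/// Generic wrapper: the inner call is resolved at compile time, so only
/// the outer `dyn OptionalColumn` call goes through a vtable.
pub struct ToOptionalColumn<C> {
    column: C,
}

impl<T, C: Column<T>> OptionalColumn<T> for ToOptionalColumn<C> {
    fn get_val(&self, idx: u32) -> Option<T> {
        // Statically dispatched call into the wrapped column.
        Some(self.column.get_val(idx))
    }
}

/// Hypothetical in-memory column used for illustration.
pub struct VecColumn(pub Vec<u64>);

impl Column<u64> for VecColumn {
    fn get_val(&self, idx: u32) -> u64 {
        self.0[idx as usize]
    }
}

/// Callers still receive a trait object, but with one level of indirection.
pub fn open(vals: Vec<u64>) -> Arc<dyn OptionalColumn<u64>> {
    Arc::new(VecColumn(vals).to_optional())
}
```

With the `Arc<dyn Column<T>>` field from the diff, a call on `Arc<dyn OptionalColumn<T>>` first dispatches to the wrapper and then again to the wrapped column; with the generic field, the second call can be inlined.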

Contributor Author

I started with a generic, but ran into compiler issues with the to_full method: it requires the additional trait bounds Clone + 'static. I chose the easy solution since it's only temporary.

impl<C: Column + Clone + 'static> OptionalColumn for ToOptionalColumn<C>{
    fn to_full(&self) -> Arc<dyn Column>{
        Arc::new(self.underlying.clone())
    }
}
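
For reference, a self-contained version of the fragment above that compiles on its own. It is a sketch under assumptions: the traits are cut down to the methods needed here, the value type parameter `T` is added explicitly, and `VecColumn` is a hypothetical test column. The comments spell out where each extra bound comes from.

```rust
use std::sync::Arc;

pub trait Column<T> {
    fn get_val(&self, idx: u32) -> T;
}

pub trait OptionalColumn<T> {
    fn get_val(&self, idx: u32) -> Option<T>;
    /// Converts back to a full column (every doc has a value).
    fn to_full(&self) -> Arc<dyn Column<T>>;
}

/// Generic wrapper holding the concrete column by value.
pub struct ToOptionalColumn<C> {
    underlying: C,
}

impl<T, C> OptionalColumn<T> for ToOptionalColumn<C>
where
    C: Column<T> + Clone + 'static,
{
    fn get_val(&self, idx: u32) -> Option<T> {
        Some(self.underlying.get_val(idx))
    }

    fn to_full(&self) -> Arc<dyn Column<T>> {
        // `Clone`: we only have `&self`, so the inner column must be copied
        // before it can be moved into a new `Arc`.
        // `'static`: `Arc<dyn Column<T>>` defaults to `dyn Column<T> + 'static`,
        // so the concrete type may not hold non-static borrows.
        Arc::new(self.underlying.clone())
    }
}

/// Hypothetical in-memory column used for illustration.
#[derive(Clone)]
pub struct VecColumn(pub Vec<u64>);

impl Column<u64> for VecColumn {
    fn get_val(&self, idx: u32) -> u64 {
        self.0[idx as usize]
    }
}
```

Note that `to_full` clones the whole wrapped column here; for a column backed by shared or memory-mapped data that clone would presumably be cheap, but for the `Vec`-backed test column it copies every value.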

PSeitz and others added 2 commits November 21, 2022 03:42
Co-authored-by: Paul Masurel <paul@quickwit.io>
@PSeitz PSeitz closed this Mar 16, 2023