feat: add bit-width, cardinality and data-size to datablock statistics (#2986)

The statistics here are different from Arrow array statistics.

One way to think about the concept here is that the `array statistics`
are `logical statistics` and the `datablock statistics` are `physical
statistics`.

Datablock statistics are data-type agnostic. They aim to facilitate
encoding selection and to provide a centralized calculation of encoding
parameters.
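
As a rough illustration of what "physical statistics" could look like for a block of fixed-width values, here is a minimal sketch. The names `BlockStats` and `compute_stats` are hypothetical and are not taken from this commit; the sketch also assumes `u64` values and a minimum bit-width of 1, which may differ from what the actual implementation does.

```rust
use std::collections::HashSet;

/// Hypothetical container for the three statistics named in the commit title.
/// This is an illustration only, not the type added by the commit.
#[derive(Debug, PartialEq, Eq)]
pub struct BlockStats {
    /// Number of bits needed to represent the largest value in the block.
    pub bit_width: u32,
    /// Number of distinct values in the block.
    pub cardinality: usize,
    /// Size of the raw, unencoded values in bytes.
    pub data_size: usize,
}

/// Computes the statistics in one place so that several encoders could
/// reuse them instead of each rescanning the data.
pub fn compute_stats(values: &[u64]) -> BlockStats {
    // OR-ing all values keeps the highest set bit of any value in the block.
    let combined = values.iter().fold(0u64, |acc, v| acc | v);
    // Assumption: report at least 1 bit, even for an all-zero or empty block.
    let bit_width = (u64::BITS - combined.leading_zeros()).max(1);
    // Exact distinct count; a real implementation might approximate this.
    let cardinality = values.iter().collect::<HashSet<_>>().len();
    let data_size = values.len() * std::mem::size_of::<u64>();
    BlockStats { bit_width, cardinality, data_size }
}
```

An encoder could, for example, compare `bit_width` against the native width to decide whether bit-packing is worthwhile, or compare `cardinality` against the value count to decide on dictionary encoding. Those decision rules are illustrative assumptions as well.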
broccoliSpicy authored Oct 14, 2024
1 parent d207aa8 commit 8f95fbe
Showing 17 changed files with 1,560 additions and 14 deletions.
2 changes: 1 addition & 1 deletion rust/lance-datagen/src/generator.rs
@@ -373,7 +373,7 @@ where
}

#[derive(Copy, Clone, Debug)]
-pub struct Seed(u64);
+pub struct Seed(pub u64);
pub const DEFAULT_SEED: Seed = Seed(42);

impl From<u64> for Seed {
1 change: 1 addition & 0 deletions rust/lance-encoding/Cargo.toml
@@ -50,6 +50,7 @@ rand.workspace = true
tempfile.workspace = true
test-log.workspace = true
criterion = { workspace = true }
+rand_xoshiro = "0.6.0"

[build-dependencies]
prost-build.workspace = true