docs(common): add more docs for DataChunk (risingwavelabs#8736)
kwannoel authored Mar 23, 2023
1 parent f6ccfd5 commit 96aa23d
Showing 4 changed files with 49 additions and 7 deletions.
14 changes: 12 additions & 2 deletions src/common/src/array/column.rs
@@ -20,8 +20,18 @@ use risingwave_pb::data::PbColumn;
use super::{Array, ArrayError, ArrayResult, I64Array};
use crate::array::{ArrayImpl, ArrayRef};

/// Column is owned by `DataChunk`. It consists of logic data type and physical array
/// implementation.
/// A [`Column`] consists of its logical data type
/// and its corresponding physical array implementation.
/// The array contains all the datums bound to this [`Column`].
/// [`Column`] is owned by [`DataChunk`].
///
/// For instance, in this [`DataChunk`],
/// for column `v1`, [`ArrayRef`] will contain: [1,1,1]
/// | v1 | v2 |
/// |----|----|
/// | 1 | a |
/// | 1 | b |
/// | 1 | c |
#[derive(Clone, Debug, PartialEq)]
pub struct Column {
array: ArrayRef,
31 changes: 29 additions & 2 deletions src/common/src/array/data_chunk.rs
@@ -34,7 +34,23 @@ use crate::util::hash_util::finalize_hashers;
use crate::util::iter_util::{ZipEqDebug, ZipEqFast};
use crate::util::value_encoding::{serialize_datum_into, ValueRowSerializer};

/// `DataChunk` is a collection of arrays with visibility mask.
/// [`DataChunk`] is a collection of Columns,
/// with a visibility mask for each row.
/// For instance, we could have a [`DataChunk`] of this format.
/// | v1 | v2 | v3 |
/// |----|----|----|
/// | 1 | a | t |
/// | 2 | b | f |
/// | 3 | c | t |
/// | 4 | d | f |
///
/// Our columns are v1, v2, v3.
/// Then, if the visibility mask hides rows 2 and 4,
/// we will only have these rows visible:
/// | v1 | v2 | v3 |
/// |----|----|----|
/// | 1 | a | t |
/// | 3 | c | t |
#[derive(Clone, PartialEq)]
#[must_use]
pub struct DataChunk {
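To make the visibility-mask example in the doc comment concrete, here is a minimal standalone sketch. `SimpleChunk`, `visible_rows`, and `example_chunk` are hypothetical names invented for illustration; they are not RisingWave's `DataChunk` API, and a plain `Vec<bool>` is assumed in place of the real bitmap type.

```rust
/// Simplified stand-in for a chunk: column-major data plus a per-row mask.
/// Illustrative only; this is not RisingWave's actual definition.
#[derive(Debug, Clone, PartialEq)]
struct SimpleChunk {
    /// One `Vec` per column; all columns have the same length.
    columns: Vec<Vec<String>>,
    /// `visibility[i] == true` means row `i` is visible.
    visibility: Vec<bool>,
}

impl SimpleChunk {
    /// Returns only the visible rows, each as a `Vec` of cell values.
    fn visible_rows(&self) -> Vec<Vec<String>> {
        (0..self.visibility.len())
            .filter(|&i| self.visibility[i])
            .map(|i| self.columns.iter().map(|col| col[i].clone()).collect())
            .collect()
    }
}

/// Builds the four-row chunk from the doc comment and hides rows 2 and 4.
fn example_chunk() -> SimpleChunk {
    // Helper that turns string literals into an owned column.
    fn col(vals: &[&str]) -> Vec<String> {
        vals.iter().map(|s| s.to_string()).collect()
    }
    SimpleChunk {
        columns: vec![
            col(&["1", "2", "3", "4"]), // v1
            col(&["a", "b", "c", "d"]), // v2
            col(&["t", "f", "t", "f"]), // v3
        ],
        visibility: vec![true, false, true, false],
    }
}
```

Calling `example_chunk().visible_rows()` yields the two rows `(1, a, t)` and `(3, c, t)`, matching the second table in the comment. The hidden rows stay in memory; they are only skipped on read.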
@@ -170,7 +186,18 @@ impl DataChunk {
}

/// `compact` will convert the chunk to compact format.
/// Compact format means that `visibility == None`.
/// Compacting removes the hidden rows, and returns a chunk whose
/// visibility mask marks every remaining row as visible.
///
/// `compact` has trade-offs:
///
/// Cost:
/// It has to rebuild each column, meaning it will incur the cost
/// of copying bytes from the original column array to the new one.
///
/// Benefit:
/// The main benefit is that the data chunk is smaller, taking up less memory.
/// We can also save the cost of iterating over many hidden rows.
pub fn compact(self) -> Self {
match &self.vis2 {
Vis::Compact(_) => self,
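The trade-off described in the doc comment can be sketched on a single column. `MaskedColumn` is a hypothetical type, and using `Option<Vec<bool>>` with `None` for the compact case is an assumption made for brevity; the real code uses the `Vis` enum.

```rust
/// Hypothetical single-column chunk. `None` models the compact case,
/// in which every stored row is visible.
#[derive(Debug, Clone, PartialEq)]
struct MaskedColumn {
    values: Vec<i64>,
    visibility: Option<Vec<bool>>,
}

impl MaskedColumn {
    /// Drops hidden rows and returns a chunk in compact form.
    /// An already compact chunk passes through without any copying.
    fn compact(self) -> Self {
        let MaskedColumn { values, visibility } = self;
        let values = match visibility {
            // Already compact: reuse the existing buffer.
            None => values,
            // Cost: rebuild the column, keeping only the visible values.
            Some(mask) => values
                .into_iter()
                .zip(mask)
                .filter(|(_, visible)| *visible)
                .map(|(value, _)| value)
                .collect(),
        };
        // Benefit: fewer stored rows and no mask left to consult.
        MaskedColumn { values, visibility: None }
    }
}
```

In this sketch, compacting `[1, 2, 3, 4]` with rows 2 and 4 hidden leaves `[1, 3]`. The rebuild is linear in the number of stored rows, which is the copying cost the comment warns about.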
3 changes: 2 additions & 1 deletion src/common/src/array/mod.rs
@@ -496,7 +496,8 @@ macro_rules! impl_array_builder {
}
}

/// Append a [`Datum`] or [`DatumRef`] multiple times, return error while type not match.
/// Append a [`Datum`] or [`DatumRef`] multiple times,
/// panicking if the datum's type does not match the array builder's type.
pub fn append_datum_n(&mut self, n: usize, datum: impl ToDatumRef) {
match datum.to_datum_ref() {
None => match self {
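The corrected comment says a type mismatch panics rather than returning an error. The sketch below illustrates that contract with hypothetical two-variant types, `SimpleDatum` and `SimpleBuilder`; the real builder is generated by a macro over many array types, and passing `Option<&SimpleDatum>` is an assumption standing in for `impl ToDatumRef`.

```rust
/// Hypothetical datum type, reduced to two variants for illustration.
#[derive(Debug, Clone, PartialEq)]
enum SimpleDatum {
    Int(i64),
    Text(String),
}

/// Hypothetical builder: one buffer of nullable values per supported type.
#[derive(Debug, PartialEq)]
enum SimpleBuilder {
    Int(Vec<Option<i64>>),
    Text(Vec<Option<String>>),
}

impl SimpleBuilder {
    /// Appends `datum` `n` times. `None` appends nulls of the builder's type.
    /// Panics if the datum's variant does not match the builder's variant.
    fn append_datum_n(&mut self, n: usize, datum: Option<&SimpleDatum>) {
        match (self, datum) {
            (SimpleBuilder::Int(buf), None) => buf.extend(std::iter::repeat(None).take(n)),
            (SimpleBuilder::Text(buf), None) => buf.extend(std::iter::repeat(None).take(n)),
            (SimpleBuilder::Int(buf), Some(SimpleDatum::Int(v))) => {
                buf.extend(std::iter::repeat(Some(*v)).take(n))
            }
            (SimpleBuilder::Text(buf), Some(SimpleDatum::Text(s))) => {
                buf.extend(std::iter::repeat(Some(s.clone())).take(n))
            }
            // Any remaining combination is a type mismatch.
            (builder, Some(other)) => {
                panic!("type mismatch: cannot append {:?} to {:?}", other, builder)
            }
        }
    }
}
```

Because the function returns `()`, callers cannot recover from a mismatch; under this design they are expected to guarantee matching types before calling it.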
8 changes: 6 additions & 2 deletions src/common/src/array/vis.rs
@@ -17,11 +17,15 @@ use itertools::repeat_n;

use crate::buffer::{Bitmap, BitmapBuilder};

/// `Vis` is a visibility bitmap of rows. When all rows are visible, it is considered compact and
/// is represented by a single cardinality number rather than that many of ones.
/// `Vis` is a visibility bitmap of rows.
#[derive(Clone, PartialEq, Debug)]
pub enum Vis {
/// Non-compact variant.
/// Certain rows are hidden using this bitmap.
Bitmap(Bitmap),

/// Compact variant which just stores cardinality of rows.
/// This can be used when all rows are visible.
Compact(usize), // equivalent to all ones of this size
}
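A rough standalone sketch of why the compact variant only needs a row count. `SimpleVis` and its methods are hypothetical, and `Vec<bool>` is assumed in place of RisingWave's `Bitmap`.

```rust
/// Illustrative two-variant visibility type built on the standard library.
#[derive(Debug, Clone, PartialEq)]
enum SimpleVis {
    /// Non-compact: one flag per row, `false` hides the row.
    Bitmap(Vec<bool>),
    /// Compact: all rows visible, so only the row count is stored.
    Compact(usize),
}

impl SimpleVis {
    /// Total number of rows covered, visible or not.
    fn len(&self) -> usize {
        match self {
            SimpleVis::Bitmap(bits) => bits.len(),
            SimpleVis::Compact(n) => *n,
        }
    }

    /// Whether row `idx` is visible; out-of-range rows count as hidden.
    fn is_visible(&self, idx: usize) -> bool {
        match self {
            SimpleVis::Bitmap(bits) => bits.get(idx).copied().unwrap_or(false),
            SimpleVis::Compact(n) => idx < *n,
        }
    }

    /// Number of visible rows.
    fn visible_count(&self) -> usize {
        match self {
            SimpleVis::Bitmap(bits) => bits.iter().filter(|&&b| b).count(),
            SimpleVis::Compact(n) => *n,
        }
    }
}
```

Under these assumptions, `SimpleVis::Compact(3)` behaves like a three-row bitmap of all ones while storing a single integer, and its `visible_count` needs no scan.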

