Source files: `common/src/types`
RisingWave adopts a relational data model with extensive support for semi-structured data. Relations, including tables and materialized views, consist of a list of named, strongly typed columns.
Tables created by users have an implicit, auto-generated row-id column as their primary key. For materialized views, the primary key is derived from the query. For example, the primary key of an aggregation (group-by) materialized view is the specified group keys.
`NULL` values mean missing or unknown fields. Currently, all columns are implicitly nullable.
Primitive data types:
- Booleans: `BOOLEAN`
- Integers: `SMALLINT` (16-bit), `INT` (32-bit), `BIGINT` (64-bit)
- Decimals: `NUMERIC`
- Floating-point numbers: `REAL`, `DOUBLE`
- Strings: `VARCHAR`
- Temporals: `DATE`, `TIMESTAMP`, `TIMESTAMP WITH TIME ZONE`, `TIME`, `INTERVAL`
Composite data types (WIP):
- `Struct`: A structure with a list of named, strongly typed fields.
- `List`: A variable-length list of values with the same data type.
Source files: `common/src/array`
In-memory data is encoded in arrays for vectorized execution. For variable-length data like strings, we generally use a separate offset array to mark the start of each encoded value in a byte buffer.
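The offset-array layout described above can be sketched as follows. This is a minimal illustration, not the code in `common/src/array`: the type `Utf8ArraySketch` and its methods are hypothetical names, the extra trailing offset (so that each value's end is the next value's start) is an assumption, and `NULL` handling is left out.

```rust
/// Hypothetical, simplified string array: all values share one byte buffer,
/// and `offsets[i]` marks where value `i` starts.
/// Assumption: one extra trailing offset is stored, so value `i` spans
/// `offsets[i]..offsets[i + 1]`.
pub struct Utf8ArraySketch {
    offsets: Vec<usize>,
    data: Vec<u8>,
}

impl Utf8ArraySketch {
    /// Creates an empty array; the single offset `0` is the start of the
    /// first value that will be appended.
    pub fn new() -> Self {
        Self { offsets: vec![0], data: Vec::new() }
    }

    /// Appends one value by copying its bytes and recording the new end offset.
    pub fn push(&mut self, value: &str) {
        self.data.extend_from_slice(value.as_bytes());
        self.offsets.push(self.data.len());
    }

    /// Number of values stored.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Returns value `idx`, or `None` if `idx` is out of bounds.
    pub fn get(&self, idx: usize) -> Option<&str> {
        if idx >= self.len() {
            return None;
        }
        let bytes = &self.data[self.offsets[idx]..self.offsets[idx + 1]];
        std::str::from_utf8(bytes).ok()
    }
}
```

Pushing `"hello"`, `""` and `"rw"` into this sketch leaves a 7-byte buffer and the offsets `[0, 5, 5, 7]`, so reading a value is a slice of the buffer rather than a per-value allocation.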
A Data Chunk consists of multiple columns and a visibility array that marks each row as visible or not. This makes it possible to filter out rows while leaving the data arrays unchanged.
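As a rough sketch of why the visibility array helps, the example below filters a chunk by flipping visibility flags only. Everything here is hypothetical: `DataChunkSketch` has two hard-coded columns and uses a `Vec<bool>` for visibility, which is an assumption made for readability rather than a description of the real layout.

```rust
/// Hypothetical, simplified data chunk with two fixed columns.
pub struct DataChunkSketch {
    ids: Vec<i64>,
    names: Vec<String>,
    /// `visibility[i] == true` means row `i` is visible.
    visibility: Vec<bool>,
}

impl DataChunkSketch {
    /// Builds a chunk in which every row starts out visible.
    pub fn new(ids: Vec<i64>, names: Vec<String>) -> Self {
        assert_eq!(ids.len(), names.len(), "columns must have equal length");
        let visibility = vec![true; ids.len()];
        Self { ids, names, visibility }
    }

    /// Applies a filter by clearing visibility flags; column data is untouched.
    pub fn filter(&mut self, predicate: impl Fn(i64, &str) -> bool) {
        for i in 0..self.ids.len() {
            if self.visibility[i] && !predicate(self.ids[i], &self.names[i]) {
                self.visibility[i] = false;
            }
        }
    }

    /// Collects the rows that are still visible.
    pub fn visible_rows(&self) -> Vec<(i64, &str)> {
        (0..self.ids.len())
            .filter(|&i| self.visibility[i])
            .map(|i| (self.ids[i], self.names[i].as_str()))
            .collect()
    }

    /// Number of rows physically stored, visible or not.
    pub fn capacity(&self) -> usize {
        self.ids.len()
    }
}
```

After filtering, `capacity()` is unchanged while `visible_rows()` shrinks, which is the point of the design: a filter costs one pass over the flags and no copying of column data.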
A Stream Chunk consists of columns, a visibility array and an additional `ops` column that marks the operation of each row, which can be one of `Delete`, `Insert`, `UpdateDelete` and `UpdateInsert`.
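To illustrate how the per-row operations might be consumed, here is a small sketch that applies a stream chunk to a keyed state. The names `Op`, `StreamChunkSketch` and `apply_chunk` are hypothetical, and the sketch assumes that an update arrives as an `UpdateDelete` row carrying the old value followed by an `UpdateInsert` row carrying the new one.

```rust
use std::collections::BTreeMap;

/// The four row operations named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Delete,
    Insert,
    UpdateDelete,
    UpdateInsert,
}

/// Hypothetical, simplified stream chunk with a key column and a value column.
pub struct StreamChunkSketch {
    pub ops: Vec<Op>,
    pub keys: Vec<i64>,
    pub values: Vec<i64>,
    pub visibility: Vec<bool>,
}

/// Applies every visible row of `chunk` to `state`, in row order.
/// Assumption: an update is an `UpdateDelete` followed by an `UpdateInsert`.
pub fn apply_chunk(state: &mut BTreeMap<i64, i64>, chunk: &StreamChunkSketch) {
    for i in 0..chunk.ops.len() {
        if !chunk.visibility[i] {
            continue; // invisible rows are skipped, as in a data chunk
        }
        match chunk.ops[i] {
            Op::Insert | Op::UpdateInsert => {
                state.insert(chunk.keys[i], chunk.values[i]);
            }
            Op::Delete | Op::UpdateDelete => {
                state.remove(&chunk.keys[i]);
            }
        }
    }
}
```

In this sketch the two update variants behave like a plain delete and insert. Keeping them distinct lets a consumer recognize that the two rows belong to one logical update.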
Source files: `utils/memcomparable`, `utils/value-encoding`
RisingWave stores user data in a shared key-value storage called 'Hummock'. Tables, materialized views and checkpoints of internal streaming operators are encoded into key-value entries. Every field of a row (a.k.a. cell) is encoded as a key-value entry, except that `NULL` values are omitted.
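The cell-per-entry idea can be sketched as below. The key shape used here (a row identifier plus a column index) is purely an assumption for illustration, and `CellEntry` and `encode_row_as_cells` are hypothetical names. The actual key and value byte formats live in the source files listed above.

```rust
/// One key-value entry per non-NULL cell.
#[derive(Debug, PartialEq, Eq)]
pub struct CellEntry {
    /// Hypothetical key part: identifies the row.
    pub row_key: u64,
    /// Hypothetical key part: identifies the column within the row.
    pub column: usize,
    /// Encoded cell value (here simply the UTF-8 bytes of the text).
    pub value: Vec<u8>,
}

/// Turns one row into key-value entries, skipping NULL (`None`) cells.
pub fn encode_row_as_cells(row_key: u64, row: &[Option<&str>]) -> Vec<CellEntry> {
    let mut cells = Vec::new();
    for (column, cell) in row.iter().enumerate() {
        if let Some(text) = cell {
            cells.push(CellEntry {
                row_key,
                column,
                value: text.as_bytes().to_vec(),
            });
        }
    }
    cells
}
```

A three-column row with a `NULL` in the middle produces two entries, for columns 0 and 2. The absence of an entry for column 1 is what represents the `NULL`.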
Because ordering matters in some cases, such as the result set of an order-by query, fields of keys must preserve the order of the original values after being encoded into bytes. This is what `memcomparable` is for. For example, integers must be encoded in big-endian and the sign bit must be flipped to preserve order. In contrast, the encoding of values does not need to preserve order.
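The integer rule above can be checked with a short sketch for 32-bit signed integers. The function names are hypothetical and this is not the `utils/memcomparable` API. It only demonstrates the technique: flip the sign bit, then emit big-endian bytes so that byte-wise comparison matches numeric comparison.

```rust
/// Encodes an `i32` so that comparing the resulting bytes lexicographically
/// gives the same order as comparing the original integers.
pub fn encode_i32_memcomparable(v: i32) -> [u8; 4] {
    // Flipping the sign bit maps i32::MIN..=i32::MAX onto 0..=u32::MAX
    // monotonically; big-endian puts the most significant byte first.
    ((v as u32) ^ 0x8000_0000).to_be_bytes()
}

/// Inverse of `encode_i32_memcomparable`.
pub fn decode_i32_memcomparable(bytes: [u8; 4]) -> i32 {
    (u32::from_be_bytes(bytes) ^ 0x8000_0000) as i32
}
```

With this encoding, `-1` becomes `7f ff ff ff` and `0` becomes `80 00 00 00`, so `-1` sorts before `0`. Without the sign-bit flip, the big-endian two's-complement bytes of `-1` (`ff ff ff ff`) would sort after those of every non-negative integer.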