feat: mvp for lance version 0.2 reader / writer (#1965)
The motivation and bigger picture are covered in more detail in
#1929

This PR builds on top of #1918 and
#1964 to create a new version of
the Lance file format.

There is still much to do, but this end-to-end MVP should provide the
overall structure for the work.

It can currently read and write primitive columns and list columns and
supports some very basic encodings.
westonpace authored Apr 9, 2024
1 parent cd7f274 commit 3ac0074
Showing 6 changed files with 830 additions and 0 deletions.
16 changes: 16 additions & 0 deletions protos/file.proto
@@ -16,6 +16,22 @@ syntax = "proto3";

package lance.file;

// A file descriptor that describes the contents of a Lance file
message FileDescriptor {
  // The schema of the file
  Schema schema = 1;
  // The number of rows in the file
  uint64 length = 2;
}

// A schema which describes the data type of each of the columns
message Schema {
  // All fields in this file, including the nested fields.
  repeated lance.file.Field fields = 1;
  // Schema metadata.
  map<string, bytes> metadata = 5;
}
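
To make the shape of these two messages concrete, here is a minimal Rust sketch of plain structs that mirror them. It is not the generated code from this commit. The `Field` struct is a hypothetical placeholder for `lance.file.Field` (defined elsewhere in the proto file), and `describe` is a helper invented for illustration. The field types follow the usual proto3-to-Rust mapping (message as `Option`, `repeated` as `Vec`, `map<string, bytes>` as `HashMap<String, Vec<u8>>`), which is an assumption about how the bindings look.

```rust
use std::collections::HashMap;

/// Hypothetical placeholder for `lance.file.Field`; the real message has more fields.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Field {
    pub name: String,
}

/// Mirrors `message Schema`: all fields (including nested ones) plus metadata.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Schema {
    pub fields: Vec<Field>,
    pub metadata: HashMap<String, Vec<u8>>,
}

/// Mirrors `message FileDescriptor`: a schema and the number of rows in the file.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct FileDescriptor {
    pub schema: Option<Schema>,
    pub length: u64,
}

/// Illustrative helper (not part of the commit): build a descriptor
/// from a list of column names and a row count.
pub fn describe(field_names: &[&str], num_rows: u64) -> FileDescriptor {
    let fields = field_names
        .iter()
        .map(|name| Field { name: name.to_string() })
        .collect();
    FileDescriptor {
        schema: Some(Schema { fields, metadata: HashMap::new() }),
        length: num_rows,
    }
}
```

For example, `describe(&["id", "vector"], 100)` yields a descriptor whose `length` is 100 and whose schema lists two fields.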

// Metadata of one Lance file.
message Metadata {
  // 4 was used for StatisticsMetadata in the past, but has been moved to prevent
2 changes: 2 additions & 0 deletions rust/lance-file/Cargo.toml
@@ -23,9 +23,11 @@ arrow-schema.workspace = true
arrow-select.workspace = true
async-recursion.workspace = true
async-trait.workspace = true
byteorder.workspace = true
bytes.workspace = true
datafusion-common.workspace = true
futures.workspace = true
lance-datagen.workspace = true
num_cpus.workspace = true
num-traits.workspace = true
object_store.workspace = true
1 change: 1 addition & 0 deletions rust/lance-file/src/format.rs
@@ -29,4 +29,5 @@ pub mod metadata;
/// These version/magic values are written at the end of Lance files (e.g. versions/1.version)
pub const MAJOR_VERSION: i16 = 0;
pub const MINOR_VERSION: i16 = 2;
pub const MINOR_VERSION_NEXT: u16 = 3;
pub const MAGIC: &[u8; 4] = b"LANC";
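
The doc comment above says the version and magic values are written at the end of Lance files. The following is a rough sketch of what that could look like, not the writer added in this commit. The byte layout (major as little-endian `i16`, then minor as little-endian `i16`, then the 4-byte magic) is an assumption made for illustration, and the function names `write_version_and_magic` and `read_version` are hypothetical.

```rust
pub const MAJOR_VERSION: i16 = 0;
pub const MINOR_VERSION: i16 = 2;
pub const MAGIC: &[u8; 4] = b"LANC";

/// Append the trailing bytes to `buf`.
/// Assumed layout (illustration only): major (i16 LE) | minor (i16 LE) | magic.
pub fn write_version_and_magic(buf: &mut Vec<u8>) {
    buf.extend_from_slice(&MAJOR_VERSION.to_le_bytes());
    buf.extend_from_slice(&MINOR_VERSION.to_le_bytes());
    buf.extend_from_slice(MAGIC);
}

/// Read `(major, minor)` from the last 8 bytes of `bytes`.
/// Returns `None` if the input is too short or the magic does not match.
pub fn read_version(bytes: &[u8]) -> Option<(i16, i16)> {
    if bytes.len() < 8 {
        return None;
    }
    let tail = &bytes[bytes.len() - 8..];
    if tail[4..] != MAGIC[..] {
        return None;
    }
    let major = i16::from_le_bytes([tail[0], tail[1]]);
    let minor = i16::from_le_bytes([tail[2], tail[3]]);
    Some((major, minor))
}
```

A reader built this way can check the magic first and then dispatch on the minor version, which is one way a single code path could tell files from the existing format apart from the new one.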
2 changes: 2 additions & 0 deletions rust/lance-file/src/v2.rs
@@ -1 +1,3 @@
pub(crate) mod io;
pub mod reader;
pub mod writer;