-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for encoding directives, arg groups, length-prefixed e-expressions #796
Conversation
* Adds support for switching encodings midstream * Adds support for parameter cardinality modifiers * Adds support for reading argument groups * Adds support for length-prefixed e-expressions * Optimizes and simplifies macro evaluation * Simplifies struct expansion logic * Adds static analysis optimizations to template evaluation * Adds support for lazily evaluating top-level e-expressions * Adds more Ion 1.1 encodings to the `read_many_structs` benchmark * Shrinks many data types Switching encodings midstream, initial support for encoding directives ParameterCardinality, transcribe raw binary 1.1 to text, version switching refactor transcription, groundwork for variadic params Adds non-required cardinalities Read support for binary arg grouping bitmap binary reader support for arg groups minor optimizations allows benchmarks to manually register templates reworked e-expr parsing Binary Ion v1.1 uses &'top for value, eexp types minor optimizations 2 clippy suggestions per-type read paths for string, int better Reworks EncodedValue offset calculation to save on size Adds support for length-prefixed e-exprs simplified struct iteration, rewrote eexp parsing further streamlining of read_sequence_value_expr Optimization for singleton template macros Reorganizes read_many_structs benchmark OneShotMacroEvaluator, EExpArgIterator trait, fixed ion-tests integration upgradeable MacroEvaluator (OneShot -> Stacked) Using new macro evaluator at root level Struct expansion uses new macro evaluator Optimizations for macro expansion * Adds `read_resolved` method to the LazyExpandedValue level so all Ion-type specific logic can be handled once instead of twice. * `MacroExpansion::initialize` now delegates the entire implementation the invocation's source kinds instead of dispatching on every method call. * Universal singleton macro expansion optimization support. Revised TemplateCompiler storage Shrank macro evaluator sizes simplified macro evaluation logic +"optimal path" eexp benchmark, more method inlining s/quote/literal/, clean up read_many_structs benchmark remove erroneous profile json Adds try_next! macro for Option<Result<_>> early returns cleanup clippy suggestions fix doc links more clippy suggestions
Miri is failing due to #794. I'd like to investigate that after this PR is merged. |
1ca6a53
to
7655771
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ PR Tour 🧭
@@ -21,7 +21,7 @@ edition = "2021" | |||
# We need at least 1.65 for GATs[1] and 1.67 for `ilog`[2] | |||
# [1] https://blog.rust-lang.org/2022/11/03/Rust-1.65.0.html | |||
# [2] https://blog.rust-lang.org/2023/01/26/Rust-1.67.0.html#stabilized-apis | |||
rust-version = "1.67" | |||
rust-version = "1.77" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Bumping this again while we still can--1.77 added some nice const
generics functionality for breaking apart byte arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while we still can
There's a cfg_version RFC that will help with this.
Also, if we haven't already done so, we should define a MSRV policy for this crate.
@@ -3,27 +3,262 @@ use criterion::{criterion_group, criterion_main}; | |||
#[cfg(not(feature = "experimental"))] | |||
mod benchmark { | |||
use criterion::Criterion; | |||
|
|||
pub fn criterion_benchmark(_c: &mut Criterion) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ I've added a variety of binary 1.1 encodings to the read_many_structs
benchmark.
let text_1_1_data = r#"(:event 1670446800245 418 "scheduler-thread-6" "example-client-1" "aws-us-east-5f-18b4fa" (: "region 4" "2022-12-07T20:59:59.744000Z"))"#.repeat(num_values); | ||
let mut binary_1_1_data = vec![0xE0u8, 0x01, 0x01, 0xEA]; // IVM | ||
#[rustfmt::skip] | ||
let mut binary_1_1_data_body: Vec<u8> = vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ The binary writer doesn't yet support argument groups, so I've hand-encoded the binary for now.
@@ -126,13 +134,23 @@ impl<'top> HasRange for LazyRawAnyVersionMarker<'top> { | |||
} | |||
|
|||
impl<'top> RawVersionMarker<'top> for LazyRawAnyVersionMarker<'top> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ In order to support switching encodings mid-stream, I reworked the RawVersionMarker
trait and its impls. I'll call it out again further on. The version()
method now returns an IonVersion
enum, while major_minor()
returns the (major, minor)
version tuple.
use crate::{Decoder, HasRange, HasSpan, IonResult, LazyExpandedValue, Span}; | ||
|
||
#[derive(Copy, Clone, Debug)] | ||
pub struct EExpArg<'top, D: Decoder> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ These encoding-agnostic types should also be in their own file/module.
"Ion version {major}.{minor} is not supported" | ||
)); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ This is now supported; the logic lives in the LazyAnyRawReader
.
} | ||
}; | ||
Ok(value_ref) | ||
self.expanded_value.read_resolved() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ As described earlier, putting the resolution logic closer to the read logic allowed the compiler to make much better optimizations.
lazy::expanded::compiler::TemplateCompiler, | ||
lazy::expanded::template::TemplateMacro, | ||
lazy::expanded::template::TemplateBodyExpr, | ||
lazy::expanded::template::TemplateBodyExprKind, | ||
lazy::expanded::macro_table::Macro, | ||
lazy::expanded::macro_evaluator::MacroEvaluator, | ||
lazy::expanded::macro_evaluator::MacroExpansionKind, | ||
lazy::expanded::macro_table::MacroKind, | ||
lazy::expanded::macro_table::MacroTable, | ||
lazy::expanded::EncodingContext, | ||
lazy::any_encoding::IonVersion, | ||
lazy::binary::raw::reader::LazyRawBinaryReader_1_0, | ||
lazy::binary::raw::v1_1::reader::LazyRawBinaryReader_1_1, | ||
lazy::expanded::macro_evaluator::RawEExpression, | ||
lazy::expanded::macro_evaluator::ValueExpr, | ||
lazy::expanded::macro_evaluator::MacroExpr, | ||
lazy::expanded::macro_evaluator::MacroExprKind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Additional types exposed behind the experimental-reader-writer
feature.
.map(|p| p.replace('/', &PATH_SEPARATOR.to_string())) | ||
.map(|p| p.replace('/', PATH_SEPARATOR)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ (wise) clippy suggestion.
@@ -21,7 +21,7 @@ edition = "2021" | |||
# We need at least 1.65 for GATs[1] and 1.67 for `ilog`[2] | |||
# [1] https://blog.rust-lang.org/2022/11/03/Rust-1.65.0.html | |||
# [2] https://blog.rust-lang.org/2023/01/26/Rust-1.67.0.html#stabilized-apis | |||
rust-version = "1.67" | |||
rust-version = "1.77" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while we still can
There's a cfg_version RFC that will help with this.
Also, if we haven't already done so, we should define a MSRV policy for this crate.
} | ||
|
||
#[derive(Debug, Default, Copy, Clone, PartialEq)] | ||
pub enum IonVersion { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a common/utility type that should be exported from the crate root, and maybe even be defined in the root module so that it's easily discoverable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is re-exported at the root, but I agree it should live in its own module and/or at the root. I'd like to leave that for another PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving so that we can review PR feedback changes in smaller PRs.
read_many_structs
benchmark