Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduces an application-level lazy writer #745

Merged
merged 3 commits into from
Apr 22, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions benches/write_many_structs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ fn write_struct_with_string_values(value_writer: impl ValueWriter) -> IonResult<
black_box("2022-12-07T20:59:59.744000Z"),
],
)?;
struct_.end()
struct_.close()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This was a holdover naming consistency fix that @popematt identified in #741.

}

fn write_struct_with_symbol_values(value_writer: impl ValueWriter) -> IonResult<()> {
Expand Down Expand Up @@ -75,7 +75,7 @@ fn write_struct_with_symbol_values(value_writer: impl ValueWriter) -> IonResult<
symbol_id(black_box(25)),
],
)?;
struct_.end()
struct_.close()
}

fn write_eexp_with_symbol_values(value_writer: impl ValueWriter) -> IonResult<()> {
Expand Down
2 changes: 1 addition & 1 deletion examples/write_log_events.rs
Original file line number Diff line number Diff line change
Expand Up @@ -198,7 +198,7 @@ mod example {
.write(14, RawSymbolToken::SymbolId(18))? // log level
.write(15, RawSymbolToken::SymbolId(19))? // format
.write(16, &event.parameters)?;
struct_.end()
struct_.close()
}
}

Expand Down
15 changes: 15 additions & 0 deletions src/element/writer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ use crate::ion_writer::IonWriter;
use crate::result::IonResult;
use crate::{Element, IonType, TextKind, Value};

use crate::lazy::encoding::BinaryEncoding_1_1;
#[cfg(feature = "experimental-lazy-reader")]
use {
crate::lazy::encoder::{LazyEncoder, LazyRawWriter},
Expand All @@ -18,6 +19,7 @@ use {
/// Writer configuration to provide format and Ion version details to writer through encoding
/// This will be used to create a writer without specifying which writer methods to use
#[cfg(feature = "experimental-lazy-reader")]
#[derive(Clone, Debug)]
pub struct WriteConfig<E: Encoding> {
pub(crate) kind: WriteConfigKind,
phantom_data: PhantomData<E>,
Expand Down Expand Up @@ -51,6 +53,16 @@ impl WriteConfig<BinaryEncoding_1_0> {
}
}

#[cfg(feature = "experimental-lazy-reader")]
impl WriteConfig<BinaryEncoding_1_1> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Prior to this, there was not a function to construct a default WriteConfig for the binary 1.1 writer.

pub fn new() -> Self {
Self {
kind: WriteConfigKind::Binary(BinaryWriteConfig),
phantom_data: Default::default(),
}
}
}

#[cfg(feature = "experimental-lazy-reader")]
impl Default for WriteConfig<BinaryEncoding_1_0> {
fn default() -> Self {
Expand All @@ -59,18 +71,21 @@ impl Default for WriteConfig<BinaryEncoding_1_0> {
}

/// Writer configuration type enum for text and binary configuration
#[derive(Clone, Debug)]
pub(crate) enum WriteConfigKind {
Text(TextWriteConfig),
Binary(BinaryWriteConfig),
}

/// Text writer configuration with text kind to be used to create a writer
#[derive(Clone, Debug)]
pub(crate) struct TextWriteConfig {
text_kind: TextKind,
}

/// Binary writer configuration to be used to create a writer
// TODO: Add appropriate binary configuration if required for 1.1
#[derive(Clone, Debug)]
pub(crate) struct BinaryWriteConfig;

/// Serializes [`Element`] instances into some kind of output sink.
Expand Down
39 changes: 19 additions & 20 deletions src/lazy/encoder/annotate.rs
Original file line number Diff line number Diff line change
@@ -1,30 +1,30 @@
use crate::lazy::encoder::value_writer::ValueWriter;
use crate::lazy::encoder::annotation_seq::AnnotationSeq;
use crate::lazy::encoder::value_writer::{AnnotatableWriter, ValueWriter};
use crate::lazy::encoder::write_as_ion::WriteAsIon;
use crate::raw_symbol_token_ref::AsRawSymbolTokenRef;
use crate::IonResult;

/// Associates a value to serialize with a sequence of annotations.
pub struct Annotated<'a, T: ?Sized, A> {
pub struct Annotated<'a, T: ?Sized, A: 'a> {
value: &'a T,
annotations: &'a [A],
annotations: A,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This trait was modified to use an implementation of AnnotationSeq, a new trait that is defined/described in the next file.


/// Provides implementors with an extension method ([`annotate`](Annotate::annotated_with)) that allows
/// Provides implementors with an extension method ([`annotate`](Annotatable::annotated_with)) that allows
/// them to be serialized with an associated sequence of annotations.
pub trait Annotate {
pub trait Annotatable {
/// Pairs a reference to the provided value with a slice containing annotations.
///
/// ```
///# use ion_rs::IonResult;
///# fn main() -> IonResult<()> {
/// use ion_rs::{Element, IonData};
/// use ion_rs::lazy::encoder::text::LazyRawTextWriter_1_0;
/// use ion_rs::lazy::encoder::annotate::Annotate;
/// use ion_rs::lazy::encoder::annotate::Annotatable;
///
/// let mut buffer = vec![];
/// let mut writer = LazyRawTextWriter_1_0::new(&mut buffer);
///
/// writer.write(42_usize.annotated_with(&["foo", "bar", "baz"]))?.flush()?;
/// writer.write(42_usize.annotated_with(["foo", "bar", "baz"]))?.flush()?;
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have to take ownership of the annotations now? Based on the trait impls for AnnotationSequence, it looks like it should still work to borrow the slice here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would work with either &[...] or [...], but (subjectively) I liked the simpler syntax.

Does it have to take ownership of the annotations now?

There's some depth here I should explain.

When the application writer creates a container writer, it needs to resolve its annotations before it creates the raw-level value writer that will encode them. However, the resolved annotations need a place to live. If I resolve foo to $10, there's now a RawSymbolTokenRef::SymbolId(10) living on the stack. I can't pass a reference to it into the raw container writer because the container writer will outlive that stack frame. The current solution for this is for annotated value/container writers to own a SmallVec<[RawSymbolTokenRef<'_>; 2] that can own the newly created references and avoids heap allocating so long as there are 2 or fewer annotations. In the unlikely event that this is a bottleneck, we can re-evaluate it.

(I don't think this was exactly your question, but I realized I meant to explain it as part of the PR tour and then didn't.)

/// let expected = Element::read_one("foo::bar::baz::42")?;
/// let actual = Element::read_one(&buffer)?;
Expand All @@ -33,21 +33,20 @@ pub trait Annotate {
///# Ok(())
///# }
/// ```
fn annotated_with<'a, A: AsRawSymbolTokenRef>(
&'a self,
annotations: &'a [A],
) -> Annotated<'a, Self, A>;
fn annotated_with<'a, A: 'a>(&'a self, annotations: A) -> Annotated<'a, Self, A>
where
&'a A: AnnotationSeq<'a>;
}

// Any Rust value that can be serialized as an Ion value can call `annotate`.
impl<T> Annotate for T
impl<T> Annotatable for T
where
T: ?Sized + WriteAsIon,
{
fn annotated_with<'a, A: AsRawSymbolTokenRef>(
&'a self,
annotations: &'a [A],
) -> Annotated<'a, Self, A> {
fn annotated_with<'a, A: 'a>(&'a self, annotations: A) -> Annotated<'a, Self, A>
where
&'a A: AnnotationSeq<'a>,
{
Annotated {
value: self,
annotations,
Expand All @@ -57,13 +56,13 @@ where

// The `Annotated` struct implements `WriteAsIon` by serializing its sequence of annotations
// and then invoking the inner value's implementation of `WriteAsIon`.
impl<'annotations, T, A> WriteAsIon for Annotated<'annotations, T, A>
impl<'annotations, T, A: 'annotations> WriteAsIon for Annotated<'annotations, T, A>
where
for<'x> &'x A: AnnotationSeq<'x>,
T: WriteAsIon,
A: AsRawSymbolTokenRef,
{
fn write_as_ion<V: ValueWriter>(&self, writer: V) -> IonResult<()> {
let value_writer = <V as ValueWriter>::with_annotations(writer, self.annotations);
let value_writer = <V as AnnotatableWriter>::with_annotations(writer, &self.annotations)?;
self.value.write_as_ion(value_writer)
}
}
113 changes: 113 additions & 0 deletions src/lazy/encoder/annotation_seq.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
use crate::{RawSymbolTokenRef, SymbolId};
use smallvec::SmallVec;

/// A sequence of annotations.
///
/// When the sequence is two or fewer annotations, it will not require a heap allocation.
pub type AnnotationsVec<'a> = SmallVec<[RawSymbolTokenRef<'a>; 2]>;

/// Types that can be viewed as an annotations sequence.
///
/// Examples include `SymbolId`, `&str`, and iterables of those types.
pub trait AnnotationSeq<'a> {
fn into_annotations_vec(self) -> AnnotationsVec<'a>;
}

impl<'a> AnnotationSeq<'a> for &'a str {
/// Converts the value into an `AnnotationsVec`.
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut vec = AnnotationsVec::new();
vec.push(RawSymbolTokenRef::Text(self.into()));
vec
}
}

impl<'a> AnnotationSeq<'a> for &'a &str {
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut vec = AnnotationsVec::new();
vec.push(RawSymbolTokenRef::Text((*self).into()));
vec
}
}

impl<'a> AnnotationSeq<'a> for SymbolId {
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut vec = AnnotationsVec::new();
vec.push(RawSymbolTokenRef::SymbolId(self));
vec
}
}

impl<'a> AnnotationSeq<'a> for &'a SymbolId {
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut vec = AnnotationsVec::new();
vec.push(RawSymbolTokenRef::SymbolId(*self));
vec
}
}

impl<'a> AnnotationSeq<'a> for RawSymbolTokenRef<'a> {
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut vec = AnnotationsVec::new();
vec.push(self);
vec
}
}

impl<'a> AnnotationSeq<'a> for AnnotationsVec<'a> {
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
self
}
}

impl<'a, T> AnnotationSeq<'a> for Vec<T>
where
T: Into<RawSymbolTokenRef<'a>>,
{
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut annotations = AnnotationsVec::new();
for token in self {
annotations.push(token.into());
}
annotations
}
}

impl<'a, T> AnnotationSeq<'a> for &'a [T]
where
for<'b> &'b T: Into<RawSymbolTokenRef<'b>>,
{
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut annotations = AnnotationsVec::new();
for token in self {
annotations.push(token.into());
}
annotations
}
}

impl<'a, T, const N: usize> AnnotationSeq<'a> for [T; N]
where
T: Into<RawSymbolTokenRef<'a>>,
{
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut annotations = AnnotationsVec::new();
for token in self {
annotations.push(token.into());
}
annotations
}
}

impl<'a, T, const N: usize> AnnotationSeq<'a> for &'a [T; N]
where
for<'b> &'b T: Into<RawSymbolTokenRef<'b>>,
{
fn into_annotations_vec(self) -> AnnotationsVec<'a> {
let mut annotations = AnnotationsVec::new();
for token in self {
annotations.push(token.into());
}
annotations
}
}
2 changes: 1 addition & 1 deletion src/lazy/encoder/binary/v1_0/container_writers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -335,7 +335,7 @@ impl<'value, 'top> MakeValueWriter for BinaryStructWriter_1_0<'value, 'top> {
}

impl<'value, 'top> StructWriter for BinaryStructWriter_1_0<'value, 'top> {
fn end(self) -> IonResult<()> {
fn close(self) -> IonResult<()> {
self.container_writer.end()
}
}
11 changes: 10 additions & 1 deletion src/lazy/encoder/binary/v1_0/mod.rs
Original file line number Diff line number Diff line change
@@ -1,12 +1,21 @@
use crate::lazy::encoder::binary::v1_0::writer::LazyRawBinaryWriter_1_0;
use crate::lazy::encoder::LazyEncoder;
use crate::lazy::encoder::{LazyEncoder, SymbolCreationPolicy};
use crate::lazy::encoding::BinaryEncoding_1_0;
use crate::WriteConfig;
use std::io::Write;

mod container_writers;
pub mod value_writer;
pub mod writer;

impl LazyEncoder for BinaryEncoding_1_0 {
const SUPPORTS_TEXT_TOKENS: bool = false;
const DEFAULT_SYMBOL_CREATION_POLICY: SymbolCreationPolicy =
SymbolCreationPolicy::RequireSymbolId;

type Writer<W: Write> = LazyRawBinaryWriter_1_0<W>;

fn default_write_config() -> WriteConfig<Self> {
WriteConfig::<Self>::new()
}
Comment on lines +12 to +20
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Encodings now define a method for constructing a default WriteConfig that can in turn be used to construct raw writers and application writers.

}
Loading
Loading