Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduces an application-level lazy writer #745

Merged
merged 3 commits into from
Apr 22, 2024
Merged

Introduces an application-level lazy writer #745

merged 3 commits into from
Apr 22, 2024

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Apr 18, 2024

Adds an ApplicationWriter that wraps the various lazy raw writer types and creates new symbol table entries as needed. The new writer is now being used in our ion-tests integration.

I intend to rename the ApplicationWriter once the legacy writer is removed.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Contributor Author

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR Tour

The name ApplicationWriter will be changed to something pithier after other writer implementations are removed.

@@ -41,7 +41,7 @@ fn write_struct_with_string_values(value_writer: impl ValueWriter) -> IonResult<
black_box("2022-12-07T20:59:59.744000Z"),
],
)?;
struct_.end()
struct_.close()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This was a holdover naming consistency fix that @popematt identified in #741.

@@ -51,6 +53,16 @@ impl WriteConfig<BinaryEncoding_1_0> {
}
}

#[cfg(feature = "experimental-lazy-reader")]
impl WriteConfig<BinaryEncoding_1_1> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Prior to this, there was not a function to construct a default WriteConfig for the binary 1.1 writer.

value: &'a T,
annotations: &'a [A],
annotations: A,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This trait was modified to use an implementation of AnnotationSeq, a new trait that is defined/described in the next file.

Comment on lines +12 to +20
const SUPPORTS_TEXT_TOKENS: bool = false;
const DEFAULT_SYMBOL_CREATION_POLICY: SymbolCreationPolicy =
SymbolCreationPolicy::RequireSymbolId;

type Writer<W: Write> = LazyRawBinaryWriter_1_0<W>;

fn default_write_config() -> WriteConfig<Self> {
WriteConfig::<Self>::new()
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Encodings now define a method for constructing a default WriteConfig that can in turn be used to construct raw writers and application writers.

Comment on lines +627 to +630
.write(1.annotated_with(4))?
.write(false.annotated_with([5]))?
.write(3f32.annotated_with([6, 7]))?
.write("foo".annotated_with([8, 5]))?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Now that we have the AnnotationSeq trait, annotations can be an individual symbol ID, string, or [] of those rather than always needing a &[...] of those.

@@ -634,25 +635,33 @@ impl<'value, 'top> BinaryValueWriter_1_1<'value, 'top> {

impl<'value, 'top> Sealed for BinaryValueWriter_1_1<'value, 'top> {}

impl<'value, 'top> AnnotatableWriter for BinaryValueWriter_1_1<'value, 'top> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This refactoring with_annotations out into its own trait made it possible to have annotated container writers.

@@ -105,6 +105,7 @@ struct TextContainerWriter_1_0<'a, W: Write> {
// used for more informative error messages.
ion_type: IonType,
value_delimiter: &'static str,
trailing_delimiter: &'static str,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Container writers (like scalar writers) are responsible for writing the trailing delimiter when they appear in a container that requires one.

Comment on lines 51 to 66
match format {
Format::Text(kind) => {
let mut writer = match kind {
TextKind::Compact => TextWriterBuilder::default().build(&mut buffer),
TextKind::Lines => TextWriterBuilder::lines().build(&mut buffer),
TextKind::Pretty => TextWriterBuilder::pretty().build(&mut buffer),
_ => unimplemented!("No text writer available for requested TextKind {:?}", kind),
}?;
let write_config = WriteConfig::<TextEncoding_1_0>::new(kind);
let mut writer = ApplicationWriter::with_config(write_config, buffer)?;
writer.write_elements(elements)?;
writer.flush()?;
buffer = writer.close()?;
println!(
"Serialized as {kind:?}:\n{}",
std::str::from_utf8(buffer.as_slice()).unwrap()
);
}
Format::Binary => {
let mut binary_writer = BinaryWriterBuilder::new().build(&mut buffer)?;
let mut binary_writer = ApplicationWriter::<BinaryEncoding_1_0, _>::new(buffer)?;
binary_writer.write_elements(elements)?;
binary_writer.flush()?;
buffer = binary_writer.close()?;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The application writer is now used as the serializer in the Ion tests, ensuring its compliance for all (1.0) formats.

@zslayton zslayton marked this pull request as ready for review April 19, 2024 16:01
///
/// let mut buffer = vec![];
/// let mut writer = LazyRawTextWriter_1_0::new(&mut buffer);
///
/// writer.write(42_usize.annotated_with(&["foo", "bar", "baz"]))?.flush()?;
/// writer.write(42_usize.annotated_with(["foo", "bar", "baz"]))?.flush()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have to take ownership of the annotations now? Based on the trait impls for AnnotationSequence, it looks like it should still work to borrow the slice here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would work with either &[...] or [...], but (subjectively) I liked the simpler syntax.

Does it have to take ownership of the annotations now?

There's some depth here I should explain.

When the application writer creates a container writer, it needs to resolve its annotations before it creates the raw-level value writer that will encode them. However, the resolved annotations need a place to live. If I resolve foo to $10, there's now a RawSymbolTokenRef::SymbolId(10) living on the stack. I can't pass a reference to it into the raw container writer because the container writer will outlive that stack frame. The current solution for this is for annotated value/container writers to own a SmallVec<[RawSymbolTokenRef<'_>; 2] that can own the newly created references and avoids heap allocating so long as there are 2 or fewer annotations. In the unlikely event that this is a bottleneck, we can re-evaluate it.

(I don't think this was exactly your question, but I realized I meant to explain it as part of the PR tour and then didn't.)

src/lazy/encoder/binary/v1_0/writer.rs Show resolved Hide resolved
src/lazy/encoder/writer.rs Show resolved Hide resolved
@zslayton zslayton merged commit 040914a into main Apr 22, 2024
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants