Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable inspect to display symbol table encodings #18

Merged
merged 1 commit into from
Jan 10, 2022

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Jan 10, 2022

ion-rs v0.8.0 introduced a SystemReader which allows applications to
inspect local symbol tables in the stream with the same granularity with
which they can inspect user-level Ion values.

This PR upgrades the beta inspect command to use the new
SystemReader instead of the now-defunct SystemEventHandler callback
API. This allows inspect to show the complete contents of any symbol
table it encounters.

Before

image

After

image

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

`ion-rs` v0.8.0 introduced a `SystemReader` which allows applications to
inspect local symbol tables in the stream with the same granularity with
which they can inspect user-level Ion values.

This PR upgrades the `beta inspect` command to use the new
`SystemReader` instead of the now-defunct `SystemEventHandler` callback
API. This allows `inspect` to show the complete contents of any symbol
table it encounters.
use memmap::MmapOptions;

const ABOUT: &str = "Displays hex-encoded binary Ion alongside its equivalent text for human-friendly debugging.";
const ABOUT: &str =
"Displays hex-encoded binary Ion alongside its equivalent text for human-friendly debugging.";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this diff contains whitespace changes introduced by a cargo fmt run that I forgot to isolate into its own commit. 🤦 Mea culpa. I'll add comments indicating the important logic changes.

// Create a type alias to simplify working with a shared, mutable reference to our output stream.
type OutputRef = Rc<RefCell<dyn io::Write>>;
// Create a type alias to simplify working with a shared reference to our output stream.
type OutputRef = Box<dyn io::Write>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we're not using a callback system, there's no need to share the output stream between multiple owners. We can discard the wrapper types that enabled that sharing (Rc<RefCell<_>>).

@@ -113,29 +103,35 @@ pub fn run(_command_name: &str, matches: &ArgMatches<'static>) -> Result<()> {
limit_bytes = usize::MAX
}

let output: OutputRef;
let mut output: OutputRef;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we're not sharing the output stream via Rc<RefCell<_>>, we need to use explicit mutability.


write_header(&output)?;
write_header(output)?;
let mut inspector = IonInspector::new(ion_data, output, bytes_to_skip, limit_bytes);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer have to do Rc::clone here.

const IVM_HEX: &str = "e0 01 00 ea";
const IVM_TEXT: &str = "// Ion 1.0 Version Marker";
// System events (IVM, symtabs) are always at the top level.
const SYSTEM_EVENT_INDENTATION: &str = "";

impl SystemEventHandler for SystemLevelEventSummarizer {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new SystemReader in ion-rs obviated the need for the SystemEventHandler trait.

@@ -344,7 +281,35 @@ impl<'input> IonInspector<'input> {
// appear each time some number of values is skipped.
let mut bytes_skipped_this_level = 0;

while let Some((ion_type, _is_null)) = self.reader.next()? {
loop {
let (ion_type, _is_null) = match self.reader.next()? {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we're using a SystemReader, we have to distinguish between a few types of data that might appear each time that we call next:

  • None: No more data at this level of nesting.
  • Some(VersionMarker(_)): We found an Ion Version Marker.
  • Some(SymbolTableData(_)): We found an Ion value that was part of a Local Symbol Table (LST) and is not user data.
  • Some(Value(_)): We found user-level data.

The handling for the last two types is identical; in either case we want to display the bytes that were used to encode that particular value, whether it's user-level or not.

@@ -401,8 +372,11 @@ impl<'input> IonInspector<'input> {
self.inspect_level()?;
self.reader.step_out()?;
// Print the container's closing delimiter: }, ), or ]
// FIXME: This should also print a trailing `,` if the parent context is
// a list or struct. See this issue for details:
// https://github.com/amzn/ion-cli/issues/17
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discovered this unrelated bug while working on this PR: #17

@@ -419,13 +393,12 @@ impl<'input> IonInspector<'input> {
}

fn increase_indentation(&mut self) {
// Remove a level's worth of indentation from the buffer.
// Add a level's worth of indentation to the buffer.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old version of this comment was a copy/paste error.

@@ -435,19 +408,26 @@ impl<'input> IonInspector<'input> {
}

fn write_field_if_present(&mut self) -> IonResult<()> {
if let Some(field_id) = self.reader.field_id() {
if let Some(field_token) = self.reader.raw_field_name_token() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reader now has APIs that allow an application to see the original encoding of the field name, be it a symbol ID ($12) or text (foo).

@@ -435,19 +408,26 @@ impl<'input> IonInspector<'input> {
}

fn write_field_if_present(&mut self) -> IonResult<()> {
if let Some(field_id) = self.reader.field_id() {
if let Some(field_token) = self.reader.raw_field_name_token() {
let field_id = field_token.local_sid().expect("No SID for field name.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inspect only ever reads binary Ion. In Ion v1.0, symbols MUST be local SIDs.

@zslayton zslayton merged commit 7228e85 into master Jan 10, 2022
@zslayton zslayton deleted the system-reader-inspect branch January 10, 2022 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants