Avoid iloop externalizing diagnostics for invalid references #1028

Merged
2 commits merged on Sep 5, 2023

Conversation

Contributor

@brson brson commented Aug 31, 2023

What

Don't store Vals with broken object references in the diagnostic event log.

Why

This avoids an infinite loop / recursion. There may be better solutions.

This is a followup to stellar/rs-soroban-sdk#1068 where I was experimenting with feeding the Env objects with invalid references.

The Host has a check_val_integrity function that is called on every Val it sees, to prevent bad Vals from entering the environment. In particular, it errors when it sees objects with broken references.

This function ends up calling visit_obj_untyped. When that function fails to find an object body, it returns this error:

            Err(self.err(
                ScErrorType::Object,
                ScErrorCode::MissingValue,
                "unknown object reference",
                &[obj.to_val()],
            ))

When diagnostics are on, this logs an "unknown object reference" event that contains the broken object, then immediately generates an error. While generating that error, the host attempts to externalize the diagnostic it just logged. That diagnostic contains a broken object, so externalizing it generates more errors, and so on forever.
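The cycle can be sketched with a toy model. This is not the real soroban-env-host code: `ToyHost`, its fields, the simplified `Val` enum, and `MAX_DEPTH` are all hypothetical names invented for illustration, and the depth cap exists only so the sketch terminates. With `filter_broken_refs` off, each error logs the broken object and then trips over it again while externalizing; with it on (roughly the approach of this PR, as I understand it), the broken object never reaches the log.

```rust
use std::collections::HashMap;

// Artificial recursion cap so the toy terminates; the real host has no such guard here.
const MAX_DEPTH: u32 = 8;

/// Hypothetical stand-in for a host value.
#[derive(Clone, Debug, PartialEq)]
enum Val {
    U32(u32),
    Object(u32), // handle that should index into the object table
}

#[derive(Debug, PartialEq)]
enum HostError {
    MissingValue,
    DepthExceeded,
}

/// Hypothetical toy host: an object table plus a diagnostic event log.
struct ToyHost {
    objects: HashMap<u32, String>,
    events: Vec<(String, Vec<Val>)>,
    filter_broken_refs: bool,
}

impl ToyHost {
    fn new(filter_broken_refs: bool) -> Self {
        ToyHost { objects: HashMap::new(), events: Vec::new(), filter_broken_refs }
    }

    fn add_object(&mut self, handle: u32, body: &str) {
        self.objects.insert(handle, body.to_string());
    }

    fn is_valid(&self, v: &Val) -> bool {
        match v {
            Val::U32(_) => true,
            Val::Object(h) => self.objects.contains_key(h),
        }
    }

    /// Log a diagnostic event, then externalize the log while building the error.
    fn err(&mut self, msg: &str, args: &[Val], depth: u32) -> HostError {
        let kept: Vec<Val> = if self.filter_broken_refs {
            // The fix: drop Vals with broken references before logging.
            args.iter().filter(|v| self.is_valid(v)).cloned().collect()
        } else {
            args.to_vec()
        };
        self.events.push((msg.to_string(), kept));
        match self.externalize_events(depth) {
            Err(e) => e,
            Ok(()) => HostError::MissingValue,
        }
    }

    /// Externalizing re-checks every logged Val; a broken one raises a new error.
    fn externalize_events(&mut self, depth: u32) -> Result<(), HostError> {
        if depth > MAX_DEPTH {
            return Err(HostError::DepthExceeded);
        }
        let snapshot = self.events.clone();
        for (_, args) in snapshot {
            for v in args {
                if !self.is_valid(&v) {
                    return Err(self.err("unknown object reference", &[v], depth + 1));
                }
            }
        }
        Ok(())
    }

    fn check_val_integrity(&mut self, v: &Val) -> Result<(), HostError> {
        if self.is_valid(v) {
            Ok(())
        } else {
            Err(self.err("unknown object reference", &[v.clone()], 0))
        }
    }
}
```

Calling `check_val_integrity(&Val::Object(7))` on `ToyHost::new(false)` recurses until the artificial cap is hit, while on `ToyHost::new(true)` it returns `MissingValue` after logging a single event with no arguments. The cost is the one listed under known limitations: the log no longer says which object was broken.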

It is not clear to me why this loops forever instead of smashing the stack (maybe the XDR DepthLimiter is helping). Here is a ~3k frame backtrace captured in gdb: https://gist.github.com/brson/aa2799208123895316228ca0cb425317

Known limitations

  • This loses information about the broken object.
  • There may be other ways to get into similar situations with the event log.

Contributor Author

brson commented Sep 1, 2023

I may not be seeing it smash the stack because it is potentially generating exponentially increasing numbers of diagnostics at each recursion level, so it just takes a long time to hit the end of the stack.
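To get a feel for that hypothesis, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every error at one recursion level triggers a fixed number of further errors at the next level. The function name `total_calls` and the fixed `branching` factor are assumptions, not anything measured in the host.

```rust
/// Total calls in a recursion tree where each call at one level triggers
/// `branching` calls at the next level, counted from level 0 through `depth`.
/// Returns None if the count overflows u64.
fn total_calls(branching: u64, depth: u32) -> Option<u64> {
    let mut total: u64 = 0;
    let mut level: u64 = 1; // number of calls at the current level
    for d in 0..=depth {
        total = total.checked_add(level)?;
        if d < depth {
            level = level.checked_mul(branching)?;
        }
    }
    Some(total)
}
```

With a branching factor of 1, a 3000-level recursion is only 3001 calls, so the stack limit would be reached quickly. With a branching factor of 2, the count already overflows a `u64` by depth 64, so a depth-first walk of such a tree would spend practically forever at shallow-to-moderate depths.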

Member

@leighmcculloch leighmcculloch left a comment

👏🏻 Great find. One suggestion inline, but I don't feel strongly about it if others don't think that's a good move. Regardless, thanks for finding and reporting.

Review comment on soroban-env-host/src/host_object.rs (outdated, resolved)
@leighmcculloch leighmcculloch added this pull request to the merge queue Sep 5, 2023
Merged via the queue into stellar:main with commit 243a362 Sep 5, 2023
9 checks passed

3 participants