Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper Revert reasons #342

Merged
merged 3 commits into from
Apr 21, 2021

Conversation

cburgdorf
Copy link
Collaborator

@cburgdorf cburgdorf commented Mar 26, 2021

What was wrong?

As described in #288 we do currently not support revert reason strings in assert statements.

How was it fixed?

  1. Added a bit machinery to test solidity fixtures and added a new category of tests for such cases
  2. Added tests to prove how solidity handles revert reason strings
  3. Added a new runtime function revert_with_reason_string which follows the error encoding that solidity uses: https://docs.soliditylang.org/en/latest/control-structures.html#revert
  4. Added tests for this new runtime method and as a byproduct refactored some helpers to allow detailed testing of reverts as well as setting up a runtime that exposes static strings
  5. Hooked up the mapping of assert to use the new runtime method when a string reason is given
  6. Added several test cases to prove the functionality of assert strings with static strings, longer strings, passed in strings

@codecov-io
Copy link

Codecov Report

Merging #342 (7626e6d) into master (dd017d8) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #342   +/-   ##
=======================================
  Coverage   92.60%   92.60%           
=======================================
  Files          56       56           
  Lines        3882     3882           
=======================================
  Hits         3595     3595           
  Misses        287      287           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dd017d8...7626e6d. Read the comment docs.

@cburgdorf cburgdorf force-pushed the christoph/feat/rever_reasons branch 2 times, most recently from 2dc9387 to 02c614e Compare April 15, 2021 09:04
@@ -225,6 +280,36 @@ pub fn deploy_contract(
panic!("Failed to create contract")
}

pub fn compile_solidity_contract(name: &str, solidity_src: &str) -> Result<(String, String), ()> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@satyamakgec This is all just WIP code in this PR but I wanted to make you aware of it because it adds a way to run tests against solidity code (which I needed to prove some assumptions about revert reason encoding). Since the differential contract testing will also need to run solidity code I thought you might find it useful to look at.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are a saviour @cburgdorf I was worried about this today when I was doing researching and planning over the differential testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful tools!

@cburgdorf cburgdorf force-pushed the christoph/feat/rever_reasons branch 2 times, most recently from 9a69ccf to 1864523 Compare April 19, 2021 20:42
@codecov-commenter
Copy link

codecov-commenter commented Apr 19, 2021

Codecov Report

Merging #342 (90601cb) into master (32c3f00) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 90601cb differs from pull request most recent head cf736bf. Consider uploading reports for the commit cf736bf to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##           master     #342   +/-   ##
=======================================
  Coverage   92.79%   92.80%           
=======================================
  Files          59       59           
  Lines        4012     4015    +3     
=======================================
+ Hits         3723     3726    +3     
  Misses        289      289           
Impacted Files Coverage Δ
compiler/src/yul/mappers/functions.rs 100.00% <ø> (ø)
compiler/src/yul/runtime/functions/data.rs 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 32c3f00...cf736bf. Read the comment docs.

@cburgdorf cburgdorf force-pushed the christoph/feat/rever_reasons branch 3 times, most recently from f2b37f1 to 4e27a29 Compare April 20, 2021 10:01
function revert_with_reason_string(reason) {
(let ptr := alloc(4))
// Function selector for Error(string)
(mstore(ptr, 3963877391197344453575983046348115674221700746820753546331534351508065746944))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why I can't do this instead:

(let ptr := alloc_mstoren(3963877391197344453575983046348115674221700746820753546331534351508065746944, 4))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my understanding of what's happening here:

3963877391197344453575983046348115674221700746820753546331534351508065746944 is the decimal number equivalent to the selector (0x08C379A0) padded with 28 zero bytes on the right. We're allocating 4 bytes and then writing the decimal selector to 32 bytes of memory. Since the number in bytes is all zeros on the right side, we don't actually end up writing any bits past the highest allocated memory address, so things are technically fine.

What we want is:

let ptr := alloc_mstoren(0x08C379A0, 4)

Also,

let ptr := alloc_mstoren(3963877391197344453575983046348115674221700746820753546331534351508065746944, 4)

is equivalent to:

let ptr := alloc_mstoren(0x08C379A000000000000000000000000000000000000000000000000000000000, 4)

which will only write the 4 least-significant bytes to memory (zero).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooooh! Now it all makes sense. THANK YOU 🙏

mcopys(),
mloadn(),
mstoren(),
revert_with_reason_string(),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the only new one. I just sorted them alphabetically

(mstore(ptr, 3963877391197344453575983046348115674221700746820753546331534351508065746944))

// Write the (fixed) data offset into the next 32 bytes of memory
(let __ := alloc_mstoren(0x0000000000000000000000000000000000000000000000000000000000000020, 32))
Copy link
Collaborator Author

@cburgdorf cburgdorf Apr 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to confess that I don't fully grasp the rationale for this ☝️ I tried to follow the conversation in ethereum/EIPs#838 but it doesn't go into much detail (or not enough for me to properly understand it). I think this data offset of 32 is just meant to say that the actual reason string is found at an offset of 32 bytes simply because thats how ABI encoded strings are defined. But to me it seems pointless to have this. After all Error(string) returns a string so it should be clear that there's a 32 data offset but maybe I'm just misunderstanding the whole thing haha. Anyway, this is how the encoding works and the tests prove that it is working in the same way as solidity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just how ABI encoding works. It makes sense when there's more items being encoded.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense when there's more items being encoded.

Do you have an example where the dataoffset becomes useful? Let's say we have Error(AnyOtherABIType). Why would I need the dataoffset for anything? Wouldn't it work just as fine without it? Whatever follows the selector would be the data that is decoded as the ABI type that is specified with the selector.

Copy link
Member

@g-r-a-n-t g-r-a-n-t Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're encoding something like (string, string), we need to provide the offsets to both pieces of dynamically sized data. This way we only need to read the offset, size, and the data when reading each string. If offsets were not provided, we would need to read each dynamic size and sum them together to get the offset. This would make implementations much more error prone.

For reference, the encoding would look roughly like this:

[offset_1, offset_2], [(size_1, data_1), (size_2, data_2)]

With a single dynamic element, it's not necessary to provide this offset, but it's best to make a few simple rules with encoding schemes and stick with them - even if it seems silly at times.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if offsets were not provided, we would need to read each dynamic size and sum them together to get the offset

But isn't that what I would also have to do if I'm reading the output of pub foo -> (string, string)? In that case, the returned data will also only be [(size_1, data_1), (size_2, data_2)] without providing me the offsets separately, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoding of (string, string) would include the offsets of each string item. Then to decode, we read the offset, size, and data for each string.

Is there something I'm missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples are quite helpful.

evm::Capture::Exit((evm::ExitReason::Revert(_), _))
));
match exit1 {
evm::Capture::Exit((evm::ExitReason::Revert(_), output)) => assert_eq!(output.len(), 0),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without a revert string the data will simply be empty (NOT the same as having a revert string of "")

test_runtime_functions_revert(
&mut executor,
Runtime::default()
.with_data(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced the Runtime object with the builder pattern to not introduce a data parameter to every test that most of them would not end up using.

We could actually go a bit further and incorporate the testing functions on the Runtime object, too so that the code becomes something like:

test_runtime_functions_revert(
            Runtime::default()
                .with_data(
                    vec![yul::Data { name: keccak::full(reason.as_bytes()), value: reason.to_owned() }]
                )
                .with_test_statements(
                statements! {
                    (let reason := load_data_string((dataoffset([literal_expression! { (reason_id) }])), (datasize([literal_expression! { (reason_id) }]))))
                    (revert_with_reason_string(reason))
                })
                .execute(executor)
                .expect_revert_with(reason)
        );

I'm happy to create an issue / PR if you happen to like it @g-r-a-n-t

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a more rusty way of doing things 👍 I'm open to refactoring this way.


let exit = harness.capture_call(&mut executor, method, &[]);

let expected_reason = format!("0x{}", hex::encode(encode_error_reason(reason)));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is proving that our encode_error_reason is producing the same encoding that solidity does

@cburgdorf cburgdorf marked this pull request as ready for review April 20, 2021 10:25
@cburgdorf
Copy link
Collaborator Author

@g-r-a-n-t Celebrating your return with this PR to review

🏄‍♂️

Copy link
Member

@g-r-a-n-t g-r-a-n-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Added some clarification around the selector stuff and suggested using pop in a few places.

function revert_with_reason_string(reason) {
(let ptr := alloc(4))
// Function selector for Error(string)
(mstore(ptr, 3963877391197344453575983046348115674221700746820753546331534351508065746944))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my understanding of what's happening here:

3963877391197344453575983046348115674221700746820753546331534351508065746944 is the decimal number equivalent to the selector (0x08C379A0) padded with 28 zero bytes on the right. We're allocating 4 bytes and then writing the decimal selector to 32 bytes of memory. Since the number in bytes is all zeros on the right side, we don't actually end up writing any bits past the highest allocated memory address, so things are technically fine.

What we want is:

let ptr := alloc_mstoren(0x08C379A0, 4)

Also,

let ptr := alloc_mstoren(3963877391197344453575983046348115674221700746820753546331534351508065746944, 4)

is equivalent to:

let ptr := alloc_mstoren(0x08C379A000000000000000000000000000000000000000000000000000000000, 4)

which will only write the 4 least-significant bytes to memory (zero).

(mstore(ptr, 3963877391197344453575983046348115674221700746820753546331534351508065746944))

// Write the (fixed) data offset into the next 32 bytes of memory
(let __ := alloc_mstoren(0x0000000000000000000000000000000000000000000000000000000000000020, 32))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just how ABI encoding works. It makes sense when there's more items being encoded.

(let reason_size := mloadn(reason, 32))

//Copy the whole reason string (length + data) to the current segment of memory
(__ := mcopym(reason , (add(reason_size, 32))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also pop these values right off the stack.

(pop((mcopym(reason , (add(reason_size, 32))))))

All function arguments are matched as token trees, so expressions like mcopym(reason , (add(reason_size, 32))) need to be wrapped in parens. Biggest downside with these macros imo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, thanks. Does it make any practical difference whether to assign them to an unused variable or pop them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optimizer will probably convert these to pops anyway.

Even if we weren't running the optimizer, I don't think it would make any difference in terms of gas. We would just keep these values on the stack a bit longer.

@@ -225,6 +280,36 @@ pub fn deploy_contract(
panic!("Failed to create contract")
}

pub fn compile_solidity_contract(name: &str, solidity_src: &str) -> Result<(String, String), ()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful tools!

}

#[allow(dead_code)]
impl Runtime {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍

test_runtime_functions_revert(
&mut executor,
Runtime::default()
.with_data(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a more rusty way of doing things 👍 I'm open to refactoring this way.

@cburgdorf cburgdorf force-pushed the christoph/feat/rever_reasons branch from 4e27a29 to cf736bf Compare April 21, 2021 08:39
@cburgdorf cburgdorf merged commit 22771e7 into ethereum:master Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants