Runtime decoding checks #440

Merged
merged 2 commits into from
Jul 13, 2021

@g-r-a-n-t g-r-a-n-t commented Jun 7, 2021

We now check the following when decoding data:

  • The size of the encoded data fits within the size range known at compile-time.
  • Values are correctly padded.
    • unsigned integers, addresses, and bools are checked to have correct left zero padding
    • the sizes of signed integers are checked
    • bytes and strings are checked to have correct right zero padding
  • Data section offsets are consistent with the size of preceding values in the data section.
  • The dynamic size of strings does not exceed their maximum size.
  • The dynamic size of byte arrays (u8[n]) is equal to the size of the array.
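The two padding rules in the list above can be illustrated outside of Yul. The following is a minimal Rust sketch, not the generated code from this PR: the `Word` alias and the byte-slice model are assumptions made for illustration, and `is_left_padded` here only mirrors the name of the Yul helper, not its implementation. Signed-integer checks are left out.

```rust
/// A single 32-byte ABI word in big-endian order (illustrative type alias).
type Word = [u8; 32];

/// Counts the zero bits at the most significant end of the word.
fn leading_zero_bits(word: &Word) -> u32 {
    let mut count = 0;
    for &byte in word.iter() {
        if byte == 0 {
            count += 8;
        } else {
            count += byte.leading_zeros();
            break;
        }
    }
    count
}

/// True if a value of `size_bits` bits (unsigned integer, address, bool)
/// has only zero bits to its left within the word.
fn is_left_padded(size_bits: u32, word: &Word) -> bool {
    leading_zero_bits(word) >= 256u32.saturating_sub(size_bits)
}

/// True if the final word of a bytes or string value holding `len`
/// meaningful bytes has only zero bytes to their right.
fn is_right_padded(len: usize, word: &Word) -> bool {
    word[len.min(32)..].iter().all(|&byte| byte == 0)
}
```

For example, an address occupies the low 160 bits of its word, so any nonzero bit in the first 12 bytes would fail `is_left_padded(160, ..)`.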

Decoding function walkthrough:

function abi_decode_data_uint256_bytes_100_string_42_bool_address_bytes_100_calldata(head_start, data_end) -> return_val_0, return_val_1, return_val_2, return_val_3, return_val_4, return_val_5 {
    let encoding_size := sub(data_end, head_start)
    if or(lt(encoding_size, 544), gt(encoding_size, 608)) { revert(0, 0) }
    let head_offset_0 := 0
    let head_offset_1 := 32
    let head_offset_2 := 64
    let head_offset_3 := 96
    let head_offset_4 := 128
    let head_offset_5 := 160
    let decoded_val_0 := abi_decode_component_uint256_calldata(head_start, head_offset_0)
    let decoded_val_1, data_start_offset_1, data_end_offset_1 := abi_decode_component_bytes_100_calldata(head_start, head_offset_1)
    let decoded_val_2, data_start_offset_2, data_end_offset_2 := abi_decode_component_string_42_calldata(head_start, head_offset_2)
    let decoded_val_3 := abi_decode_component_bool_calldata(head_start, head_offset_3)
    let decoded_val_4 := abi_decode_component_address_calldata(head_start, head_offset_4)
    let decoded_val_5, data_start_offset_5, data_end_offset_5 := abi_decode_component_bytes_100_calldata(head_start, head_offset_5)
    if iszero(eq(data_start_offset_1, 192)) { revert(0, 0) }
    if iszero(eq(data_start_offset_2, data_end_offset_1)) { revert(0, 0) }
    if iszero(eq(data_start_offset_5, data_end_offset_2)) { revert(0, 0) }
    if iszero(eq(encoding_size, data_end_offset_5)) { revert(0, 0) }
    return_val_0 := decoded_val_0
    return_val_1 := decoded_val_1
    return_val_2 := decoded_val_2
    return_val_3 := decoded_val_3
    return_val_4 := decoded_val_4
    return_val_5 := decoded_val_5
}

Note: At the moment, we have separate decode_data_... and decode_component_tuple_... functions. In the future when we add support for recursive encoding, we will just be using a decode_tuple_... function.

encoding size check

     let encoding_size := sub(data_end, head_start) 
     if or(lt(encoding_size, 544), gt(encoding_size, 608)) { revert(0, 0) } 

We check that the size of the entire encoding fits within bounds known at compile time (544 to 608 bytes in this example).
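One way to see where the bounds 544 and 608 could come from is to add up the head and data sizes per component. The sketch below is an assumption-laden illustration, not the compiler's code: the `Component` enum, `pad_to_word`, and `size_bounds` are hypothetical names, and it assumes that each component takes one 32-byte head word, that a `u8[n]` array takes a length word plus `n` bytes padded to a word boundary, and that a string takes a length word plus between 0 and `n` padded bytes.

```rust
const WORD: usize = 32;

/// Rounds `n` up to the next multiple of 32.
fn pad_to_word(n: usize) -> usize {
    (n + WORD - 1) / WORD * WORD
}

/// Hypothetical component kinds, for illustration only.
enum Component {
    /// uint256, bool, address: a head word and no data section.
    Static,
    /// u8[n]: data section holds a length word plus exactly n bytes (padded).
    Bytes(usize),
    /// String with maximum size n: a length word plus 0..=n bytes (padded).
    String(usize),
}

/// Returns the (minimum, maximum) size in bytes of the whole encoding.
fn size_bounds(components: &[Component]) -> (usize, usize) {
    let head = components.len() * WORD;
    let (mut min, mut max) = (head, head);
    for component in components {
        match component {
            Component::Static => {}
            Component::Bytes(n) => {
                let size = WORD + pad_to_word(*n);
                min += size;
                max += size;
            }
            Component::String(n) => {
                min += WORD; // an empty string still has a length word
                max += WORD + pad_to_word(*n);
            }
        }
    }
    (min, max)
}
```

Under these assumptions, the signature in the walkthrough (uint256, bytes_100, string_42, bool, address, bytes_100) gives a 192-byte head, 160 bytes per byte array, and 32 to 96 bytes for the string, which matches the 544 and 608 literals in the generated check.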

head offset section

     let head_offset_0 := 0 
     let head_offset_1 := 32 
     let head_offset_2 := 64 
     let head_offset_3 := 96 
     let head_offset_4 := 128 
     let head_offset_5 := 160 

Head offsets are known at compile time (each head slot is 32 bytes wide), so we assign them as literals.

component decoding section

     let decoded_val_0 := abi_decode_component_uint256_calldata(head_start, head_offset_0) 
     let decoded_val_1, data_start_offset_1, data_end_offset_1 := abi_decode_component_bytes_100_calldata(head_start, head_offset_1) 
     let decoded_val_2, data_start_offset_2, data_end_offset_2 := abi_decode_component_string_42_calldata(head_start, head_offset_2) 
     let decoded_val_3 := abi_decode_component_bool_calldata(head_start, head_offset_3) 
     let decoded_val_4 := abi_decode_component_address_calldata(head_start, head_offset_4) 
     let decoded_val_5, data_start_offset_5, data_end_offset_5 := abi_decode_component_bytes_100_calldata(head_start, head_offset_5) 

We call the decoding function for each component value. Each decoding function returns the decoded value; for dynamically sized values, it also returns the start and end offsets of the value's data section.
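To make the three return values of a dynamic component decoder concrete, here is a rough Rust model of decoding a bounded string from a byte slice. It is a sketch under stated assumptions rather than a translation of the generated Yul: `read_word` and `decode_string` are hypothetical helpers, offsets are taken relative to the start of the slice, and a failed check returns `None` where the Yul would revert.

```rust
/// Reads the 32-byte big-endian word at `offset` as a usize.
/// Returns None if the word is out of bounds or does not fit in 64 bits.
fn read_word(data: &[u8], offset: usize) -> Option<usize> {
    let word = data.get(offset..offset.checked_add(32)?)?;
    if word[..24].iter().any(|&byte| byte != 0) {
        return None;
    }
    let mut low = [0u8; 8];
    low.copy_from_slice(&word[24..]);
    Some(u64::from_be_bytes(low) as usize)
}

/// Decodes a string whose head word sits at `head_offset`.
/// Returns (bytes, data_start_offset, data_end_offset).
fn decode_string(
    data: &[u8],
    head_offset: usize,
    max_size: usize,
) -> Option<(Vec<u8>, usize, usize)> {
    // The head word holds the offset of the value's data section.
    let data_start = read_word(data, head_offset)?;
    // The data section starts with the dynamic length.
    let len = read_word(data, data_start)?;
    if len > max_size {
        return None; // dynamic size exceeds the maximum size
    }
    let body_start = data_start.checked_add(32)?;
    let padded_len = (len + 31) / 32 * 32;
    let data_end = body_start.checked_add(padded_len)?;
    let body = data.get(body_start..data_end)?;
    if body[len..].iter().any(|&byte| byte != 0) {
        return None; // incorrect right zero padding
    }
    Some((body[..len].to_vec(), data_start, data_end))
}
```

The returned start and end offsets are what a caller would compare against neighbouring components to confirm that the data section is laid out without gaps or overlaps.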

data offset checks section

     if iszero(eq(data_start_offset_1, 192)) { revert(0, 0) } 
     if iszero(eq(data_start_offset_2, data_end_offset_1)) { revert(0, 0) } 
     if iszero(eq(data_start_offset_5, data_end_offset_2)) { revert(0, 0) } 
     if iszero(eq(encoding_size, data_end_offset_5)) { revert(0, 0) } 

We check that the offsets returned in the previous step are consistent with one another: each encoding must start where the previous one ends. The first data offset must equal the end of the head section (192 in this case), and the last end offset must equal the total encoding size.
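The chain of equality checks can be expressed as one loop. This is a minimal sketch with a hypothetical function name, `offsets_are_contiguous`, that takes the head size, the (start, end) offsets of each dynamic component in order, and the total encoding size.

```rust
/// Returns true if the data sections are laid out back to back:
/// the first starts at the end of the head, each one starts where the
/// previous one ends, and the last ends at the end of the encoding.
fn offsets_are_contiguous(
    head_size: usize,
    spans: &[(usize, usize)],
    encoding_size: usize,
) -> bool {
    let mut expected_start = head_size;
    for &(start, end) in spans {
        if start != expected_start {
            return false;
        }
        expected_start = end;
    }
    expected_start == encoding_size
}
```

With the walkthrough's 192-byte head, spans such as (192, 352), (352, 416), (416, 576) pass for a 576-byte encoding, while a gap between two spans or trailing bytes after the last one fails.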

return section

     return_val_0 := decoded_val_0 
     return_val_1 := decoded_val_1 
     return_val_2 := decoded_val_2 
     return_val_3 := decoded_val_3 
     return_val_4 := decoded_val_4 
     return_val_5 := decoded_val_5 

To-Do

  • OPTIONAL: Update Spec if applicable

  • Add entry to the release notes (may forgo for trivial changes)

  • Clean up commit history

codecov-commenter commented Jul 6, 2021

Codecov Report

Merging #440 (d08da22) into master (618f267) will increase coverage by 0.31%.
The diff coverage is 97.02%.

@@            Coverage Diff             @@
##           master     #440      +/-   ##
==========================================
+ Coverage   87.32%   87.64%   +0.31%     
==========================================
  Files          77       77              
  Lines        5081     5235     +154     
==========================================
+ Hits         4437     4588     +151     
- Misses        644      647       +3     
Impacted Files                                      Coverage Δ
crates/yulgen/src/mappers/functions.rs              97.50% <ø> (ø)
crates/test-utils/src/lib.rs                        83.22% <84.61%> (+0.79%) ⬆️
crates/analyzer/src/namespace/types.rs              81.81% <86.36%> (+0.75%) ⬆️
crates/yulgen/src/names/abi.rs                      98.64% <100.00%> (ø)
crates/yulgen/src/operations/abi.rs                 100.00% <100.00%> (ø)
crates/yulgen/src/operations/data.rs                100.00% <100.00%> (ø)
crates/yulgen/src/runtime/abi_dispatcher.rs         100.00% <100.00%> (ø)
crates/yulgen/src/runtime/functions/abi.rs          99.20% <100.00%> (-0.80%) ⬇️
crates/yulgen/src/runtime/functions/contracts.rs    100.00% <100.00%> (ø)
... and 5 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@cburgdorf cburgdorf mentioned this pull request Jul 8, 2021
@g-r-a-n-t g-r-a-n-t force-pushed the abi-checks branch 2 times, most recently from fb72692 to 4053b13 Compare July 12, 2021 23:49
@g-r-a-n-t g-r-a-n-t marked this pull request as ready for review July 13, 2021 01:08
if iszero(eq(data_start_offset_1, 192)) { revert(0, 0) }
if iszero(eq(data_start_offset_2, data_end_offset_1)) { revert(0, 0) }
if iszero(eq(data_start_offset_5, data_end_offset_2)) { revert(0, 0) }
if iszero(eq(encoding_size, data_end_offset_5)) { revert(0, 0) }
Member Author

@cburgdorf the Solidity code I looked at didn't revert with any specific values, but it looks like they plan on adding it. I'll create an issue to track this.

Collaborator

@cburgdorf cburgdorf Jul 13, 2021

Oh I didn't notice until now that you had a comment on this. Anyway, 0x99 is good for now 👍

max: yul::Expression,
},
}

/// Returns an expression that encodes the given values and returns a pointer to
/// the encoding.
pub fn encode<T: AbiEncoding>(types: &[T], vals: Vec<yul::Expression>) -> yul::Expression {
Member Author

@sbillig I'll follow up this PR with another one tidying up these traits.

Collaborator

@cburgdorf cburgdorf left a comment

Looks good to me 👍
My only comment would be to think about introducing our own panic code for when we revert from these checks and use revert_with_panic(..). I think we should never revert with zero data except when a user writes revert with no extra data. This should improve the debugging story for developers: tooling can map these panic codes to a user-friendly error, which might save a developer from having to step through a debugger manually to find out where the revert came from.

// add a byte
let mut tampered_data = data.clone();
tampered_data.push(42);
harness.test_call_reverts(&mut executor, tampered_data);
Collaborator

I think you said that Solidity also just reverts with zero data for these checks, but I wonder if that's just a to-do on their end and whether we should introduce a panic code for these reverts. From a developer's perspective, reverts with zero information are very annoying, and a single panic code used for all calldata violations can already be quite helpful during debugging.

Member Author

I'm pretty sure it's a todo.

Any suggestions for the specific panic code?

Member Author

Using the panic code 0x99 for now.
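For readers unfamiliar with panic codes, the sketch below shows what revert data carrying the code 0x99 could look like if it follows Solidity's `Panic(uint256)` convention (selector `0x4e487b71` followed by the code as a 32-byte word). Whether `revert_with_panic` emits exactly this layout is an assumption here, not something stated in this thread, and `panic_revert_data` is a hypothetical helper name.

```rust
/// First four bytes of keccak256("Panic(uint256)"), as used by Solidity.
const PANIC_SELECTOR: [u8; 4] = [0x4e, 0x48, 0x7b, 0x71];

/// Builds 36 bytes of revert data: the selector followed by the panic
/// code left-padded to a 32-byte word. Assumed layout, for illustration.
fn panic_revert_data(code: u8) -> Vec<u8> {
    let mut data = Vec::with_capacity(36);
    data.extend_from_slice(&PANIC_SELECTOR);
    data.extend_from_slice(&[0u8; 31]);
    data.push(code);
    data
}
```

Tooling that recognises the selector can then read the final word and map 0x99 to a message such as "invalid ABI data", instead of showing an empty revert.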

b9fbd32

pub fn check_left_padding(size_bits: yul::Expression, val: yul::Expression) -> yul::Statement {
statement! {
if (iszero((is_left_padded([size_bits], [val])))) {
(revert(0, 0))
Collaborator

Just restating what I commented above on the test code. I think we should consider introducing our own panic code and then do a revert_with_panic(<some-code>). The only time we should revert with zero data is when someone actually uses revert with no data. In all other circumstances we should revert with a panic code: we should go with whatever panic code Solidity uses, and if they don't use one, we make up our own.

@g-r-a-n-t g-r-a-n-t merged commit 40d2fb0 into ethereum:master Jul 13, 2021
3 participants