Runtime decoding checks #440

g-r-a-n-t · 2021-06-07T22:48:28Z

We now check the following when decoding data:

- The size of the encoded data fits within the size range known at compile-time.
- Values are correctly padded.
  - unsigned integers, addresses, and bools are checked to have correct left zero padding
  - the sizes of signed integers are checked
  - bytes and strings are checked to have correct right zero padding
- Data section offsets are consistent with the size of preceding values in the data section.
- The dynamic size of strings does not exceed their maximum size.
- The dynamic size of byte arrays (u8[n]) is equal to the size of the array.
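To make the padding checks concrete, here is a minimal Rust sketch of what "correct left zero padding" and "correct right zero padding" mean for a 32-byte ABI word. It is an illustration only, not the Yul that this PR generates: the names `Word`, `leading_zero_bits`, and `is_right_padded` are hypothetical, and treating a bool as a 1-bit value is an assumption.

```rust
/// One 32-byte ABI word, big-endian. (Hypothetical alias for this sketch.)
type Word = [u8; 32];

/// Counts the zero bits at the high-order (left) end of the word.
fn leading_zero_bits(word: &Word) -> usize {
    let mut count = 0;
    for byte in word {
        if *byte == 0 {
            count += 8;
        } else {
            return count + byte.leading_zeros() as usize;
        }
    }
    count
}

/// True if a value of `size_bits` bits is stored in `word` with every
/// higher-order bit set to zero (unsigned integers, addresses, bools).
fn is_left_padded(size_bits: usize, word: &Word) -> bool {
    size_bits <= 256 && leading_zero_bits(word) >= 256 - size_bits
}

/// True if `data` (the bytes that follow the length word) occupies a whole
/// number of 32-byte words and every byte past `len` is zero (bytes, strings).
fn is_right_padded(len: usize, data: &[u8]) -> bool {
    data.len() % 32 == 0 && len <= data.len() && data[len..].iter().all(|b| *b == 0)
}
```

For example, under these assumptions an address (160 bits) is accepted only if its first 12 bytes are zero, and a two-byte string payload is accepted only if the remaining 30 bytes of its word are zero.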

Decoding function walk through:

fe/crates/yulgen/tests/snapshots/yulgen__abi_decode_data_u256_bytes_string_bool_address_bytes_calldata_function.snap

Lines 6 to 31 in 169e95f

    
    function abi_decode_data_uint256_bytes_100_string_42_bool_address_bytes_100_calldata(head_start, data_end) -> return_val_0, return_val_1, return_val_2, return_val_3, return_val_4, return_val_5 {
        let encoding_size := sub(data_end, head_start)
        if or(lt(encoding_size, 544), gt(encoding_size, 608)) { revert(0, 0) }
        let head_offset_0 := 0
        let head_offset_1 := 32
        let head_offset_2 := 64
        let head_offset_3 := 96
        let head_offset_4 := 128
        let head_offset_5 := 160
        let decoded_val_0 := abi_decode_component_uint256_calldata(head_start, head_offset_0)
        let decoded_val_1, data_start_offset_1, data_end_offset_1 := abi_decode_component_bytes_100_calldata(head_start, head_offset_1)
        let decoded_val_2, data_start_offset_2, data_end_offset_2 := abi_decode_component_string_42_calldata(head_start, head_offset_2)
        let decoded_val_3 := abi_decode_component_bool_calldata(head_start, head_offset_3)
        let decoded_val_4 := abi_decode_component_address_calldata(head_start, head_offset_4)
        let decoded_val_5, data_start_offset_5, data_end_offset_5 := abi_decode_component_bytes_100_calldata(head_start, head_offset_5)
        if iszero(eq(data_start_offset_1, 192)) { revert(0, 0) }
        if iszero(eq(data_start_offset_2, data_end_offset_1)) { revert(0, 0) }
        if iszero(eq(data_start_offset_5, data_end_offset_2)) { revert(0, 0) }
        if iszero(eq(encoding_size, data_end_offset_5)) { revert(0, 0) }
        return_val_0 := decoded_val_0
        return_val_1 := decoded_val_1
        return_val_2 := decoded_val_2
        return_val_3 := decoded_val_3
        return_val_4 := decoded_val_4
        return_val_5 := decoded_val_5
    }

Note: At the moment, we have separate decode_data_... and decode_component_tuple_... functions. In the future when we add support for recursive encoding, we will just be using a decode_tuple_... function.

encoding size check

     let encoding_size := sub(data_end, head_start) 
     if or(lt(encoding_size, 544), gt(encoding_size, 608)) { revert(0, 0) }

We check that the size of the entire encoding fits within bounds known at compile-time (between 544 and 608 bytes for this signature).
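The bounds 544 and 608 can be reproduced by hand: six head words (192 bytes), two `bytes_100` values at 32 + 128 bytes each, and a `string_42` that takes between 32 and 32 + 64 bytes. The Rust sketch below does the same arithmetic. The `Component` enum and `size_bounds` function are hypothetical names for this illustration and are not the types used in yulgen.

```rust
const WORD: usize = 32;

/// Rounds a byte length up to the next multiple of 32.
fn padded(len: usize) -> usize {
    (len + WORD - 1) / WORD * WORD
}

/// Hypothetical description of a tuple component, for this sketch only.
enum Component {
    /// uint256, bool, address: a single head word and no data section.
    Static,
    /// u8[n]: a length word followed by exactly n bytes, right padded.
    Bytes(usize),
    /// String with maximum length n: a length word followed by 0..=n bytes.
    Str(usize),
}

/// Returns the (minimum, maximum) size in bytes of a valid encoding.
fn size_bounds(components: &[Component]) -> (usize, usize) {
    let mut min = 0;
    let mut max = 0;
    for component in components {
        // Every component occupies one word in the head section.
        min += WORD;
        max += WORD;
        match component {
            Component::Static => {}
            Component::Bytes(n) => {
                // Fixed size: length word plus padded payload.
                min += WORD + padded(*n);
                max += WORD + padded(*n);
            }
            Component::Str(n) => {
                // An empty string needs only the length word.
                min += WORD;
                max += WORD + padded(*n);
            }
        }
    }
    (min, max)
}
```

Calling `size_bounds` with `[Static, Bytes(100), Str(42), Static, Static, Bytes(100)]` yields `(544, 608)`, which matches the literals in the generated check.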

head offset section

     let head_offset_0 := 0 
     let head_offset_1 := 32 
     let head_offset_2 := 64 
     let head_offset_3 := 96 
     let head_offset_4 := 128 
     let head_offset_5 := 160

Head offsets are known at compile-time, so we just assign a literal value.

component decoding section

     let decoded_val_0 := abi_decode_component_uint256_calldata(head_start, head_offset_0) 
     let decoded_val_1, data_start_offset_1, data_end_offset_1 := abi_decode_component_bytes_100_calldata(head_start, head_offset_1) 
     let decoded_val_2, data_start_offset_2, data_end_offset_2 := abi_decode_component_string_42_calldata(head_start, head_offset_2) 
     let decoded_val_3 := abi_decode_component_bool_calldata(head_start, head_offset_3) 
     let decoded_val_4 := abi_decode_component_address_calldata(head_start, head_offset_4) 
     let decoded_val_5, data_start_offset_5, data_end_offset_5 := abi_decode_component_bytes_100_calldata(head_start, head_offset_5)

We call the decoding function for each component value. The decoding functions return the decoded value along with data section offsets for dynamically sized values.
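As a rough illustration of how the data section offsets for a dynamically sized value could be derived, the Rust sketch below reads the offset stored in the head word, then the length word at that offset, and computes where the padded payload ends. It is a sketch under assumptions: `read_word` and `dynamic_component_offsets` are hypothetical names, and the real `abi_decode_component_*` functions also return the decoded value and perform the padding and size checks.

```rust
use std::convert::TryFrom;

const WORD: usize = 32;

/// Reads the big-endian word at `offset` as a usize.
/// Returns None if the word is out of range or too large to index with.
fn read_word(data: &[u8], offset: usize) -> Option<usize> {
    let word = data.get(offset..offset.checked_add(WORD)?)?;
    // The upper 24 bytes must be zero for the value to fit in a u64.
    if word[..24].iter().any(|b| *b != 0) {
        return None;
    }
    let mut low = [0u8; 8];
    low.copy_from_slice(&word[24..]);
    usize::try_from(u64::from_be_bytes(low)).ok()
}

/// Returns (data_start_offset, data_end_offset) for the dynamic value whose
/// head word sits at `head_offset`. Offsets are relative to the head start.
fn dynamic_component_offsets(data: &[u8], head_offset: usize) -> Option<(usize, usize)> {
    let start = read_word(data, head_offset)?;
    let len = read_word(data, start)?;
    let padded_len = len.checked_add(WORD - 1)? / WORD * WORD;
    let end = start.checked_add(WORD)?.checked_add(padded_len)?;
    Some((start, end))
}
```

Note that this sketch does not itself confirm that the payload lies inside `data`; in the generated function that is covered by the final comparison of the last end offset against the total encoding size.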

data offset checks section

     if iszero(eq(data_start_offset_1, 192)) { revert(0, 0) } 
     if iszero(eq(data_start_offset_2, data_end_offset_1)) { revert(0, 0) } 
     if iszero(eq(data_start_offset_5, data_end_offset_2)) { revert(0, 0) } 
     if iszero(eq(encoding_size, data_end_offset_5)) { revert(0, 0) }

We check that the offsets returned in the previous step are consistent with one another: each data section encoding must start where the preceding one ends. The first must start where the head section ends (192 in this case, i.e. six 32-byte head words), and the last must end exactly at the total encoding size.
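The chain of comparisons can be summarized in a few lines of Rust. This is a sketch of the logic only; `offsets_are_consistent` is a hypothetical helper, and the generated Yul unrolls the loop into one `if` per dynamic component.

```rust
/// Checks that the data section is tightly packed.
///
/// `spans` holds the (start, end) offsets of each dynamically sized
/// component, in declaration order, relative to the start of the head.
fn offsets_are_consistent(
    head_size: usize,
    encoding_size: usize,
    spans: &[(usize, usize)],
) -> bool {
    // The first data section entry must begin right after the head.
    let mut expected_start = head_size;
    for (start, end) in spans {
        if *start != expected_start {
            return false;
        }
        // The next entry must begin where this one ends.
        expected_start = *end;
    }
    // Nothing may follow the last entry.
    expected_start == encoding_size
}
```

With a five-character string in the example signature, the spans are `(192, 352)`, `(352, 416)`, and `(416, 576)`, so an encoding of 576 bytes passes. Appending a single byte makes the encoding 577 bytes long and the final comparison fails, which is the kind of tampering the tests in this PR exercise.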

return section

     return_val_0 := decoded_val_0 
     return_val_1 := decoded_val_1 
     return_val_2 := decoded_val_2 
     return_val_3 := decoded_val_3 
     return_val_4 := decoded_val_4 
     return_val_5 := decoded_val_5

To-Do

- OPTIONAL: Update Spec if applicable
- Add entry to the release notes (may forgo for trivial changes)
- Clean up commit history

codecov-commenter · 2021-07-06T20:29:27Z

Codecov Report

Merging #440 (d08da22) into master (618f267) will increase coverage by 0.31%.
The diff coverage is 97.02%.

@@            Coverage Diff             @@
##           master     #440      +/-   ##
==========================================
+ Coverage   87.32%   87.64%   +0.31%     
==========================================
  Files          77       77              
  Lines        5081     5235     +154     
==========================================
+ Hits         4437     4588     +151     
- Misses        644      647       +3

Impacted Files	Coverage Δ
crates/yulgen/src/mappers/functions.rs	`97.50% <ø> (ø)`
crates/test-utils/src/lib.rs	`83.22% <84.61%> (+0.79%)`	⬆️
crates/analyzer/src/namespace/types.rs	`81.81% <86.36%> (+0.75%)`	⬆️
crates/yulgen/src/names/abi.rs	`98.64% <100.00%> (ø)`
crates/yulgen/src/operations/abi.rs	`100.00% <100.00%> (ø)`
crates/yulgen/src/operations/data.rs	`100.00% <100.00%> (ø)`
crates/yulgen/src/runtime/abi_dispatcher.rs	`100.00% <100.00%> (ø)`
crates/yulgen/src/runtime/functions/abi.rs	`99.20% <100.00%> (-0.80%)`	⬇️
crates/yulgen/src/runtime/functions/contracts.rs	`100.00% <100.00%> (ø)`
... and 5 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

g-r-a-n-t · 2021-07-13T01:21:12Z

...napshots/yulgen__abi_decode_data_u256_bytes_string_bool_address_bytes_calldata_function.snap

+    if iszero(eq(data_start_offset_1, 192)) { revert(0, 0) }
+    if iszero(eq(data_start_offset_2, data_end_offset_1)) { revert(0, 0) }
+    if iszero(eq(data_start_offset_5, data_end_offset_2)) { revert(0, 0) }
+    if iszero(eq(encoding_size, data_end_offset_5)) { revert(0, 0) }


@cburgdorf the Solidity code I looked at didn't revert with any specific values, but it looks like they plan on adding it. I'll create an issue to track this.

Oh I didn't notice until now that you had a comment on this. Anyway, 0x99 is good for now 👍

g-r-a-n-t · 2021-07-13T01:24:01Z

crates/yulgen/src/operations/abi.rs

+        max: yul::Expression,
+    },
+}
+
 /// Returns an expression that encodes the given values and returns a pointer to
 /// the encoding.
 pub fn encode<T: AbiEncoding>(types: &[T], vals: Vec<yul::Expression>) -> yul::Expression {


@sbillig I'll follow up this PR with another one tidying up these traits.

cburgdorf

Looks good to me 👍
My only comment would be to think about introducing our own panic code for when we revert from these checks and use revert_with_panic(..). I basically think we should never revert with zero data except when a user writes revert with no extra data. This should improve the debugging story for developers, as tooling can map these panic codes to a user-friendly error, which might save a developer from having to manually step through a debugger to find out where the revert came from.

cburgdorf · 2021-07-13T10:19:33Z

crates/tests/src/features.rs

+            // add a byte
+            let mut tampered_data = data.clone();
+            tampered_data.push(42);
+            harness.test_call_reverts(&mut executor, tampered_data);


I think you said that Solidity also just reverts with zero data for these checks, but I wonder if that's just a todo on their end and if we should just introduce a panic code to be used for these reverts. From a developer perspective I think reverts with zero information are very annoying, and a single panic code used for all calldata violations can already be quite helpful during debugging.

I'm pretty sure it's a todo.

Any suggestions for the specific panic code?

Using the panic code 0x99 for now.

b9fbd32
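For readers unfamiliar with panic codes, the Rust sketch below builds the revert data that Solidity uses for `Panic(uint256)`: the four-byte selector `0x4e487b71` followed by the code as a 32-byte word. Whether Fe's `revert_with_panic` produces exactly this layout is an assumption here, and `panic_revert_data` is a hypothetical helper rather than anything from this PR.

```rust
/// Selector of Solidity's `Panic(uint256)` error.
const PANIC_SELECTOR: [u8; 4] = [0x4e, 0x48, 0x7b, 0x71];

/// Builds 36 bytes of revert data: the selector, then `code` left padded
/// to a 32-byte word. Assumes the Solidity-style layout described above.
fn panic_revert_data(code: u8) -> [u8; 36] {
    let mut out = [0u8; 36];
    out[..4].copy_from_slice(&PANIC_SELECTOR);
    out[35] = code;
    out
}
```

Under that assumption, `panic_revert_data(0x99)` gives tooling a recognizable payload to map to a friendly error, instead of the empty data produced by `revert(0, 0)`.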

cburgdorf · 2021-07-13T10:25:20Z

crates/yulgen/src/operations/abi.rs

+pub fn check_left_padding(size_bits: yul::Expression, val: yul::Expression) -> yul::Statement {
+    statement! {
+        if (iszero((is_left_padded([size_bits], [val])))) {
+            (revert(0, 0))


Just re-stating what I commented above on the test code: I think we should consider introducing our own panic code and then do a revert_with_panic(<some-code>). Basically, I think the only time we should revert with zero data is when someone actually uses revert with no data. In all other circumstances we should revert with a panic code. We should go with whatever panic code Solidity uses, and if they don't use one, we make up our own.

g-r-a-n-t marked this pull request as draft June 14, 2021 21:49

g-r-a-n-t force-pushed the abi-checks branch from 76e7f52 to bdedff6 Compare June 14, 2021 21:52

g-r-a-n-t mentioned this pull request Jun 15, 2021

Bump Yultsur and fix tests #453

Merged

3 tasks

g-r-a-n-t force-pushed the abi-checks branch 2 times, most recently from 42ffdbb to 361c36d Compare June 15, 2021 16:46

g-r-a-n-t force-pushed the abi-checks branch from cc9a860 to 6c0aa01 Compare June 29, 2021 05:38

g-r-a-n-t mentioned this pull request Jul 2, 2021

ABI refactor and removal of bytes #472

Merged

3 tasks

g-r-a-n-t force-pushed the abi-checks branch from 8d93108 to 87dcec6 Compare July 2, 2021 21:21

g-r-a-n-t commented Jul 5, 2021

cburgdorf mentioned this pull request Jul 8, 2021

Implement panics #476

Merged

g-r-a-n-t commented Jul 12, 2021

g-r-a-n-t force-pushed the abi-checks branch 2 times, most recently from fb72692 to 4053b13 Compare July 12, 2021 23:49

g-r-a-n-t commented Jul 13, 2021

g-r-a-n-t force-pushed the abi-checks branch from 4053b13 to 169e95f Compare July 13, 2021 00:20

Runtime ABI decoding checks.

c1415e2

g-r-a-n-t force-pushed the abi-checks branch from 169e95f to c1415e2 Compare July 13, 2021 00:48

g-r-a-n-t marked this pull request as ready for review July 13, 2021 01:08

g-r-a-n-t commented Jul 13, 2021

g-r-a-n-t requested review from cburgdorf and sbillig July 13, 2021 01:24

cburgdorf approved these changes Jul 13, 2021

Decode checks panic with temporary code 0x99

b9fbd32

g-r-a-n-t merged commit 40d2fb0 into ethereum:master Jul 13, 2021

g-r-a-n-t mentioned this pull request Jul 13, 2021

Add calldata runtime checks #434

Closed
