
Fix unmarshal bugs #409

Merged
merged 6 commits into master on Jul 5, 2024

Conversation

@Vindaar (Collaborator) commented Jun 28, 2024

I encountered 3 issues while working on the updated Go / Rust APIs. The (now more) static nature of the Go API (using fixed-size arrays) showed me that (at least) one case succeeds that is supposed to fail. That test case is:

https://github.com/mratsim/constantine/blob/master/tests/protocol_ethereum_evm_precompiles/eip-2537/fail-add_G1_bls.json#L27-L31

The test is supposed to fail, because the X coordinate of the first point given in the input is far larger than the prime field modulus (see the string starting with a 0x10 byte). Investigating led me to realize that the BigInt parsing logic ignored the false return value of the unmarshal. The actual unmarshal call failed with the intended error that the size is too large for the target BigInt, but because we ignored that return value, the not-fully-parsed BigInt ended up being a valid point on the curve.

After fixing this issue, I noticed that the unmarshalling logic in io_limbs.nim also had an issue: it returned false even if the remaining data in the input source string was entirely zero. But of course that must be allowed (and again, this was also hidden by the bug above). I've put in a -- not that great -- manual check over all bytes still left in the input string: if they are all zero, we simply return true. We might want to do this in a different way, but at the moment I'm not sure what is appropriate.
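The intended behaviour can be sketched outside of Nim. The following Go snippet (hypothetical names like `unmarshalBE`, not Constantine's actual API) parses a big-endian byte string into fixed 64-bit limbs and accepts an over-long input only when every excess most-significant byte is zero:

```go
package main

import "fmt"

// unmarshalBE parses a big-endian byte string into fixed-size 64-bit limbs
// (little-endian limb order). A source longer than the destination is only
// accepted when all excess most-significant bytes are zero; a non-zero
// excess byte means the value cannot fit and parsing must fail.
func unmarshalBE(dst []uint64, src []byte) bool {
	maxBytes := len(dst) * 8
	if excess := len(src) - maxBytes; excess > 0 {
		for _, b := range src[:excess] {
			if b != 0 {
				return false // value genuinely too large for dst
			}
		}
		src = src[excess:] // drop the all-zero padding
	}
	for i := range dst {
		dst[i] = 0
	}
	// Fill limbs starting from the least-significant (last) source byte.
	for i := 0; i < len(src); i++ {
		b := src[len(src)-1-i]
		dst[i/8] |= uint64(b) << (8 * uint(i%8))
	}
	return true
}

func main() {
	var limbs [2]uint64          // a 16-byte destination bigint
	input := make([]byte, 18)    // 2 bytes longer than the destination
	input[17] = 0x2a             // least-significant byte
	fmt.Println(unmarshalBE(limbs[:], input), limbs[0]) // true 42

	input[0] = 0x10 // non-zero excess byte, like the failing test vector
	fmt.Println(unmarshalBE(limbs[:], input)) // false
}
```

The key point is that the boolean result must actually be checked by the caller; discarding it is precisely the bug described above.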

The other issue (to be fixed later) is that our Nim EVM precompile tests always use the Expected field's length as the result buffer length. That causes the functions to fail prematurely, which hid the actual bug described above. In other words, we got the "correct" test result by accident: the wrong result buffer size made the test fail, but for the wrong reason.

Edit: Note on {.discardable.}

We might want to reconsider whether we want to use {.discardable.} in the library. Is it perhaps too dangerous, given how easily parsing failures like these are accidentally ignored?

Because `unmarshal` is marked as `{.discardable.}`, we silently ignored the
return value of `unmarshal`, which in some test cases indicated that
the source value does not fit into the target BigInt.

If a source `src` given to `unmarshal` has most significant bytes that
are all zero beyond the size that fits into the destination BigInt, we
still return a successful parse. Previously we would return false
whenever we found more bytes than can possibly fit (this leads to
issues with e.g. the EVM precompile tests, whose inputs come as longer
strings).

Note: We can probably do the all-zero check in a better way (i.e. using
the fact that endianness does not matter and just comparing all
relevant bytes), but I'm not sure about the most elegant way to do
that, taking the dynamic sizes into account.
@mratsim (Owner) commented Jun 28, 2024

Great finds, will have a look over the weekend.

We might want to reconsider whether we want to use {.discardable.} in the library. Maybe too dangerous to accidentally ignore such parsing failures etc.?

Yes, it's fine if the destination is a fixed sized array but for var openArray, it's not OK.

# significant two bytes are all zero.
var allZero = true
for jidx in src_idx ..< dst.len:
  if src[jidx] != 0: allZero = false
mratsim (Owner) commented on this diff:

I'll have to review more in-depth that case but we can't leak the position of non-zero bytes as this breaks constant-time property.
And even leaking if something is zero is tricky.
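One branch-free shape for such a check folds every byte into an accumulator, so the loop never shortens or exits early based on the data; only the (public) length drives the control flow. A Go sketch with a hypothetical `allZeroCT` (not Constantine's code, and note the final comparison still reduces the data to one observable bit, which is the residual leak mentioned above):

```go
package main

import "fmt"

// allZeroCT ORs every byte into an accumulator with no early exit: the loop
// executes len(src) iterations regardless of where a non-zero byte sits, so
// timing does not reveal the position of non-zero data. The final `== 0`
// still leaks whether the whole input was zero.
func allZeroCT(src []byte) bool {
	var acc byte
	for _, b := range src {
		acc |= b
	}
	return acc == 0
}

func main() {
	fmt.Println(allZeroCT([]byte{0, 0, 0, 0})) // true
	fmt.Println(allZeroCT([]byte{0, 0, 1, 0})) // false
}
```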

# significant two bytes are all zero.
var allZero = true
for jidx in countdown(src_idx, 0):
  if src[jidx] != 0: allZero = false
mratsim (Owner) commented on this diff:

same remark, analyzing the data itself instead of metadata (length) breaks constant-time.

@mratsim (Owner) commented Jun 29, 2024

After careful consideration:

The test is supposed to fail, because the X coordinate of the first point given in the input is way larger than the prime field size (see the string starting with a 10 byte). Investigating led me to realize that the BigInt parsing logic ignored the false return value of the unmarshal. The actual unmarshal call failed with the intended error that the size is too large for the target BigInt. But because we ignored that return value, the not-fully-parsed BigInt ended up being a valid point on the curve.

This should be fixed, and {.discardable.} should be removed for marshalling into var openArray but can be kept for marshalling into var array[N, byte] as the latter always has buffers of the right size.

After fixing this issue, I noticed that the io_limbs.nim unmarshalling logic also had an issue, in the sense that it returns false even if the remaining data in the input source string is entirely zero. But of course that must be allowed (and again, this was also hidden by the bug above). I've put in a -- not that great -- manual check for all bytes still left in the input string. If they are all zero, we simply return true. We might want to do this in a different way, but at the moment I'm not sure what is appropriate.

The serialization there cannot shorten or extend processing based on the data read to be constant-time (https://github.com/mratsim/constantine/wiki/Constant-time-arithmetics). Only the length can be leaked.
Furthermore, this means the bigint is not canonically serialized as per octet-string specs (https://www.ietf.org/rfc/rfc3447.html#section-4.2 ).
In practice protocols will implement serialization specified with exact byte lengths mentioned and we should not make an exception here.
This happens for KZG tests here, we don't pass the input for deserialization if the length is not exact

if hexInput.len != expectedLength:
  let encodedBytes = (hexInput.len - prefixBytes) div 2
  stdout.write "[ Incorrect input length for '" &
    astToStr(dstVariable) &
    "': encoding " & $encodedBytes & " bytes" &
    " instead of expected " & $elemSize & " ]\n"

The other issue (to be fixed later) is that our Nim EVM precompiles tests always use the Expected field's length as the result buffer length. That causes the functions to prematurely fail, which hid the actual bug described above. This last bug means that we got the "correct" test result by accident, because the resulting buffer size was wrong, failing the wrong test; just for the wrong reasons.

What's the test in question?

@Vindaar (Collaborator, Author) commented Jun 29, 2024

After fixing this issue, I noticed that the io_limbs.nim unmarshalling logic also had an issue, in the sense that it returns false even if the remaining data in the input source string is entirely zero. But of course that must be allowed (and again, this was also hidden by the bug above). I've put in a -- not that great -- manual check for all bytes still left in the input string. If they are all zero, we simply return true. We might want to do this in a different way, but at the moment I'm not sure what is appropriate.

The serialization there cannot shorten or extend processing based on the data read to be constant-time (https://github.com/mratsim/constantine/wiki/Constant-time-arithmetics). Only the length can be leaked. Furthermore, this means the bigint is not canonically serialized as per octet-string specs (https://www.ietf.org/rfc/rfc3447.html#section-4.2 ). In practice protocols will implement serialization specified with exact byte lengths mentioned and we should not make an exception here.

Ok, I understand the implication for non-constant-time behavior / calculation time depending on user input with my change in io_limbs (i.e. we walk over N bytes where N is the user input length, instead of M where M is what the target BigInt requires). This should imply that we either always fail to parse if the input is longer than the target BigInt size, or that we simply truncate the input. However, it seems to me (but I may be missing something) that all the EVM precompiles tests (at least for BLS12_381 G1 add) are, strictly speaking, ill-defined. See

The Input in all tests is a 512-character string, i.e. 256 bytes, so 128 bytes per point pair and 64 bytes per coordinate. Given that BLS12_381 is defined over a prime of 381 bits, which requires 6 eight-byte limbs (6·8 = 48 bytes, 48·8 = 384 bits), we are always left with an input string that is 64-48 = 16 bytes too long, no?

What is the intended way to handle these test cases, given that both the "supposed to pass" and "supposed to fail" tests use the same input format? In particular the bls_g1add_violate_top_bytes (in the fail JSON file from above), is an interesting candidate, because if we truncate the input string after 48 bytes, the coordinates actually are on the curve, whereas the test implies such input should not pass.

Now, I don't fully grasp yet to what extent other libraries even attempt to provide constant-time guarantees; if they don't, I guess they simply don't have to wonder about what to do here.

The other issue (to be fixed later) is that our Nim EVM precompiles tests always use the Expected field's length as the result buffer length. That causes the functions to prematurely fail, which hid the actual bug described above. This last bug means that we got the "correct" test result by accident, because the resulting buffer size was wrong, failing the wrong test; just for the wrong reasons.

What's the test in question?

Essentially, it's all the EVM precompiles tests in the following file:

https://github.com/mratsim/constantine/blob/master/tests/t_ethereum_evm_precompiles.nim#L49-L54

Note how the expected length is computed based on the test.Expected field, which for all failing tests (see again the fail_add_*.json file linked above) is not even defined in the file and thus Expected is always an empty string.

I would probably adjust that test to also pass a static size for each test case (for each of the precompiles functions), which provides the static expected output size for the data. Similar to how we will handle it in Go once the updated API PR is done.

@mratsim (Owner) commented Jun 29, 2024

The Input in all tests is a 512 character string, i.e. 256 byte, so 128 bytes per point pair and 64 bytes per coordinate. Given that BLS12_381 is defined over a prime of 381 bits, which require 6 eight byte limbs (6·8 = 48 bytes, 48·8 = 384 bits), we are always left with an input string that is 64-48 = 16 bytes too long, no?

What is the intended way to handle these test cases, given that both the "supposed to pass" and "supposed to fail" tests use the same input format? In particular the bls_g1add_violate_top_bytes (in the fail JSON file from above), is an interesting candidate, because if we truncate the input string after 48 bytes, the coordinates actually are on the curve, whereas the test implies such input should not pass.

Good point.

For context, in general all parsers/codecs/serialization are protocol-specific unless the creator of the curve specifies a serialization. So in general none of the libraries agree on how to serialize elliptic curve points, a typical example is BN254 which is serialized in little-endian in some libraries.
And when a curve's encoding leaves extra bits, they get used: for example, BLS12-381 takes 48 bytes (384 bits) but has 3 bits unused, so those 3 bits are used to indicate:

  • whether we're reading a point at infinity
  • whether we're reading a compressed point, i.e. only the x coordinate, with y recovered via a square root from the curve equation y² = x³ + b
  • whether we need to take the "positive" or the "negative" field element after the square root, for whatever convention of positive/negative was chosen (for numbers in [0, p) ...)
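For illustration, one widely used assignment of those three spare bits is the ZCash serialization convention for BLS12-381, where they occupy the top bits of the first byte. A Go sketch (flag values per that convention; this is not a Constantine API, and other protocols define their own formats, as explained above):

```go
package main

import "fmt"

// Flag bits in the first byte of a BLS12-381 point encoding, following the
// ZCash serialization convention (illustrative only).
const (
	compressedFlag byte = 0x80 // bit 7: compressed point (x only)
	infinityFlag   byte = 0x40 // bit 6: point at infinity
	signFlag       byte = 0x20 // bit 5: which square root to pick for y
)

// decodeFlags extracts the three metadata bits from the first encoded byte.
func decodeFlags(first byte) (compressed, infinity, sign bool) {
	return first&compressedFlag != 0,
		first&infinityFlag != 0,
		first&signFlag != 0
}

func main() {
	c, inf, s := decodeFlags(0xC0) // compressed encoding of the infinity point
	fmt.Println(c, inf, s)         // true true false
}
```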

So Constantine implements custom parsers according to protocol specs.
A way to fix that would be to create a parseRawUint for BLS12-381 in the EVM precompile file.

@Vindaar (Collaborator, Author) commented Jul 2, 2024

So Constantine implements custom parsers according to protocol specs. A way to fix that would be to create a parseRawUint for BLS12-381 in the EVM precompile file.

Got it, so I'll write a custom parser following the details in: https://eips.ethereum.org/EIPS/eip-2537#fine-points-and-encoding-of-base-elements for BLS12-381.
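As a sketch of what those fine points amount to: EIP-2537 encodes a base field element as 64 bytes, the 16 most-significant bytes must be zero, and the remaining 48-byte big-endian integer must be below the modulus. A Go illustration (hypothetical `parseFieldElement`, not Constantine's parser, and `math/big` is not constant-time):

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// blsModulus is the BLS12-381 base field prime p (381 bits).
var blsModulus, _ = new(big.Int).SetString(
	"1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf"+
		"6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab", 16)

// parseFieldElement enforces the EIP-2537 encoding rules for one base field
// element: exactly 64 bytes, top 16 bytes zero, value strictly below p.
func parseFieldElement(src []byte) (*big.Int, error) {
	if len(src) != 64 {
		return nil, errors.New("field element must be exactly 64 bytes")
	}
	for _, b := range src[:16] {
		if b != 0 {
			return nil, errors.New("top 16 bytes must be zero")
		}
	}
	v := new(big.Int).SetBytes(src[16:])
	if v.Cmp(blsModulus) >= 0 {
		return nil, errors.New("value not below the field modulus")
	}
	return v, nil
}

func main() {
	var buf [64]byte
	_, err := parseFieldElement(buf[:]) // all-zero element is valid
	fmt.Println(err == nil)             // true

	buf[0] = 0x10 // non-zero top byte, like the violate_top_bytes vector
	_, err = parseFieldElement(buf[:])
	fmt.Println(err != nil) // true
}
```

Note how this rejects the `bls_g1add_violate_top_bytes` style of input at the padding check, before the coordinate is ever interpreted as a point.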

@Vindaar (Collaborator, Author) commented Jul 3, 2024

I've now added custom parsing logic for BLS12-381 in the EVM precompiles file (and reverted the additional zero checks in io_limbs.nim), see 51d64b5.

Also, the EVM precompiles tests should now all correctly pass or fail based on the test case. We pass the result size explicitly in the test cases now.

# Now check that all lower bytes are empty
var allZero = success # if `success` already false we still continue
for i in 0 ..< 15: # order irrelevant
  allZero = allZero and (src[i] == 0)
mratsim (Owner) commented on this diff:

As mentioned in discussion, even if we don't require constant-time here, it is fine not to return early on a non-zero.

The difference is likely a couple of nanoseconds, and this is an exceptional path anyway.
Furthermore, with an exact 16-byte check the compiler can transform this into a vector load plus a vector zero check, which may very well be faster than having an if branch in a hot path.
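The vector-friendly shape of that check can be sketched as follows (in Go for illustration; the actual code under review is Nim): with an exact 16-byte input, the bytes fold into two 64-bit loads and a single OR, leaving no per-byte branch for the compiler to keep.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// isZero16 checks an exact 16-byte buffer by folding it into two 64-bit
// words and ORing them. Branch-free over the data, so the compiler is free
// to emit a single vector load and compare.
func isZero16(src *[16]byte) bool {
	lo := binary.LittleEndian.Uint64(src[0:8])
	hi := binary.LittleEndian.Uint64(src[8:16])
	return lo|hi == 0
}

func main() {
	var b [16]byte
	fmt.Println(isZero16(&b)) // true
	b[9] = 1
	fmt.Println(isZero16(&b)) // false
}
```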

@mratsim mratsim merged commit ad325cf into master Jul 5, 2024
24 checks passed
@mratsim mratsim deleted the fixUnmarshalBugs branch July 5, 2024 10:39