-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs(yellow-paper): Add pseudocode for verifying broadcasted functions in contract deployment #4431
Changes from 3 commits
8bbf44a
48b2464
03ce7c9
d387588
30ec77d
1ba6e29
36d9e17
c151b04
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -75,12 +75,14 @@ unconstrained_functions_artifact_tree_root = merkleize(unconstrained_functions_a | |
artifact_hash = sha256( | ||
private_functions_artifact_tree_root, | ||
unconstrained_functions_artifact_tree_root, | ||
artifact_metadata, | ||
artifact_metadata_hash, | ||
) | ||
``` | ||
|
||
For the artifact hash merkleization and hashing is done using sha256, since it is computed and verified outside of circuits and does not need to be SNARK friendly. Fields are left-padded with zeros to 256 bits before being hashed. Function leaves are sorted in ascending order before being merkleized, according to their function selectors. Note that a tree with dynamic height is built instead of having a tree with a fixed height, since the merkleization is done out of a circuit. | ||
|
||
<!-- TODO: Sure, sha256 is nice, but its output does not fit in a single field. Is it ok to wrap around the field modulus? Should we use sha224 instead? Should we use pedersen (or poseidon) everywhere instead? --> | ||
|
||
Bytecode for private functions is a mix of ACIR and Brillig, whereas unconstrained function bytecode is Brillig exclusively, as described on the [bytecode section](../bytecode/index.md). | ||
|
||
The metadata hash for each function is suggested to be computed as the sha256 of all JSON-serialized fields in the function struct of the compilation artifact, except for bytecode and debug symbols. The metadata is JSON-serialized using no spaces, and sorting ascending all keys in objects before serializing them. | ||
|
@@ -124,13 +126,16 @@ In pseudocode: | |
function register( | ||
artifact_hash: Field, | ||
private_functions_root: Field, | ||
public_bytecode_commitment: Field, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will need to be an encoding of a point, so at least 1 Field + 1 bit. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can't we use the "all points are represented on one side of the curve" to avoid that extra bit? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know 😅 Potentially. We'd need to check with the crypto team, I think. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Redefining it to be a Point for now, until we know for sure what shape it'll have |
||
packed_public_bytecode: Field[], | ||
) | ||
assert is_valid_packed_public_bytecode(packed_public_bytecode) | ||
|
||
version = 1 | ||
bytecode_commitment = calculate_commitment(packed_public_bytecode) | ||
contract_class_id = pedersen([version, artifact_hash, private_functions_root, bytecode_commitment], GENERATOR__CLASS_IDENTIFIER) | ||
|
||
assert is_valid_packed_public_bytecode(packed_public_bytecode) | ||
computed_bytecode_commitment = calculate_commitment(packed_public_bytecode) | ||
assert public_bytecode_commitment == computed_bytecode_commitment | ||
|
||
contract_class_id = pedersen([version, artifact_hash, private_functions_root, computed_bytecode_commitment], GENERATOR__CLASS_IDENTIFIER) | ||
|
||
emit_nullifier contract_class_id | ||
emit_unencrypted_event ContractClassRegistered(contract_class_id, version, artifact_hash, private_functions_root, packed_public_bytecode) | ||
|
@@ -157,13 +162,13 @@ Broadcasted contract artifacts that do not match with their corresponding `artif | |
``` | ||
function broadcast_all_private_functions( | ||
contract_class_id: Field, | ||
artifact_metadata: Field, | ||
artifact_metadata_hash: Field, | ||
unconstrained_functions_artifact_tree_root: Field, | ||
functions: { selector: Field, metadata: Field, vk_hash: Field, bytecode: Field[] }[], | ||
functions: { selector: Field, metadata_hash: Field, vk_hash: Field, bytecode: Field[] }[], | ||
) | ||
emit_unencrypted_event ClassPrivateFunctionsBroadcasted( | ||
contract_class_id, | ||
artifact_metadata, | ||
artifact_metadata_hash, | ||
unconstrained_functions_artifact_tree_root, | ||
functions, | ||
) | ||
|
@@ -172,19 +177,19 @@ function broadcast_all_private_functions( | |
``` | ||
function broadcast_all_unconstrained_functions( | ||
contract_class_id: Field, | ||
artifact_metadata: Field, | ||
artifact_metadata_hash: Field, | ||
private_functions_artifact_tree_root: Field, | ||
functions:{ selector: Field, metadata: Field, bytecode: Field[] }[], | ||
functions:{ selector: Field, metadata_hash: Field, bytecode: Field[] }[], | ||
) | ||
emit_unencrypted_event ClassUnconstrainedFunctionsBroadcasted( | ||
contract_class_id, | ||
artifact_metadata, | ||
artifact_metadata_hash, | ||
unconstrained_functions_artifact_tree_root, | ||
functions, | ||
) | ||
``` | ||
|
||
<!-- TODO: What representation of bytecode can we use here? --> | ||
<!-- TODO: What representation of bytecode can we use here? --> | ||
|
||
The broadcast functions are split between private and unconstrained to allow for private bytecode to be broadcasted, which is valuable for composability purposes, without having to also include unconstrained functions, which could be costly to do due to data broadcasting costs. Additionally, note that each broadcast function must include enough information to reconstruct the `artifact_hash` from the Contract Class, so nodes can verify it against the one previously registered. | ||
|
||
|
@@ -193,35 +198,70 @@ The `ContractClassRegisterer` contract also allows broadcasting individual funct | |
``` | ||
function broadcast_private_function( | ||
contract_class_id: Field, | ||
artifact_metadata: Field, | ||
artifact_metadata_hash: Field, | ||
unconstrained_functions_artifact_tree_root: Field, | ||
function_leaf_sibling_path: Field, | ||
function: { selector: Field, metadata: Field, vk_hash: Field, bytecode: Field[] }, | ||
private_function_tree_sibling_path: Field[], | ||
artifact_function_tree_sibling_path: Field[], | ||
function: { selector: Field, metadata_hash: Field, vk_hash: Field, bytecode: Field[] }, | ||
) | ||
emit_unencrypted_event ClassPrivateFunctionBroadcasted( | ||
contract_class_id, | ||
artifact_metadata, | ||
artifact_metadata_hash, | ||
unconstrained_functions_artifact_tree_root, | ||
function_leaf_sibling_path, | ||
private_function_tree_sibling_path, | ||
artifact_function_tree_sibling_path, | ||
function, | ||
) | ||
``` | ||
|
||
``` | ||
function broadcast_unconstrained_function( | ||
contract_class_id: Field, | ||
artifact_metadata: Field, | ||
artifact_metadata_hash: Field, | ||
private_functions_artifact_tree_root: Field, | ||
function_leaf_sibling_path: Field, | ||
function: { selector: Field, metadata: Field, bytecode: Field[] }[], | ||
artifact_function_tree_sibling_path: Field[], | ||
function: { selector: Field, metadata_hash: Field, bytecode: Field[] }[], | ||
) | ||
emit_unencrypted_event ClassUnconstrainedFunctionBroadcasted( | ||
contract_class_id, | ||
artifact_metadata, | ||
unconstrained_functions_artifact_tree_root, | ||
function_leaf_sibling_path: Field, | ||
artifact_metadata_hash, | ||
private_functions_artifact_tree_root, | ||
artifact_function_tree_sibling_path, | ||
function, | ||
) | ||
``` | ||
|
||
A node that captures a `ClassPrivateFunctionBroadcasted` should perform the following validation steps before storing the private function information in its database: | ||
|
||
``` | ||
// Load contract class from local db | ||
contract_class = db.get_contract_class(contract_class_id) | ||
|
||
// Compute function leaf and assert it belongs to the private functions tree | ||
function_leaf = pedersen([selector as Field, vk_hash], GENERATOR__FUNCTION_LEAF) | ||
computed_private_function_tree_root = compute_root(function_leaf, private_function_tree_sibling_path) | ||
assert computed_private_function_tree_root == contract_class.private_function_root | ||
|
||
// Compute artifact leaf and assert it belongs to the artifact | ||
artifact_function_leaf = sha256(selector, metadata_hash, sha256(bytecode)) | ||
computed_artifact_private_function_tree_root = compute_root(artifact_function_leaf, artifact_function_tree_sibling_path) | ||
computed_artifact_hash = sha256(computed_artifact_private_function_tree_root, unconstrained_functions_artifact_tree_root, artifact_metadata_hash) | ||
assert computed_artifact_hash == contract_class.artifact_hash | ||
``` | ||
|
||
<!-- TODO: Requiring two sibling paths isn't nice. This is because we are splitting private function information across two trees: one for the protocol, that deals only with selectors and vk hashes, and one for the artifact, which deals with bytecode and metadata. If we are fine adding a `function_stuff_hash` to the function leaf that goes into the protocol tree, we could get rid of the second sibling path, but that introduces stuff into the private function tree that is not strictly needed and requires unnecessary hashing in the kernel. --> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I reckon aiming to reduce any in-circuit hashing is the best approach, here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. Moved this to a "discarded approaches" section. |
||
|
||
The check for an unconstrained function is similar: | ||
|
||
``` | ||
// Load contract class from local db | ||
contract_class = db.get_contract_class(contract_class_id) | ||
|
||
// Compute artifact leaf and assert it belongs to the artifact | ||
artifact_function_leaf = sha256(selector, metadata_hash, sha256(bytecode)) | ||
computed_artifact_unconstrained_function_tree_root = compute_root(artifact_function_leaf, artifact_function_tree_sibling_path) | ||
computed_artifact_hash = sha256(private_functions_artifact_tree_root, computed_artifact_unconstrained_function_tree_root, artifact_metadata_hash) | ||
assert computed_artifact_hash == contract_class.artifact_hash | ||
``` | ||
|
||
It is strongly recommended for developers registering new classes to broadcast the code for `compute_hash_and_nullifier`, so any private message recipients have the code available to process their incoming notes. However, the `ContractClassRegisterer` contract does not enforce this during registration, since it is difficult to check the multiple signatures for `compute_hash_and_nullifier` as they may evolve over time to account for new note sizes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I think wrapping around the modulus should be ok in this case, because that should still give us collision resistance. But perhaps "Poseidon everywhere" is an easier approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I'll still leave a TODO for verifying it with the crypto team, just in case.
As for not padding to 256 bits, that's good to know! Still, I think it's best to pad anyway for consistency: pretty much everywhere we represent fields using 32 bytes.