Unify Constrain Errors #4239

vezenovm · 2024-02-02T15:35:38Z

Problem

This TODO was added in #4101. This is due to having two different strategies for resolving constrain instruction error messages which can be seen in this enum (https://github.com/noir-lang/noir/blob/afcb385daa572f990178eda51faf10dc20acd2d0/compiler/noirc_evaluator/src/ssa/ir/instruction.rs#L5730). We have two different strategies as sometimes when doing SSA codegen we want to include a constrain with a message, but we can't easily codegen a call to resolve_assert_message in the same way we do for user provided assertion messages. Thus, we have this assert_messages map for resolving messages specified by the compiler and separate calls to resolve_assert_message for handling assert messages specified by the user.

Happy Case

We should unify the strategy for specifying errors and error types. Ideally we would remove the assert_messages map from the Circuit type. We most likely could instead store errors and their respective types on the ABI rather than on the circuit. Upon circuit failure we can then use the ABI for fetching the appropriate error.

Some example pseudocode written by @TomAFrench from when we were originally planning #4101.

// This defines a class of errors, e.g. overflows
// This avoids us duplicating the definition of how to format this error message many times.
struct ErrorType {
    format_string: String
    types: Vec<AbiType>
}

// This defines a specific instance.
struct Error {
    type_id: u32
    inputs: Vec<(string, Range<Witness>)>
}

struct ABI {
    ...
    error_types: HashMap<u32, ErrorType>,
    errors: HashMap<u32, Error>
    ...
}

Unifying these constrain errors would essentially require full structured errors as part of the ABI. We could then have followup work to enable custom errors inside Noir similar to https://soliditylang.org/blog/2021/04/21/custom-errors/.

Alternatives Considered

No response

Additional Context

No response

Would you like to submit a PR for this Issue?

No

Support Needs

No response

The text was updated successfully, but these errors were encountered:

sirasistant · 2024-04-18T14:45:01Z

Dynamic assert payloads

Current state

Consider the following code containing a "dynamic" assert payload:

fn option_expect<T, Err>(opt: Option<T>, err: Err) -> T {
    assert(opt.is_some(), err);
    opt.unwrap_unchecked()
}

fn main(option: Option<u8>) -> pub u8 {
    option_expect(option, "option is none")
}

vs the static assert payload version of the code:

fn main(option: Option<u8>) -> pub u8 {
    assert(option.is_some(), "option is none");
    option.unwrap_unchecked()
}

This is the generated SSA for the static version:

Initial SSA:
acir(inline) fn main f0 {
  b0(v0: u1, v1: u8):
    v3 = call f1(v0, v1)
    constrain v3 == u1 1 "option is none"
    v6 = call f2(v0, v1)
    return v6
}

And this one for the "dynamic" version:

Initial SSA:
acir(inline) fn main f0 {
  b0(v0: u1, v1: u8):
    v12 = call f1(v0, v1, [u8 111, u8 2⁴×7, u8 116, u8 105, u8 111, u8 110, u8 2⁵, u8 105, u8 115, u8 2⁵, u8 110, u8 111, u8 110, u8 101])
    return v12
}
acir(inline) fn option_expect f1 {
  b0(v0: u1, v1: u8, v2: [u8; 14]):
    inc_rc v2
    v4 = call f2(v0, v1)
    v8 = call f2(v0, v1)
    inc_rc v2
    constrain v4 == u1 1 call f3(v2, v8)
    v10 = call f4(v0, v1)
    dec_rc v2
    return v10
}
acir(inline) fn is_some f2 {
  b0(v0: u1, v1: u8):
    return v0
}
brillig fn resolve_assert_message f3 {
  b0(v0: [u8; 14], v1: u1):
    inc_rc v0
    v2 = not v1
    jmpif v2 then: b1, else: b2
  b1():
    call v3(v0, [u8 123, u8 34, u8 107, u8 105, u8 110, u8 100, u8 34, u8 58, u8 34, u8 115, u8 116, u8 114, u8 105, u8 110, u8 103, u8 34, u8 44, u8 34, u8 108, u8 101, u8 110, u8 103, u8 116, u8 104, u8 34, u8 58, u8 49, u8 52, u8 125], u1 0)
    jmp b2()
  b2():
    dec_rc v0
    return 
}
acir(inline) fn unwrap_unchecked f4 {
  b0(v0: u1, v1: u8):
    return v1
}

The dynamic version has undergone some modifications before conversion to SSA. Let's explain what are those:

Modifications at HIR

        if matches!(
            assert_message_expr,
            Expression { kind: ExpressionKind::Literal(Literal::Str(..)), .. }
        ) {
            return Some(self.resolve_expression(assert_message_expr));
        }

        let is_in_stdlib = self.path_resolver.module_id().krate.is_stdlib();
        let assert_msg_call_path = if is_in_stdlib {
            ExpressionKind::Variable(Path {
                segments: vec![Ident::from("internal"), Ident::from("resolve_assert_message")],
                kind: PathKind::Crate,
                span,
            })
        } else {
            ExpressionKind::Variable(Path {
                segments: vec![
                    Ident::from("std"),
                    Ident::from("internal"),
                    Ident::from("resolve_assert_message"),
                ],
                kind: PathKind::Dep,
                span,
            })
        };
        let assert_msg_call_args = vec![assert_message_expr.clone(), condition];
        let assert_msg_call_expr = Expression::call(
            Expression { kind: assert_msg_call_path, span },
            assert_msg_call_args,
            span,
        );
        Some(self.resolve_expression(assert_msg_call_expr))

At the HIR level, if the assert message is not a literal string, a call to resolve_assert_message is introduced. This is the implementation of resolve_assert_message:

#[oracle(assert_message)]
unconstrained fn assert_message_oracle<T>(_input: T) {}
unconstrained pub fn resolve_assert_message<T>(input: T, condition: bool) {
    if !condition {
        assert_message_oracle(input);
    }
}

This explains this added function in the SSA:

...
    constrain v4 == u1 1 call f3(v2, v8)
...

brillig fn resolve_assert_message f3 {
  b0(v0: [u8; 14], v1: u1):
    inc_rc v0
    v2 = not v1
    jmpif v2 then: b1, else: b2
  b1():
    call v3(v0, [u8 123, u8 34, u8 107, u8 105, u8 110, u8 100, u8 34, u8 58, u8 34, u8 115, u8 116, u8 114, u8 105, u8 110, u8 103, u8 34, u8 44, u8 34, u8 108, u8 101, u8 110, u8 103, u8 116, u8 104, u8 34, u8 58, u8 49, u8 52, u8 125], u1 0)
    jmp b2()
  b2():
    dec_rc v0
    return 
}

Modifications at monomorphization

After monomorphization, the assert_message oracle call is modified with metadata about the type it's being called with, in a similar way as the print opcode.

        if let ast::Expression::Ident(ident) = original_func.as_ref() {
            if let Definition::Oracle(name) = &ident.definition {
                if name.as_str() == "print" {
                    // Oracle calls are required to be wrapped in an unconstrained function
                    // The first argument to the `print` oracle is a bool, indicating a newline to be inserted at the end of the input
                    // The second argument is expected to always be an ident
                    self.append_printable_type_info(&hir_arguments[1], &mut arguments);
                } else if name.as_str() == "assert_message" {
                    // The first argument to the `assert_message` oracle is the expression passed as a message to an `assert` or `assert_eq` statement
                    self.append_printable_type_info(&hir_arguments[0], &mut arguments);
                }
            }
        }

The metadata is the serialized ABI of the type:

    fn append_printable_type_info_inner(typ: &Type, arguments: &mut Vec<ast::Expression>) {
        // Disallow printing slices and mutable references for consistency,
        // since they cannot be passed from ACIR into Brillig
        if matches!(typ, HirType::MutableReference(_)) {
            unreachable!("println and format strings do not support mutable references.");
        }

        let printable_type: PrintableType = typ.into();
        let abi_as_string = serde_json::to_string(&printable_type)
            .expect("ICE: expected PrintableType to serialize");

        arguments.push(ast::Expression::Literal(ast::Literal::Str(abi_as_string)));
    }

This explains the call to v3 (assert_message) in SSA:

brillig fn resolve_assert_message f3 {
  b0(v0: [u8; 14], v1: u1):
    inc_rc v0
    v2 = not v1
    jmpif v2 then: b1, else: b2
  b1():
    call v3(v0, [u8 123, u8 34, u8 107, u8 105, u8 110, u8 100, u8 34, u8 58, u8 34, u8 115, u8 116, u8 114, u8 105, u8 110, u8 103, u8 34, u8 44, u8 34, u8 108, u8 101, u8 110, u8 103, u8 116, u8 104, u8 34, u8 58, u8 49, u8 52, u8 125], u1 0)
    jmp b2()
  b2():
    dec_rc v0
    return 
}

As can be seen here, the oracle call to assert_message includes not only the string v0 but also a literal byte array, that being the abi_as_string.

Compilation and runtime

This strategy will make it so when evaluating this instruction constrain v4 == u1 1 call f3(v2, v8) (the assertion on is_some()) a brillig call will be issued to a resolve_assert_message with the string and the condition. Then the brillig function will conditionally call the assert_message oracle with the string + extra metadata if the condition is false.

But how is this assert_message oracle related with the original assertion?
The runtime (acvm_js and nargo) recognizes this special oracle, parses the type using the metadata, stores the parsed result, and when the circuit fails, it has in cache the last call to assert_message.

Comparison with static assert messages

With static assert messages, no transformations happen on the frontend. The generated SSA

Initial SSA:
acir(inline) fn main f0 {
  b0(v0: u1, v1: u8):
    v3 = call f1(v0, v1)
    constrain v3 == u1 1 "option is none"
    v6 = call f2(v0, v1)
    return v6
}

gets compiled to ACIR and in compilation to ACIR, the literal string is stored in a Map<OpcodeLocation, String> inside the circuit. So when the circuit fails, the runtime just picks what String is stored there for the failing opcode.

If the function is brillig, instead of being added to the Map<OpcodeLocation, String> the error string is directly returned as revert data.

Challenges with current state

This approach, altough a bit fragile (since modifications in SSA assertions need to account for dynamic assertions, or bugs like this one can happen) works fine for circuits. But it poses some challenges for public functions in aztec:

The runtime is required to know about dynamic assertions, storing the error payload for a later error. This is quite problematic for the implementation of the VM, since it would need to emulate what acvm_js is doing right now.
Metadata about the type is embedded in the bytecode. This would increase bytecode size and runtime cost for public functions, with this data only being relevant for tooling.
There is a small extra cost of adding an extra function call for constrains, but it should be easily removed by inlining.

Alternative approach

Instead of using an entirely new approach for dynamic assertion messages, we can follow the same approach that we have for static assertion messages.
The Map<OpcodeLocation, String> that the circuit struct holds could be extended to be a Map<OpcodeLocation, AssertionData>
The type AssertionData would be:

struct AssertionData {
    typing_metadata: PrintableType, // This is what is currently serialized into the bytecode.
    payload: AssertionPayload
};

enum AssertionPayload {
    Witness(Vec<Expression>),
    MemoryArray(BlockId),
    BrilligOutput,  // The brillig VM returns the data directly in revert data
}

So the example code:

fn option_expect<T, Err>(opt: Option<T>, err: Err) -> T {
    assert(opt.is_some(), err);
    opt.unwrap_unchecked()
}

fn main(option: Option<u8>) -> pub u8 {
    option_expect(option, "option is none")
}

would generate the following SSA:

Initial SSA:
acir(inline) fn main f0 {
  b0(v0: u1, v1: u8):
    v12 = call f1(v0, v1, [u8 111, u8 2⁴×7, u8 116, u8 105, u8 111, u8 110, u8 2⁵, u8 105, u8 115, u8 2⁵, u8 110, u8 111, u8 110, u8 101])
    return v12
}
acir(inline) fn option_expect f1 {
  b0(v0: u1, v1: u8, v2: [u8; 14]):
    inc_rc v2
    v4 = call f2(v0, v1)
    inc_rc v2
    constrain v4 == u1 1 v2 // Notice here constrain directly holds an SSA value
    v10 = call f4(v0, v1)
    dec_rc v2
    return v10
}
acir(inline) fn is_some f2 {
  b0(v0: u1, v1: u8):
    return v0
}
acir(inline) fn unwrap_unchecked f4 {
  b0(v0: u1, v1: u8):
    return v1
}

ACIR generation and Brillig generation need access to the PrintableType metadata that is generated from the HIR for all non-string assert payloads. It could be maintained in parallel or directly embedded in the constrain instruction in SSA, altough the latter sounds worse. This is necessary since we need to build the AssertionData defined above.

When an opcode fails, the ACVM immediately has enough data to:

Extract the payload of the failing opcode
Know the types of the failing opcode payload and how they could be printed for an user.

Advantages of this approach

Supporting static non-string assertion payloads
Removes any notion of PrintableType from the bytecode, maintaining metadata outside.
No more need to cache oracle calls to figure out the payload for a failing assertion.
The constrain instruction just holds a regular SSA value. No need to special case it for everything like we have to do with the call to resolve_assert_message that has a parallel version of the constrain condition, etc.
The distinction of static constrain payload or dynamic constrain payload happens after SSA optimization. So any payload that can be determined static via optimization wouldn't end up in the bytecode, unlike now (in fact my example above should be determined constant at inlining!)
AVM support would now be easy, since no extraneous oracles have been added. Brillig gen just has to properly serialize the passed in type (possibly using the same serialization strategy as oracle calls have).

Updates to the proposal:

No metadata in the circuit struct

We can avoid storing metadata about assertion payload types in the circuit struct if we put it in the ABI.
The circuit struct will have a map of Map<OpcodeLocation, AssertionPayload> instead of Map<OpcodeLocation, AssertionData>. The metadata about the type of the error payload will live in the ABI instead. Nargo and other runners are then responsible to properly format the error data.

TomAFrench · 2024-04-18T18:04:46Z

No more need to cache oracle calls to figure out the payload for a failing assertion.

I think that's unnecessary now anyway now we have revert data in brillig so this is a constant between all proposals. This was just a hack in the short term.

This PR implements this proposal: noir-lang/noir#4239 (comment) - Removes legacy dynamic assertions via calling brillig + oracle - Removes legacy oracle handling for assertions - Frontend adds the HirType in the ConstrainError instead of embedding metadata about it in the program logic. - Constrain instruction in SSA now has `Values` for payload - SSA gen generates an error selector from the payload type - ACIR gen and Brillig gen handle the payload ValueIds and error selector - ABI has now error_types for non-string errors - ACVM now resolves the payload for non-constant AssertionPayloads - Nargo decodes the error for non-string errors using the ABI, allowing use in nargo tests Things to do in a followup PR: - Add an entry point in noirc_abi_wasm that allows decoding a given `Raw` (non-string) using the ABI in the same way that nargo does.

TomAFrench · 2024-04-26T14:15:41Z

Closing this after AztecProtocol/aztec-packages#5949, we can reopen an issue for the next iteration.

This PR implements this proposal: noir-lang/noir#4239 (comment) - Removes legacy dynamic assertions via calling brillig + oracle - Removes legacy oracle handling for assertions - Frontend adds the HirType in the ConstrainError instead of embedding metadata about it in the program logic. - Constrain instruction in SSA now has `Values` for payload - SSA gen generates an error selector from the payload type - ACIR gen and Brillig gen handle the payload ValueIds and error selector - ABI has now error_types for non-string errors - ACVM now resolves the payload for non-constant AssertionPayloads - Nargo decodes the error for non-string errors using the ABI, allowing use in nargo tests Things to do in a followup PR: - Add an entry point in noirc_abi_wasm that allows decoding a given `Raw` (non-string) using the ABI in the same way that nargo does.

vezenovm added the enhancement New feature or request label Feb 2, 2024

github-project-automation bot added this to Noir Feb 2, 2024

github-project-automation bot moved this to 📋 Backlog in Noir Feb 2, 2024

vezenovm mentioned this issue Feb 2, 2024

feat: Evaluation of dynamic assert messages #4101

Merged

5 tasks

Savio-Sou assigned vezenovm Feb 23, 2024

vezenovm mentioned this issue Apr 15, 2024

feat!: trap with revert data AztecProtocol/aztec-packages#5732

Merged

sirasistant self-assigned this Apr 22, 2024

sirasistant mentioned this issue Apr 24, 2024

feat: Dynamic assertion payloads v2 AztecProtocol/aztec-packages#5949

Merged

TomAFrench closed this as completed Apr 26, 2024

github-project-automation bot moved this from 📋 Backlog to ✅ Done in Noir Apr 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unify Constrain Errors #4239

Unify Constrain Errors #4239

vezenovm commented Feb 2, 2024 •

edited

Loading

sirasistant commented Apr 18, 2024 •

edited

Loading

Modifications at HIR

Modifications at monomorphization

Compilation and runtime

Comparison with static assert messages

TomAFrench commented Apr 18, 2024

TomAFrench commented Apr 26, 2024

Unify Constrain Errors #4239

Unify Constrain Errors #4239

Comments

vezenovm commented Feb 2, 2024 • edited Loading

Problem

Happy Case

Alternatives Considered

Additional Context

Would you like to submit a PR for this Issue?

Support Needs

sirasistant commented Apr 18, 2024 • edited Loading

Dynamic assert payloads

Current state

Modifications at HIR

Modifications at monomorphization

Compilation and runtime

Comparison with static assert messages

Challenges with current state

Alternative approach

Advantages of this approach

Updates to the proposal:

No metadata in the circuit struct

TomAFrench commented Apr 18, 2024

TomAFrench commented Apr 26, 2024

vezenovm commented Feb 2, 2024 •

edited

Loading

sirasistant commented Apr 18, 2024 •

edited

Loading