Algod: Additional simulation result information #4439

jdtzmn · 2022-08-19T23:52:03Z

Summary

This PR builds upon #4322 by exposing additional evaluation details with the help of the new debugger hooks.

Specifically, the simulate endpoint's response is more structured: it includes a FailedAt pointer to the transaction (including inner transactions) where evaluation failed, as well as each transaction's ApplyData. If a transaction fails, its ApplyData is included with as much detail as was available at the point of failure.

As a minor note, the simulate endpoint can now be configured to encode its response as either JSON or msgpack. By default, JSON will be used.

Test Plan

Many unit tests were added to ensure the more detailed response structure behaves as expected.

daemon/algod/api/server/v2/test/helpers.go

bbroder-algo · 2023-02-17T18:24:32Z

daemon/algod/api/server/v2/handlers_test.go

+	t.Helper()
+
+	if seen[node] {
+		return


cycle blocker
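For readers unfamiliar with the pattern, the `seen` check in the hunk above can be sketched as a visited set guarding a recursive graph walk. This is a minimal illustration, not the PR's helper: the `node` type, `walk`, and `collectNames` are hypothetical names, and the real response-graph type is not shown in this diff.

```go
package main

import "fmt"

// node is a hypothetical graph node; the real test helper's type is not shown here.
type node struct {
	name     string
	children []*node
}

// walk visits every node reachable from n exactly once, even if the graph has cycles.
func walk(n *node, seen map[*node]bool, visit func(*node)) {
	if n == nil || seen[n] {
		return // cycle blocker: this node was already visited
	}
	seen[n] = true
	visit(n)
	for _, c := range n.children {
		walk(c, seen, visit)
	}
}

// collectNames returns the names of all reachable nodes in depth-first order.
func collectNames(root *node) []string {
	var names []string
	walk(root, map[*node]bool{}, func(n *node) { names = append(names, n.name) })
	return names
}

func main() {
	a := &node{name: "a"}
	b := &node{name: "b"}
	a.children = []*node{b}
	b.children = []*node{a} // back edge creates a cycle

	fmt.Println(collectNames(a))
}
```

Without the `seen` check, the back edge from `b` to `a` would make the recursion loop forever.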

bbroder-algo · 2023-02-17T18:26:33Z

daemon/algod/api/server/v2/handlers_test.go

+		preEncodedTxPath(generatedResponseGraph).children = preEncodedTxPath(customResponseGraph).children
+	}
+
+	generatedResponseGraph.AssertEquals(t, customResponseGraph)


elaborate but helpful

bbroder-algo · 2023-02-17T18:45:48Z

ledger/simulation/simulator.go

 	return vb, missingSignatures, err
 }
+
+// Simulate simulates a transaction group using the simulator. Will error if the transaction group is not well-formed.
+func (s Simulator) Simulate(txgroup []transactions.SignedTxn) (Result, error) {


ledger/simulation/trace.go

ledger/simulation/tracer_test.go

algochoi

TIL there was a linter rule to favor ++ instead of += 1.

jannotti · 2023-02-23T16:11:34Z

daemon/algod/api/algod.oas2.json

+        "failed-at": {
+          "description": "If present, indicates which transaction in this group caused the failure",
+          "type": "array",


Why is this an array? The description appears to say it will be a single integer.

It's an array because we show you the path to the failing txn. E.g., if the second txn is an app call whose third inner failed, failed-at would be [1,2].

Can we explain that succinctly(?) in the description? Perhaps it's too much.

I made an attempt in 9557a66, let me know what you think

I think including that in the description is nice - the endpoint only returns the first error so it could be ambiguous whether failed-at is reporting every error in the group, or the path to the failed transaction.
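To make the path semantics discussed in this thread concrete, here is a rough sketch of how a client might follow a `failed-at` path to the failing transaction. The `txnResult` type and `resolveFailedAt` function are hypothetical simplifications; the actual response types generated from the spec will differ.

```go
package main

import "fmt"

// txnResult is a hypothetical, simplified stand-in for a simulated transaction result.
type txnResult struct {
	label  string
	inners []txnResult
}

// resolveFailedAt follows a failed-at path. The first element indexes the
// top-level transaction group; each later element indexes the inner
// transactions of the transaction selected so far.
func resolveFailedAt(group []txnResult, path []uint64) (*txnResult, error) {
	if len(path) == 0 {
		return nil, fmt.Errorf("empty failed-at path")
	}
	level := group
	var cur *txnResult
	for depth, idx := range path {
		if idx >= uint64(len(level)) {
			return nil, fmt.Errorf("index %d out of range at depth %d", idx, depth)
		}
		cur = &level[idx]
		level = cur.inners
	}
	return cur, nil
}

func main() {
	group := []txnResult{
		{label: "pay"},
		{label: "app call", inners: []txnResult{
			{label: "inner 0"}, {label: "inner 1"}, {label: "inner 2"},
		}},
	}

	// [1,2]: the second txn's third inner transaction.
	failed, err := resolveFailedAt(group, []uint64{1, 2})
	if err != nil {
		panic(err)
	}
	fmt.Println(failed.label) // prints "inner 2"
}
```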

jannotti · 2023-02-23T16:15:14Z

daemon/algod/api/algod.oas2.json

-          "missing-signatures": {
-            "description": "\\[ms\\] Whether any transactions would have failed during a live broadcast because they were missing signatures.",
+          "would-succeed": {
+            "description": "Indicates whether the simulated transactions would have succeeded during an actual submission.",


I suppose this is true if all of the transaction groups would have succeeded? I think I'd just drop it, but I don't feel strongly.

It seems a bit strange that we have three levels of results being aggregated, Txn, TxnGroup, and Group of TxnGroups. At each level, we describe success a little differently. At the transaction level, we put it inside the txn-result, then at the group level, we aggregate it as a "failure-message", and then at the very top, we aggregate the other way, as "would-succeed". We could strive for more unity here, or we could skip the aggregation entirely.

The motivation behind "would-succeed" at this level was to provide a dead simple way of figuring out if your transactions would succeed if you sent them to the real submission endpoint. Yes, this is redundant information, since you could deduce the same thing by checking that no TxnGroup has a failure message and no Txn is missing signatures.

But in the future you may have to check more things, like no additional foreign resources were needed and you didn't exceed the standard opcode budget. True, these features would likely be opt-in so they're less likely to confuse people, but I still think there's a benefit to summarizing all the possible reasons for failure in a simple boolean.

I'm ok with this, but I'm actually so convinced that I think it should exist on individual txn-groups as well (we can add it in a later PR, since it will be an addition). The idea that there will be multiple ways that we might allow a group to "succeed" even though it would not actually succeed on chain is what convinces me. And I think there will be a good use for simulation in unit tests where you want to confirm that, say, txgroups 1, 2, 3 and 5 would succeed, but 4 would not. That is, the convenience you're offering is good at the txgroup level, because you may want some of the groups to fail (you are testing that they will, in fact) while confirming that the other groups are perfectly correct.

Those are good points. I'll admit I haven't given much thought to how multiple groups would be handled, but your ideas make sense.
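The aggregation discussed in this thread could be sketched roughly as follows. The types and field names here (`txnResult`, `groupResult`, `failureMessage`, `missingSignature`) are hypothetical stand-ins for the response fields, and `groupWouldSucceed` illustrates the per-group flag proposed above, which this PR does not add.

```go
package main

import "fmt"

// txnResult and groupResult are hypothetical, simplified response types.
type txnResult struct {
	missingSignature bool
}

type groupResult struct {
	failureMessage string // empty when evaluation did not fail
	txns           []txnResult
}

// groupWouldSucceed reports whether one group has no failure message and no
// transaction with a missing signature.
func groupWouldSucceed(g groupResult) bool {
	if g.failureMessage != "" {
		return false
	}
	for _, t := range g.txns {
		if t.missingSignature {
			return false
		}
	}
	return true
}

// wouldSucceed is the top-level summary: true only if every group would succeed.
func wouldSucceed(groups []groupResult) bool {
	for _, g := range groups {
		if !groupWouldSucceed(g) {
			return false
		}
	}
	return true
}

func main() {
	groups := []groupResult{
		{txns: []txnResult{{}, {}}},
		{failureMessage: "logic eval error", txns: []txnResult{{}}},
	}
	for i, g := range groups {
		fmt.Printf("group %d would succeed: %v\n", i, groupWouldSucceed(g))
	}
	fmt.Println("all would succeed:", wouldSucceed(groups))
}
```

A unit test that expects one group to fail could assert on the per-group value, while a client about to submit for real would only need the top-level boolean.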

jannotti

A few API nits, but I'm willing to approve either way you want to go.

jdtzmn added 30 commits July 20, 2022 17:18

Add simulate endpoint to OpenAPI file

8f840dc

Basic simulation endpoint with failure message

62f4b59

Return msgpack response instead of JSON

505bc50

Test Simulator class with basic pay transactions

58846a3

Use genesis methods for block header creation

e242bc4

Test simple group transaction

63805af

Trivial app create and call test

5d93ceb

Remove budget error gatekeeping for now

2c8c912

Dynamic budget checks in the evaluator complicate the gatekeeping. This functionality will be easier to add in the future after the `DebuggerHook` interface has been extended.

Add invalid signature check

b9f96ee

Lowercase simulate test methods to make them private

e52d1bf

Add balance change test

a9f4d1a

Merge branch 'feature/simulate-endpoint' into psuedo-eval-endpoint

8a54d2e

Implement newly included ledger methods introduced in the merge

301a366

Fix unkeyed struct fields

06d0063

Fix golint errors

0631333

Spec file changes

7f0b564

Use new flag for configuring simulation endpoint visibility

6e5c9ce

Check node status separately from transaction decoding

64567aa

Write interface annotations

fc76cb2

Add docstring to MakeLedgerForRound

760bbfb

Use error type to fix fragile signature error check

cc6e605

Fix signature check and add invalid signature test

c51926f

Test reject app call

deb330f

Disable transaction simulator by default

19cdb73

Fix API spec descriptions

462a9dd

Improve decodeTxGroup

b2c4f51

Clean up simulator ledger

6ec12f1

Add invalid transaction group test

b5473ae

This test confirms an error spotted by @jannotti where `errors.As(…)` was always true. algorand#4322 (comment)

Make SignatureError a struct

307709d

Cleanup simulator tests

7995700