From f4b53636f2492f6d4a625d8a94bed72f94c4b510 Mon Sep 17 00:00:00 2001
From: Ray Gao <qg2153@columbia.edu>
Date: Mon, 15 Jul 2024 21:34:55 -0400
Subject: [PATCH 01/10] Adjust documentation

---
 aggregator/README.md    | 176 +++++++++++++++++++++++++++++-----------
 aggregator/src/batch.rs |   3 -
 2 files changed, 129 insertions(+), 50 deletions(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index 26723acf36..0f5a0c7c33 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -1,20 +1,24 @@
 Proof Aggregation
 -----
 
-![Architecture](./figures/architecture.jpg)
+# Mechanism
+Aggregation allows larger amounts of data to be verified on-chain using fewer proofs.
+Currently, chunks (a list of continuous blocks) are first aggregated into a batch, then multiple batches are aggregated using a recursive scheme into a bundle.
+The `bundle` is the current apex entity that will be verified on-chain. 
+
+<!-- ![Architecture](./figures/architecture.jpg) -->
 # Params
 |param|meaning |
 |:---:|:---|
 |k | number of valid chunks|
-|n | max number of chunks per batch|
+|n | max number of chunks per batch (hard-coded)|
 |t | number of rounds for the final hash $\lceil32\times n/136\rceil$ |
 
-Currently `n` is hard coded to `10`.
 # Structs
 
 ## Chunk
 
-A __chunk__ is a list of continuous blocks. It consists of 5 hashes:
+A __chunk__ is a list of L2 `blocks` and will be proven by the `ChunkCircuit` (this is in fact the ZkEVM `SuperCircuit`). It consists of 5 hashes:
 - state root before this chunk
 - state root after this chunk
 - the withdraw root of this chunk
@@ -47,93 +51,112 @@ If $k< n$, $(n-k)$ padded chunks are padded to the list. A padded chunk has the
 
 ## Batch
 
-A __batch__ consists of continuous chunks of size `k`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`.
+A __batch__ is a list of `chunks` of size `k` that will be aggregated using the `BatchCircuit`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`. The batch is represented by the preimage fields to the `batch_hash`, which is constructed as:
+```
+batchHash := keccak256(version || batch_index || l1_message_popped || total_l1_message_popped || batch_data_hash || versioned_hash || parent_batch_hash || last_block_timestamp || z || y)
+```
+All preimage fields' values are provided to the batch through the `BatchHeader` struct, so it can correctly construct the hash state transition from `parent_batch_hash` to `batch_hash` (for current batch). 
+
+## BatchHeader
+The current schema for batch header is:
+
+|Field|Bytes|Type|Index|Comments|
+|:---|:---|:---|:---|:---|
+|version | 1 | uint8| 0| The batch version|
+|batchIndex | 8 | uint64| 1| The index of the batch|
+|l1MessagePopped | 8 | uint64| 9| Number of L1 messages popped in the batch|
+|totalL1MessagePopped | 8 | uint64| 17| Number of total L1 messages popped after the batch|
+|dataHash | 32 | bytes32| 25| The data hash of the batch|
+|blobVersionedHash | 32 | bytes32| 57| The versioned hash of the blob with this batch’s data|
+|parentBatchHash  | 32 | bytes32| 89| The parent batch hash|
+|lastBlockTimestamp | 8 | uint64| 121| The timestamp of the last block in this batch|
+|blobDataProof | 64 | bytes64| 129| The blob data proof: z (32), y (32)|
+
+## Continuous batches
+A list of continuous batches $b_1, \dots, b_k$ satisfy
+```
+b_i.batch_hash == b_{i+1}.parent_batch_hash
+```
+for $i \in [1, k-1]$.
+Unlike chunks aggregation, the last layer of recursive batch aggregation can accept an arbitrary number of batches. There's no explicit upper limit. Instead, the number of rounds of recursion can solely be defined by the latency target on L1 for those batches. As a result, continuous batches are never padded. 
+
+## Bundle
+A __bundle__ is a list of `batches` that will be aggregated recursively using the `RecursionCircuit`. The __bundle__ is the current apex entity whose proof will be verified on-chain.
 
 # Circuits
 
 ## Chunk circuit
 
-Circuit proving the relationship for a chunk is indeed the zkEVM circuit. It will go through 2 layers of compression circuit, and becomes a __snark__ struct. We do not list its details here. Abstractly, a snark circuit has the following properties:
+Circuit proving the relationship for a chunk is the zkEVM circuit. It will go through 2 layers of compression circuit, and becomes a __snark__ struct. We do not list its details here. Abstractly, a snark circuit has the following properties:
 - it takes 44 elements as public inputs 
     - 12 from accumulators
     - 32 from public input hash
 
+## Batch Circuit
 
-![Architecture](./figures/hashes.jpg)
-
-## Aggregation Circuit
-
-We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` padded chunks, and obtain a total of `n` snarks. 
-
-In the above example, we have `k = 2` valid chunks, and `2` padded chunks.
-
-The padded snarks are identical the the last valid snark, so the aggregator does not need to generate snarks for padded chunks.
+We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` padded chunks (by repeating the last non-padding chunk), and obtain a total of `n` snarks.
+Additionally, the batch circuit has to ascertain the correct hash transition from the parent batch. 
 
 ### Configuration
 
-There will be three configurations for Aggregation circuit.
-- FpConfig; used for snark aggregation
-- KeccakConfig: used to build keccak table
-- RlcConfig: used to compute RLC of hash inputs
+There are several configuration subcomponents for batch circuit.
+- FpConfig; used for snark aggregation.
+- KeccakConfig: used to build keccak table.
+- RlcConfig: used to compute RLC of hash inputs.
+- BlobDataConfig: used for representing the zstd-encoded form of batch data, with 4096 * 31 rows. Each row is a byte value.
+- BatchDataConfig: used for representing plain bytes as preimage for `batch_hash`.
+- DecoderConfig: used for describing how the blob data is decoded into plain bytes (batch data).
+- BarycentricEvaluationConfig: used for evaluate the interpolated blob polynomial at an arbitrary challenge point.
 
 ### Public Input
-The public input of the aggregation circuit consists of
+The public input of the batch circuit consists of
 - 12 elements from accumulator
-- 32 elements of `batch_pi_hash`
+- 2 elements of `parent_state_root` (split by hi/lo)
+- 2 elements of `parent_batch_hash`
+- 2 elements of `current_state_root`
+- 2 elements of `current_batch_hash`
+- 1 element of `chain_id`
+- 2 elements of `current_withdraw_root`
 
-### Statements
-For snarks $s_1,\dots,s_k,\dots, s_n$ the aggregation circuit argues the following statements.
-
-1. batch_data_hash digest is reused for public input hash. __Static__.
+Note that `parent_state_root` is the same as `chunk[0].prev_state_root` and `current_state_root` is the same as `chunk[k].post_state_root`. When these chunk fields are assigned into keccak preimages, their cells are constrained against the public input to ensure equality. If any public input appears in the preimage of the `batch_hash`, their corresponding assigned preimage cells will be equality contrained as well. 
 
-2. batch_pi_hash used same roots as chunk_pi_hash. __Static__.
-```
-batch_pi_hash   := keccak(chain_id || chunk_1.prev_state_root || chunk_n.post_state_root || chunk_n.withdraw_root || batch_data_hash || z || y || versioned_hash)
-```
-and `batch_pi_hash` matches public input.
+### Statements
+For snarks $s_1,\dots,s_k,\dots, s_n$ the batch circuit argues the following statements.
 
-3. batch_data_hash and chunk[i].pi_hash use a same chunk[i].data_hash when chunk[i] is not padded
+1. batch_data_hash digest is reused for batch hash. __Static__.
 
+2. batch_data_hash and chunk[i].pi_hash use a same chunk[i].data_hash when chunk[i] is not padded
 ```
 for i in 1 ... n
     chunk_pi_hash   := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash || chunk_txdata_hash)
 ```
-
 This is done by computing the RLCs of chunk[i]'s data_hash for `i=0..k`, and then check the RLC matches the one from the keccak table.
 
-4. chunks are continuous when they are not padded: they are linked via the state roots.
-
+3. chunks are continuous when they are not padded: they are linked via the state roots.
 ```
 for i in 1 ... k-1
     c_i.post_state_root == c_{i+1}.prev_state_root
 ```
-
-5. All the chunks use the same chain id. __Static__.
+4. All the chunks use the same chain id. __Static__.
 ```
 for i in 1 ... n
     batch.chain_id == chunk[i].chain_id
 ```
-
-6. The last `(n-k)` chunk[i] are padding
+5. The last `(n-k)` chunk[i] are padding
 ```
 for i in 1 ... n:
     if is_padding:
         chunk[i]'s chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells
 ```
 This is done via comparing the `data_rlc` of `chunk_{i-1}` and ` chunk_{i}`.
-7. the hash input length is correct
-- hashes[0] has 200 bytes
-- hashes[1..N_SNARKS+1] has 168 bytes input
-- batch's data_hash length is 32 * number_of_valid_snarks
-8. batch data hash is correct w.r.t. its RLCs
-9. is_final_cells are set correctly
+6. the hash input length is correct
+- hashes[0] has 193 bytes (`batch_hash` preimage)
+- hashes[1..N_SNARKS+1] has 168 bytes input (`chunk_pi_hash` preimages)
+- batch's data_hash length is 32 * number_of_valid_snarks (`batch_data_hash` preimage)
 
 ### Handling dynamic inputs
-
-
 ![Dynamic_inputs](./figures/hash_table.jpg)
 
-
 Our keccak table uses $2^{19}$ rows. Each keccak round takes `300` rows. When the number of round is less than $2^{19}/300$, the cell manager will fill in the rest of the rows with dummy hashes. 
 
 The only hash that uses a dynamic number of rounds is the last hash. 
@@ -158,3 +181,62 @@ For the output of the final data hash
 
 Additional checks for dummy chunk
 - if `is_padding` for `i`-th chunk, we constrain `chunk[i]'s chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells`
+
+## Recursion Circuit
+
+`RecursionCircuit` aggregates $N$ SNARKs from a generic circuit (called `AppCircuit`). It achieves this aggregation by repeatedly combine each `AppCircuit`'s SNARK with a SNARK generated from last round of aggregation (hence the name `recursion`). In each round of recursion, the Recursion circuit verifies a SNARK from the `AppCircuit` and its SNARK from the “previous” round. For the first round of aggregation, a dummy SNARK is generated to combine with the first `AppCircuit` SNARK. Essentially, we have:
+
+$RC_{Snark}(N) \colonequals verify(App_{Snark}(N)) \bigvee verify(RC_{Snark}(N-1))$
+
+where $RC$ indicates the Recursion circuit.
+With `accumulator_indices` , Recursion Circuit can merge `AppCircuit` ’s accumulator, in case `AppCircuit` itself is also a Batch Circuit.
+
+### StateTransition Trait
+The `AppCircuit` must follow a layout that `RecursionCircuit` accepts. The layout is described in the `StateTransition` trait which describes a data which can transition from prev state to current state, with methods clearly indicating the indices of accumulators, states (prev and post) and additional exported PI fields. 
+
+```rust
+pub trait StateTransition: Sized {
+    type Input: Clone;
+    type Circuit: CircuitExt<Fr>;
+
+    // Describes the state transition
+    fn state_transition(&self, round: usize) -> Self::Input;
+
+    // Count of the fields used to represent state. The public input consists of twice
+    // this number as both the previous and current states are included in the public input.
+    fn num_transition_instance() -> usize
+
+    // Other counts of instance variables
+    fn num_additional_instance() -> usize
+    fn num_instance() -> usize
+    fn num_accumulator_instance() -> usize
+
+    // Location of accumulator, state variables and additional exported PIs
+    fn accumulator_indices() -> Vec<usize>
+    fn state_prev_indices() -> Vec<usize>
+    fn state_indices() -> Vec<usize>
+    fn additional_indices() -> Vec<usize>
+}
+```
+
+### Public Inputs
+All parts of $PI$ in `AppCircuit` is also put into the $PI$ of recursion circuit, the recursion circuit has a single column of $PI$ with following layout:
+
+```markdown
+`accumulator` | `preprocessed_digest` | `init_states` | `final_states` | `round`
+```
+
+- `accumulator` accumulates all of the accumulators from the $N$ $snark_{app}$, all the accumulators exported from the $PI$ of these snarks (if there is any), and accumulators generated by the $N$ steps verification of snarks from recursion circuit.
+- `preprocessed_digest` represents the Recursion Circuit itself. There would be an unique value for every recursion circuit which can bundle (any number of) snarks from specified `AppCircuit`
+- `init_states` represent the initial state $S_0$.
+- `final_states` represent the final state, along with the exported $PI$ from $S_N$.
+- `round` represents the number of batches being bundled recursively, i.e. $N$.
+
+### Statements
+To verify the $k_{th}$ snark, we have 3 PIs from the current circuit, the snark of $k_{th}$ `AppCircuit` , and the snark of $(k-1)_{th}$ recursion circuit respectively. We named it $PI$, $PI_{app}$ and $PI_{prev}$. We have following equality constraints for them:
+
+- if $N > 0$, $PI(preprocessed\_digest) = PI_{prev}(preprocessed\_digest)$: ensure the snark for “previous recursion circuit” is the same circuit of current one
+- if $N > 0$, $PI(round) = PI_{prev}(round) + 1$: ensure the round number is increment so the first snark from app circuit has round = 0
+- $PI_{app}(final\_states) = PI(final\_states)$: transparent pass the PI to app circuit
+- if $N > 0$, $PI(init\_states) = PI_{prev}(init\_states)$, else $PI(init\_states) = PI_{app}(init\_states)$c: propagate the init state, and for first recursion, the init state part of PI is passed to app circuit
+- $PI_{app}(init\_states) = PI_{prev}(final\_states)$: the init state part of PI for app circuit must be “chained” with previous recursion round
\ No newline at end of file
diff --git a/aggregator/src/batch.rs b/aggregator/src/batch.rs
index bcfebe39d2..046a71c6e2 100644
--- a/aggregator/src/batch.rs
+++ b/aggregator/src/batch.rs
@@ -131,9 +131,6 @@ impl<const N_SNARKS: usize> BatchHeader<N_SNARKS> {
 /// - the first k chunks are from real traces
 /// - the last (#N_SNARKS-k) chunks are from empty traces
 /// A BatchHash consists of 2 hashes.
-/// - batch_pi_hash   := keccak(chain_id || chunk_0.prev_state_root || chunk_k-1.post_state_root ||
-///   chunk_k-1.withdraw_root || batch_data_hash || z || y || versioned_hash)
-///
 /// - batchHash := keccak256(version || batch_index || l1_message_popped || total_l1_message_popped ||
 ///   batch_data_hash || versioned_hash || parent_batch_hash || last_block_timestamp || z || y)
 /// - batch_data_hash := keccak(chunk_0.data_hash || ... || chunk_k-1.data_hash)

From 27ec9ea844f88b4f6710d82ee71ebadf776703b4 Mon Sep 17 00:00:00 2001
From: Ray Gao <qg2153@columbia.edu>
Date: Mon, 15 Jul 2024 22:00:44 -0400
Subject: [PATCH 02/10] Fix typo

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index 0f5a0c7c33..9c89d2ba59 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -118,7 +118,7 @@ The public input of the batch circuit consists of
 - 1 element of `chain_id`
 - 2 elements of `current_withdraw_root`
 
-Note that `parent_state_root` is the same as `chunk[0].prev_state_root` and `current_state_root` is the same as `chunk[k].post_state_root`. When these chunk fields are assigned into keccak preimages, their cells are constrained against the public input to ensure equality. If any public input appears in the preimage of the `batch_hash`, their corresponding assigned preimage cells will be equality contrained as well. 
+Note that `parent_state_root` is the same as `chunk[0].prev_state_root` and `current_state_root` is the same as `chunk[k].post_state_root`. When these chunk fields are assigned into keccak preimages, their cells are constrained against the public input to ensure equality. If any public input appears in the preimage of the `batch_hash`, their corresponding assigned preimage cells will be equality constrained as well. 
 
 ### Statements
 For snarks $s_1,\dots,s_k,\dots, s_n$ the batch circuit argues the following statements.

From 091aeb2965ca329b63e567d24ea200cc05247449 Mon Sep 17 00:00:00 2001
From: Ho <noel.wei@gmail.com>
Date: Tue, 16 Jul 2024 16:47:00 +0900
Subject: [PATCH 03/10] Update README.md

To emphase the "continuous" in aggregated data
---
 aggregator/README.md | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index 9c89d2ba59..ea503ab3be 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -51,12 +51,18 @@ If $k< n$, $(n-k)$ padded chunks are padded to the list. A padded chunk has the
 
 ## Batch
 
-A __batch__ is a list of `chunks` of size `k` that will be aggregated using the `BatchCircuit`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`. The batch is represented by the preimage fields to the `batch_hash`, which is constructed as:
+A __batch__ is a list of continuous `chunks` of size `k` that will be aggregated using the `BatchCircuit`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`. The batch is represented by the preimage fields to the `batch_hash`, which is constructed as:
 ```
 batchHash := keccak256(version || batch_index || l1_message_popped || total_l1_message_popped || batch_data_hash || versioned_hash || parent_batch_hash || last_block_timestamp || z || y)
 ```
 All preimage fields' values are provided to the batch through the `BatchHeader` struct, so it can correctly construct the hash state transition from `parent_batch_hash` to `batch_hash` (for current batch). 
 
+Note there are also implicitly represents of state roots before/after batch from the states of chunks it has aggregated:
+```
+prev_state_root := c_0.prev_state_root
+post_state_root := c_k.post_state_root
+```
+
 ## BatchHeader
 The current schema for batch header is:
 
@@ -75,13 +81,14 @@ The current schema for batch header is:
 ## Continuous batches
 A list of continuous batches $b_1, \dots, b_k$ satisfy
 ```
-b_i.batch_hash == b_{i+1}.parent_batch_hash
+b_i.batch_hash == b_{i+1}.parent_batch_hash AND
+b_i.post_state_root == b_{i+1}.prev_state_root
 ```
 for $i \in [1, k-1]$.
 Unlike chunks aggregation, the last layer of recursive batch aggregation can accept an arbitrary number of batches. There's no explicit upper limit. Instead, the number of rounds of recursion can solely be defined by the latency target on L1 for those batches. As a result, continuous batches are never padded. 
 
 ## Bundle
-A __bundle__ is a list of `batches` that will be aggregated recursively using the `RecursionCircuit`. The __bundle__ is the current apex entity whose proof will be verified on-chain.
+A __bundle__ is a list of continuous `batches` that will be aggregated recursively using the `RecursionCircuit`. The __bundle__ is the current apex entity whose proof will be verified on-chain.
 
 # Circuits
 
@@ -239,4 +246,4 @@ To verify the $k_{th}$ snark, we have 3 PIs from the current circuit, the snark
 - if $N > 0$, $PI(round) = PI_{prev}(round) + 1$: ensure the round number is increment so the first snark from app circuit has round = 0
 - $PI_{app}(final\_states) = PI(final\_states)$: transparent pass the PI to app circuit
 - if $N > 0$, $PI(init\_states) = PI_{prev}(init\_states)$, else $PI(init\_states) = PI_{app}(init\_states)$c: propagate the init state, and for first recursion, the init state part of PI is passed to app circuit
-- $PI_{app}(init\_states) = PI_{prev}(final\_states)$: the init state part of PI for app circuit must be “chained” with previous recursion round
\ No newline at end of file
+- $PI_{app}(init\_states) = PI_{prev}(final\_states)$: the init state part of PI for app circuit must be “chained” with previous recursion round

From 22b5c2176eee1201680028bd37dd946a24e14831 Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 10:53:08 +0100
Subject: [PATCH 04/10] doc: more inline documentation for RecursionCircuit

---
 aggregator/src/recursion/circuit.rs | 128 +++++++++++++++++++++-------
 1 file changed, 96 insertions(+), 32 deletions(-)

diff --git a/aggregator/src/recursion/circuit.rs b/aggregator/src/recursion/circuit.rs
index 7cb7cb8a9e..5ae92c0f7b 100644
--- a/aggregator/src/recursion/circuit.rs
+++ b/aggregator/src/recursion/circuit.rs
@@ -25,14 +25,21 @@ use sv_halo2_base::{
     QuantumCell::Existing,
 };
 
-use crate::param::ConfigParams as BatchCircuitConfigParams;
+use crate::param::ConfigParams as RecursionCircuitConfigParams;
 
 use super::*;
 
+/// Convenience type to represent the verifying key.
 type Svk = KzgSuccinctVerifyingKey<G1Affine>;
+
+/// Convenience type to represent the polynomial commitment scheme.
 type Pcs = Kzg<Bn256, Bdfg21>;
+
+/// Convenience type to represent the accumulation scheme for accumulating proofs from multiple
+/// SNARKs.
 type As = KzgAs<Pcs>;
 
+/// Select condition ? LHS : RHS.
 fn select_accumulator<'a>(
     loader: &Rc<Halo2Loader<'a>>,
     condition: &AssignedValue<Fr>,
@@ -56,6 +63,7 @@ fn select_accumulator<'a>(
     ))
 }
 
+/// Accumulate a value into the current accumulator.
 fn accumulate<'a>(
     loader: &Rc<Halo2Loader<'a>>,
     accumulators: Vec<KzgAccumulator<G1Affine, Rc<Halo2Loader<'a>>>>,
@@ -68,21 +76,39 @@ fn accumulate<'a>(
 
 #[derive(Clone)]
 pub struct RecursionCircuit<ST> {
+    /// The verifying key for the circuit.
     svk: Svk,
+    /// The default accumulator to initialise the circuit.
     default_accumulator: KzgAccumulator<G1Affine, NativeLoader>,
+    /// The SNARK witness from the k-th BatchCircuit.
     app: SnarkWitness,
+    /// The SNARK witness from the (k-1)-th BatchCircuit.
     previous: SnarkWitness,
+    /// The recursion round, starting at round=0 and incrementing at every subsequent recursion.
     round: usize,
+    /// The public inputs to the RecursionCircuit itself.
     instances: Vec<Fr>,
+    /// The accumulation of the SNARK proofs recursed over thus far.
     as_proof: Value<Vec<u8>>,
-    app_is_aggregation: bool,
+
     _marker: PhantomData<ST>,
 }
 
 impl<ST: StateTransition> RecursionCircuit<ST> {
+    /// The index of the preprocessed digest in the [`RecursionCircuit`]'s instances. Note that we
+    /// need a single cell to hold this value as it is a poseidon hash over the bn256 curve, hence
+    /// it fits within an [`Fr`] cell.
+    ///
+    /// [`Fr`]: halo2_proofs::halo2curves::bn256::Fr
     const PREPROCESSED_DIGEST_ROW: usize = 4 * LIMBS;
+
+    /// The index within the instances to find the "initial" state in the state transition.
     const INITIAL_STATE_ROW: usize = Self::PREPROCESSED_DIGEST_ROW + 1;
 
+    /// Construct a new instance of the [`RecursionCircuit`] given the SNARKs from the current and
+    /// previous [`BatchCircuit`], and the recursion round.
+    ///
+    /// [`BatchCircuit`]: aggregator::BatchCircuit
     pub fn new(
         params: &ParamsKZG<Bn256>,
         app: Snark,
@@ -180,7 +206,6 @@ impl<ST: StateTransition> RecursionCircuit<ST> {
             round,
             instances,
             as_proof: Value::known(as_proof),
-            app_is_aggregation: true,
             _marker: Default::default(),
         }
     }
@@ -206,7 +231,14 @@ impl<ST: StateTransition> RecursionCircuit<ST> {
 
     /// Returns the number of instance cells in the Recursion Circuit, help to refine the CircuitExt trait
     pub fn num_instance_fixed() -> usize {
-        // [..lhs, ..rhs, preprocessed_digest, initial_state, state, round]
+        // [
+        //     ..lhs (accumulator LHS),
+        //     ..rhs (accumulator RHS),
+        //     preprocessed_digest,
+        //     initial_state,
+        //     state,
+        //     round
+        // ]
         4 * LIMBS + 2 * ST::num_transition_instance() + ST::num_additional_instance() + 2
     }
 }
@@ -225,14 +257,13 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
             instances: self.instances.clone(),
             as_proof: Value::unknown(),
             _marker: Default::default(),
-            app_is_aggregation: self.app_is_aggregation,
         }
     }
 
     fn configure(meta: &mut ConstraintSystem<Fr>) -> Self::Config {
         let path = std::env::var("BUNDLE_CONFIG")
             .unwrap_or_else(|_| "configs/bundle_circuit.config".to_owned());
-        let params: BatchCircuitConfigParams = serde_json::from_reader(
+        let params: RecursionCircuitConfigParams = serde_json::from_reader(
             File::open(path.as_str()).unwrap_or_else(|err| panic!("{err:?}")),
         )
         .unwrap();
@@ -251,7 +282,7 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
 
         let mut first_pass = halo2_base::SKIP_FIRST_PASS; // assume using simple floor planner
         let assigned_instances = layouter.assign_region(
-            || "",
+            || "recursion circuit",
             |region| -> Result<Vec<Cell>, Error> {
                 if first_pass {
                     first_pass = false;
@@ -266,20 +297,29 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     },
                 );
 
-                let init_state_row_beg = Self::INITIAL_STATE_ROW;
-                let state_row_beg = init_state_row_beg + ST::num_transition_instance();
-                let addition_state_beg = state_row_beg + ST::num_transition_instance();
-                let round_row = addition_state_beg + ST::num_additional_instance();
+                // The index of the "initial state", i.e. the state last finalised on L1.
+                let index_init_state = Self::INITIAL_STATE_ROW;
+                // The index of the "state", i.e. the state achieved post the current batch.
+                let index_state = index_init_state + ST::num_transition_instance();
+                // The index where the "additional" fields required to define the state are
+                // present.
+                let index_additional_state = index_state + ST::num_transition_instance();
+                // The index to find the "round" of recursion in the current instance of the
+                // Recursion Circuit.
+                let index_round = index_additional_state + ST::num_additional_instance();
+
                 log::debug!(
-                    "state position: init {}|cur {}|add {}",
-                    state_row_beg,
-                    addition_state_beg,
-                    round_row
+                    "indices within instances: init {} |cur {} | add {} | round {}",
+                    index_init_state,
+                    index_state,
+                    index_additional_state,
+                    index_round,
                 );
 
+                // Get the field elements representing the "preprocessed digest" and "recursion round".
                 let [preprocessed_digest, round] = [
                     self.instances[Self::PREPROCESSED_DIGEST_ROW],
-                    self.instances[round_row],
+                    self.instances[index_round],
                 ]
                 .map(|instance| {
                     main_gate
@@ -287,7 +327,8 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                         .unwrap()
                 });
 
-                let initial_state = self.instances[init_state_row_beg..state_row_beg]
+                // Get the field elements representing the "initial state"
+                let initial_state = self.instances[index_init_state..index_state]
                     .iter()
                     .map(|&instance| {
                         main_gate
@@ -296,7 +337,9 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     })
                     .collect::<Vec<_>>();
 
-                let state = self.instances[state_row_beg..round_row]
+                // Get the field elements representing the "state" post batch. This includes the
+                // additional state fields as well.
+                let state = self.instances[index_state..index_round]
                     .iter()
                     .map(|&instance| {
                         main_gate
@@ -305,6 +348,7 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     })
                     .collect::<Vec<_>>();
 
+                // Whether or not we are in the first round of recursion.
                 let first_round = main_gate.is_zero(&mut ctx, &round);
                 let not_first_round = main_gate.not(&mut ctx, Existing(first_round));
 
@@ -318,6 +362,8 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     Some(preprocessed_digest),
                 );
 
+                // Choose between the default accumulator or the previous accumulator depending on
+                // whether or not we are in the first round of recursion.
                 let default_accumulator = self.load_default_accumulator(&loader)?;
                 let previous_accumulators = previous_accumulators
                     .iter()
@@ -331,6 +377,8 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     })
                     .collect::<Result<Vec<_>, Error>>()?;
 
+                // Accumulate the accumulators over the previous accumulators, to compute the
+                // accumulator values for this instance of the Recursion Circuit.
                 let KzgAccumulator { lhs, rhs } = accumulate(
                     &loader,
                     [app_accumulators, previous_accumulators].concat(),
@@ -343,9 +391,15 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                 let previous_instances = previous_instances.pop().unwrap();
 
                 let mut ctx = loader.ctx_mut();
+
+                //////////////////////////////////////////////////////////////////////////////////
+                /////////////////////////////// CONSTRAINTS //////////////////////////////////////
+                //////////////////////////////////////////////////////////////////////////////////
+
+                // Propagate the "initial state"
                 let initial_state_propagate = initial_state
                     .iter()
-                    .zip_eq(previous_instances[init_state_row_beg..state_row_beg].iter())
+                    .zip_eq(previous_instances[index_init_state..index_state].iter())
                     .zip_eq(
                         ST::state_prev_indices()
                             .into_iter()
@@ -353,21 +407,23 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     )
                     .flat_map(|((&st, &previous_st), &app_inst)| {
                         [
-                            // Propagate initial_state
-                            (
-                                main_gate.mul(&mut ctx, Existing(st), Existing(not_first_round)),
-                                previous_st,
-                            ),
-                            // Verify initial_state is same as the first application snark
+                            // Verify initial_state is same as the first application snark in the
+                            // first round of recursion.
                             (
                                 main_gate.mul(&mut ctx, Existing(st), Existing(first_round)),
                                 main_gate.mul(&mut ctx, Existing(app_inst), Existing(first_round)),
                             ),
+                            // Propagate initial_state for subsequent rounds of recursion.
+                            (
+                                main_gate.mul(&mut ctx, Existing(st), Existing(not_first_round)),
+                                previous_st,
+                            ),
                         ]
                     })
                     .collect::<Vec<_>>();
 
-                // Verify current state is same as the current application snark
+                // Verify that the current "state" is the same as the state defined in the
+                // application SNARK.
                 let verify_app_state = state
                     .iter()
                     .zip_eq(
@@ -383,8 +439,10 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     .map(|(&st, &app_inst)| (st, app_inst))
                     .collect::<Vec<_>>();
 
-                // Verify previous state (additional state not included) is same as the current application snark
-                let verify_app_init_state = previous_instances[state_row_beg..addition_state_beg]
+                // Verify that the "previous state" (additional state not included) is the same
+                // as the previous state defined in the current application SNARK. This check is
+                // meaningful only in subsequent recursion rounds after the first round.
+                let verify_app_init_state = previous_instances[index_state..index_additional_state]
                     .iter()
                     .zip_eq(
                         ST::state_prev_indices()
@@ -399,8 +457,10 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     })
                     .collect::<Vec<_>>();
 
+                // Finally apply the equality constraints between the (LHS, RHS) values constructed
+                // above.
                 for (lhs, rhs) in [
-                    // Propagate preprocessed_digest
+                    // Propagate the preprocessed digest.
                     (
                         main_gate.mul(
                             &mut ctx,
@@ -409,13 +469,13 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                         ),
                         previous_instances[Self::PREPROCESSED_DIGEST_ROW],
                     ),
-                    // Verify round is increased by 1 when not at first round
+                    // Verify that "round" increments by 1 when not the first round of recursion.
                     (
                         round,
                         main_gate.add(
                             &mut ctx,
                             Existing(not_first_round),
-                            Existing(previous_instances[round_row]),
+                            Existing(previous_instances[index_round]),
                         ),
                     ),
                 ]
@@ -427,13 +487,15 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
                     ctx.region.constrain_equal(lhs.cell(), rhs.cell())?;
                 }
 
-                // IMPORTANT:
+                // Mark the end of this phase.
                 config.base_field_config.finalize(&mut ctx);
+
                 #[cfg(feature = "display")]
                 dbg!(ctx.total_advice);
                 #[cfg(feature = "display")]
                 println!("Advice columns used: {}", ctx.advice_alloc[0][0].0 + 1);
 
+                // Return the computed instance cells for this Recursion Circuit.
                 Ok([lhs.x(), lhs.y(), rhs.x(), rhs.y()]
                     .into_iter()
                     .flat_map(|coordinate| coordinate.limbs())
@@ -447,6 +509,8 @@ impl<ST: StateTransition> Circuit<Fr> for RecursionCircuit<ST> {
         )?;
 
         assert_eq!(assigned_instances.len(), self.num_instance()[0]);
+
+        // Ensure that the computed instances are in fact the instances for this circuit.
         for (row, limb) in assigned_instances.into_iter().enumerate() {
             layouter.constrain_instance(limb, config.instance, row)?;
         }

From df0af78b8ef62ed1623fd4e16969cecfd7902fa4 Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 11:09:46 +0100
Subject: [PATCH 05/10] rephrase comment

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index ea503ab3be..f83e81c32c 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -57,7 +57,7 @@ batchHash := keccak256(version || batch_index || l1_message_popped || total_l1_m
 ```
 All preimage fields' values are provided to the batch through the `BatchHeader` struct, so it can correctly construct the hash state transition from `parent_batch_hash` to `batch_hash` (for current batch). 
 
-Note there are also implicitly represents of state roots before/after batch from the states of chunks it has aggregated:
+Note that there are also implicit constraints between state roots before/after batch and the state roots of the chunks it has aggregated:
 ```
 prev_state_root := c_0.prev_state_root
 post_state_root := c_k.post_state_root

From bb4a71fdc0e65d8f69e65b00c5dd9b08d601164c Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 11:09:58 +0100
Subject: [PATCH 06/10] Add more context to comment

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index f83e81c32c..30269b6192 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -110,7 +110,7 @@ There are several configuration subcomponents for batch circuit.
 - FpConfig; used for snark aggregation.
 - KeccakConfig: used to build keccak table.
 - RlcConfig: used to compute RLC of hash inputs.
-- BlobDataConfig: used for representing the zstd-encoded form of batch data, with 4096 * 31 rows. Each row is a byte value.
+- BlobDataConfig: used for representing the zstd-encoded form of batch data, with `4096 * 31` rows. Each row is a byte value. An EIP-4844 blob consists of `4096 * 32` bytes, where we set the most-significant byte in each 32-bytes chunk as `0` to guarantee that each 32-bytes chunk is a valid BLS12-381 scalar field element.
 - BatchDataConfig: used for representing plain bytes as preimage for `batch_hash`.
 - DecoderConfig: used for describing how the blob data is decoded into plain bytes (batch data).
 - BarycentricEvaluationConfig: used for evaluate the interpolated blob polynomial at an arbitrary challenge point.

From a203456ad5f024aa81623a8fa7897bc56fda9690 Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 11:10:09 +0100
Subject: [PATCH 07/10] Add more context to comment

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index 30269b6192..eb22e3ce15 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -111,7 +111,7 @@ There are several configuration subcomponents for batch circuit.
 - KeccakConfig: used to build keccak table.
 - RlcConfig: used to compute RLC of hash inputs.
 - BlobDataConfig: used for representing the zstd-encoded form of batch data, with `4096 * 31` rows. Each row is a byte value. An EIP-4844 blob consists of `4096 * 32` bytes, where we set the most-significant byte in each 32-bytes chunk as `0` to guarantee that each 32-bytes chunk is a valid BLS12-381 scalar field element.
-- BatchDataConfig: used for representing plain bytes as preimage for `batch_hash`.
+- BatchDataConfig: used for representing the raw batch bytes, effectively constructing the random challenge point `z` for the KZG opening proof.
 - DecoderConfig: used for describing how the blob data is decoded into plain bytes (batch data).
 - BarycentricEvaluationConfig: used for evaluate the interpolated blob polynomial at an arbitrary challenge point.
 

From 088c4161b8d48f57ecab32e289a445a13e2e05dd Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 11:10:18 +0100
Subject: [PATCH 08/10] rephrase comment

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index eb22e3ce15..733f3f5a01 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -112,7 +112,7 @@ There are several configuration subcomponents for batch circuit.
 - RlcConfig: used to compute RLC of hash inputs.
 - BlobDataConfig: used for representing the zstd-encoded form of batch data, with `4096 * 31` rows. Each row is a byte value. An EIP-4844 blob consists of `4096 * 32` bytes, where we set the most-significant byte in each 32-bytes chunk as `0` to guarantee that each 32-bytes chunk is a valid BLS12-381 scalar field element.
 - BatchDataConfig: used for representing the raw batch bytes, effectively constructing the random challenge point `z` for the KZG opening proof.
-- DecoderConfig: used for describing how the blob data is decoded into plain bytes (batch data).
+- DecoderConfig: implements an in-circuit zstd-decoder that decodes blob data into batch data
 - BarycentricEvaluationConfig: used for evaluate the interpolated blob polynomial at an arbitrary challenge point.
 
 ### Public Input

From aae22b608ac5db373e654ac745efa560d9a07f25 Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 11:10:37 +0100
Subject: [PATCH 09/10] rephrase comment and add more context

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index 733f3f5a01..d57c76dd24 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -113,7 +113,7 @@ There are several configuration subcomponents for batch circuit.
 - BlobDataConfig: used for representing the zstd-encoded form of batch data, with `4096 * 31` rows. Each row is a byte value. An EIP-4844 blob consists of `4096 * 32` bytes, where we set the most-significant byte in each 32-bytes chunk as `0` to guarantee that each 32-bytes chunk is a valid BLS12-381 scalar field element.
 - BatchDataConfig: used for representing the raw batch bytes, effectively constructing the random challenge point `z` for the KZG opening proof.
 - DecoderConfig: implements an in-circuit zstd-decoder that decodes blob data into batch data
-- BarycentricEvaluationConfig: used for evaluate the interpolated blob polynomial at an arbitrary challenge point.
+- BarycentricEvaluationConfig: used for evaluating the interpolated blob polynomial at an arbitrary challenge point `z`, where both `z` and the evaluation `y` are included in the `BatchHeader`.
 
 ### Public Input
 The public input of the batch circuit consists of

From 18a7f5ad496c25a709cd2a352fae091e4ad8ef04 Mon Sep 17 00:00:00 2001
From: Rohit Narurkar <rohit.narurkar@proton.me>
Date: Tue, 16 Jul 2024 11:10:48 +0100
Subject: [PATCH 10/10] nit-pick

---
 aggregator/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/aggregator/README.md b/aggregator/README.md
index d57c76dd24..1dc187e302 100644
--- a/aggregator/README.md
+++ b/aggregator/README.md
@@ -149,7 +149,7 @@ for i in 1 ... k-1
 for i in 1 ... n
     batch.chain_id == chunk[i].chain_id
 ```
-5. The last `(n-k)` chunk[i] are padding
+5. The last `(n-k)` chunk[i] are padded chunks
 ```
 for i in 1 ... n:
     if is_padding: