From 4cecbbb196c4c6bb34f64c43001a41f38834479f Mon Sep 17 00:00:00 2001
From: Akshay Dahiya
Date: Fri, 28 Jul 2023 12:52:50 +0530
Subject: [PATCH] chore: update README

---
 README.md | 353 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 212 insertions(+), 141 deletions(-)

diff --git a/README.md b/README.md
index 2ac6bc6c..197bba7f 100644
--- a/README.md
+++ b/README.md
@@ -4,74 +4,68 @@
 - [Setup](#setup)
 - [State transitions and data composition](#state-transitions-and-data-composition)
   - [Epoch Generation](#epoch-generation)
+  - [Preloading](#preloading)
   - [Base Snapshot Generation](#base-snapshot-generation)
   - [Snapshot Finalization](#snapshot-finalization)
-  - [Aggregation and data composition - snapshot generation of higher order datapoints on base snapshots](#aggregation-and-data-composition---snapshot-generation-of-higher-order-datapoints-on-base-snapshots)
+  - [Aggregation and data composition](#aggregation-and-data-composition---snapshot-generation-of-higher-order-data-points-on-base-snapshots)
+- [Major Components](#major-components)
+  - [System Event Detector](#system-event-detector)
+  - [Process Hub Core](#process-hub-core)
+  - [Processor Distributor](#processor-distributor)
+  - [Delegation Workers for preloaders](#delegation-workers-for-preloaders)
+  - [Callback Workers](#callback-workers)
+  - [RPC Helper](#rpc-helper)
+  - [Core API](#core-api)
 - [Development Instructions](#development-instructions)
   - [Configuration](#configuration)
 - [Monitoring and Debugging](#monitoring-and-debugging)
 - [For Contributors](#for-contributors)
+- [Pooler: Case study and extending this implementation](#pooler-case-study-and-extending-this-implementation)
   - [Extending pooler with a Uniswap v2 data point](#extending-pooler-with-a-uniswap-v2-data-point)
     - [Step 1. Review: Base snapshot extraction logic for trade information](#step-1-review-base-snapshot-extraction-logic-for-trade-information)
     - [Step 2. Review: 24 hour aggregate of trade volume snapshots over a single pair contract](#step-2-review-24-hour-aggregate-of-trade-volume-snapshots-over-a-single-pair-contract)
    - [Step 3. New Datapoint: 2 hours aggregate of only swap events](#step-3-new-datapoint-2-hours-aggregate-of-only-swap-events)
-- [Major Components](#major-components)
-  - [System Event Detector](#system-event-detector)
-  - [Process Hub Core](#process-hub-core)
-  - [Processor Distributor](#processor-distributor)
-  - [Callback Workers](#callback-workers)
-  - [RPC Helper](#rpc-helper)
-  - [Core API](#core-api)
 - [Find us](#find-us)

 ## Overview

-![Pooler workflow](pooler/static/docs/assets/OverallArchitecture.png)
-
+![Snapshotter workflow](snapshotter/static/docs/assets/OverallArchitecture.png)

-Pooler is a Uniswap specific implementation of what is known as a 'snapshotter' in the PowerLoom Protocol ecosystem. It synchronizes with other snapshotter peers over a smart contract running on the present version of the PowerLoom Protocol testnet. It follows an architecture that is driven by state transitions which makes it easy to understand and modify. This present release ultimately provide access to rich aggregates that can power a Uniswap v2 dashboard with the following data points:
+A snapshotter peer, as part of the Powerloom Protocol, does exactly what the name suggests: it synchronizes with other snapshotter peers over a smart contract running on the present version of the PowerLoom Protocol testnet. It follows an architecture that is driven by state transitions, which makes it easy to understand and modify.
-- Total Value Locked (TVL)
-- Trade Volume, Liquidity reserves, Fees earned
-  - grouped by
-    - Pair contracts
-    - Individual tokens participating in pair contract
-  - aggregated over time periods
-    - 24 hours
-    - 7 days
-- Transactions containing `Swap`, `Mint`, and `Burn` events
+Because of its decentralized nature, the snapshotter specification and its implementations share some powerful features that can adapt to your specific information requirements on blockchain applications:

-
-
-In its [last release](https://github.com/PowerLoom/pooler/releases/tag/v0.1.3-alpha), pooler was a *self contained* system that would provide access to the above. The present implementation differs from that in some major ways:
-
-* each data point is calculated, updated and synchronized with other snapshotter peers participating in the network for this Uniswap v2 use case
-* synchronization of data points is defined as a function of an epoch ID(entifier) where epoch refers to an equally spaced collection of blocks on the data source smart contract's chain (Ethereum mainnet in case of Uniswap v2). This simplifies building of use cases that are stateful (i.e. can be accessed according to their state at a given height of the data source chain), synchronized and depend on reliable data. For example,
-    * dashboards by offering higher order aggregate datapoints
+* each data point is calculated, updated, and synchronized with other snapshotter peers participating in the network
+* synchronization of data points is defined as a function of an epoch ID (identifier), where an epoch refers to an equally spaced collection of blocks on the data source blockchain (e.g., Ethereum Mainnet, Polygon Mainnet, or the Polygon Mumbai testnet). This simplifies the building of use cases that are stateful (i.e., can be accessed according to their state at a given height of the data source chain), synchronized, and dependent on reliable data. For example,
+    * dashboards by offering higher-order aggregate data points
    * trading strategies and bots
-* a snapshotter peer can now load past epochs, indexes and aggregates from a decentralized state and have access to a rich history of data
-    * earlier a stand alone node would have differing aggregate datapoints from other such nodes running independently, and even though the datasets were decentralized on IPFS/Filecoin, the power of these decentralized storage networks could not be leveraged fully since the content identifiers would often be different without any coordinating pivot like a decentralized state or smart contract on which all peers had to agree.
+* a snapshotter peer can load past epochs, indexes, and aggregates from a decentralized state and have access to a rich history of data
+    * all the datasets are decentralized on IPFS/Filecoin
+    * the power of these decentralized storage networks can be leveraged fully by applying the [principle of composability](#aggregation-and-data-composition---snapshot-generation-of-higher-order-data-points-on-base-snapshots)

 ## Setup

-Pooler is part of a distributed system with multiple moving parts. The easiest way to get started is by using the docker-based setup from the [deploy](https://github.com/PowerLoom/deploy) repository.
+The snapshotter is a distributed system with multiple moving parts. The easiest way to get started is by using the Docker-based setup from the [deploy](https://github.com/PowerLoom/deploy) repository.
If you're planning to participate as a snapshotter, refer to [these instructions](https://github.com/PowerLoom/deploy#for-snapshotters) to start snapshotting.

-If you're a developer, you can follow the [manual configuration steps for pooler](#configuration) from this document followed by the [instructions on the `deploy` repo for code contributors](https://github.com/PowerLoom/deploy#instructions-for-code-contributors) for a more hands on approach.
+If you're a developer, you can follow the [manual configuration steps for pooler](#configuration) from this document, followed by the [instructions on the `deploy` repo for code contributors](https://github.com/PowerLoom/deploy#instructions-for-code-contributors), for a more hands-on approach.
+
+**Note** - RPC usage is highly use-case specific. If your use case is complicated and needs to make a lot of RPC calls, it is recommended to run your own RPC node instead of using third-party RPC services, as the latter can be expensive.

-**Note** - RPC usage is highly use case specific. If your use case is complicated and needs to make a lot of RPC calls, it is recommended to run your own RPC node instead of using third party RPC services as it can be expensive.

 ## State transitions and data composition

+![Data composition](snapshotter/static/docs/assets/DependencyDataComposition.png)
+
 ### Epoch Generation

 An epoch denotes a range of block heights on the data source blockchain, Ethereum mainnet in the case of Uniswap v2. This makes it easier to collect state transitions and snapshots of data on equally spaced block height intervals, as well as to support future work on other lightweight anchor proof mechanisms like Merkle proofs, succinct proofs, etc.

 The size of an epoch is configurable. Let that be referred to as `size(E)`

-- A trusted service keeps track of the head of the chain as it moves ahead, and a marker `h₀` against the max block height from the last released epoch. This makes the beginning of the next epoch, `h₁ = h₀ + 1`
+- A [trusted service](https://github.com/PowerLoom/onchain-consensus) keeps track of the head of the chain as it moves ahead, and a marker `h₀` against the max block height from the last released epoch. This marks the beginning of the next epoch: `h₁ = h₀ + 1`

 - Once the head of the chain has moved sufficiently ahead so that an epoch can be published, an epoch finalization service takes into account the following factors
   - chain reorganization reports where the reorganized limits are a subset of the epoch qualified to be published
@@ -87,62 +81,198 @@ The size of an epoch is configurable. Let that be referred to as `size(E)`

 The `epochId` here is incremented by 1 with every successive epoch release.

+
+### Preloading
+
+Preloaders perform the important function of fetching low-level data, e.g., block details and transaction receipts, so that subsequent base snapshot building can proceed without redundant calls; this ultimately saves on RPC access costs and other queries against the underlying node infrastructure of the source data blockchain.
+
+Each project type within the project configuration, as found in [`config/projects.json`](config/projects.example.json), can specify the preloaders that its base snapshot builds depend on. Once the dependent preloaders have completed their fetches, the [Processor Distributor](#processor-distributor) triggers the base snapshot builders for each project type.
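For illustration, here is a hedged sketch of such a configuration entry. The `project_type` value and pair contract address are placeholders, the top-level `config` array is assumed to mirror the aggregator configuration shown later in this document, and the `preload_tasks` entries name the built-in preloaders described below; the permalink that follows points at the actual example entry:

```javascript
{
    "config": [
        {
            "project_type": "pairContract_pair_total_reserves",
            "projects": [
                "0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc" // illustrative Uniswap v2 pair contract
            ],
            "preload_tasks": [
                "eth_price",
                "block_details"
            ],
            "processor": {
                "module": "snapshotter.modules.uniswapv2.pair_total_reserves",
                "class_name": "PairTotalReservesProcessor"
            }
        }
    ]
}
```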
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/config/projects.example.json#L3-L12
+
+The preloaders implement one of the following two generic interfaces:
+
+* `GenericPreloader`
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/utils/callback_helpers.py#L109-L126
+
+* `GenericDelegatorPreloader`. Such preloaders are tasked with fetching large volumes of data and utilize [delegated workers](#delegation-workers-for-preloaders), to whom they submit large workloads over a request queue and from whom they await the results over a response queue.
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/utils/callback_helpers.py#L129-L161
+
+The preloaders can be found in the [`snapshotter/utils/preloaders`](snapshotter/utils/preloaders/) directory. The preloaders that are available to project configuration entries are exposed through the [`config/preloader.json`](config/preloader.json) configuration.
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/config/preloader.json#L1-L27
+
+At the moment, we have three generic preloaders built into the snapshotter template:
+- [Block Details](snapshotter/utils/preloaders/block_details/preloader.py) - prefetches the block details for the blocks in each epoch and stores them in Redis
+- [Eth Price](snapshotter/utils/preloaders/eth_price/preloader.py) - prefetches the ETH price for the blocks in each epoch and stores it in Redis
+- [Tx Receipts](snapshotter/utils/preloaders/tx_receipts/preloader.py) - prefetches the receipts of all transactions present in each epoch and stores the data in Redis. Since fetching all block transactions is a lot of work, it utilizes the [delegated workers](#delegation-workers-for-preloaders) architecture to parallelize the fetches and complete them in a fast and reliable way
+
+More preloaders can easily be added depending on the use case the user is snapshotting for. It is as simple as writing the logic in a `preloader.py`, adding the preloader config to `config/preloader.json`, and adding the preloader dependency in `config/projects.json`.
+
 ### Base Snapshot Generation

- Workers in [`config/projects.json`](config/projects.example.json) calculate base snapshots against this `epochId` which corresponds to collections of state observations and event logs between the blocks at height in the range `[begin, end]`, per Uniswap v2 pair contract. Each such pair contract is assigned a project ID on the protocol state contract according to the following format:
+ Workers, as mentioned in the configuration section for [`config/projects.json`](#configuration), calculate base snapshots against this `epochId`, which corresponds to collections of state observations and event logs between the blocks at heights in the range `[begin, end]`.
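For intuition, the epoch a worker computes against can be pictured as a minimal model like the following sketch (the class and field names here are illustrative, not the exact ones used in the codebase):

```python
from pydantic import BaseModel


class EpochInfo(BaseModel):
    """Illustrative shape of the epoch information a base snapshot worker receives."""

    epochId: int  # incremented by 1 on every successive epoch release
    begin: int    # first block height of the epoch
    end: int      # last block height of the epoch, i.e. begin + size(E) - 1


# A worker for this epoch only reads chain state and event logs for
# block heights in the inclusive range [begin, end].
epoch = EpochInfo(epochId=42, begin=17_000_001, end=17_000_010)
assert epoch.end - epoch.begin + 1 == 10  # here size(E) = 10
```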
The data sources are determined according to the following specification for the `projects` key:
+
+ * an empty array against the `projects` key indicates that the data sources are to be loaded from the protocol state contract on initialization
+ * an array of EVM-compatible wallet address strings can also be listed
+ * an array of `addr1_addr2` strings that denote the relationship between two EVM addresses (e.g., the ERC20 balance of `addr2` against a token contract `addr1`)
+ * data sources can be dynamically added on the protocol state contract, which the [processor distributor](#processor-distributor) [syncs with](https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/processor_distributor.py#L597)

-`{project_type}:{pair_contract_address}:{settings.namespace}`
+The project ID is ultimately generated in the following manner:

-https://github.com/PowerLoom/pooler/blob/e7a5bd62debfe726d1473f0dfb68856dab43ef25/pooler/utils/snapshot_worker.py#L114
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/utils/snapshot_worker.py#L29-L38

- The snapshots generated by workers defined in this config are the fundamental data models on which higher order aggregates and richer datapoints are built.
+ The snapshots generated by workers defined in this config are the fundamental data models on which higher-order aggregates and richer data points are built. The `SnapshotSubmitted` event generated on such base snapshots further triggers the building of sophisticated aggregates, super-aggregates, filters, and other data composites on top of them.

 ### Snapshot Finalization

 All snapshots per project reach consensus on the protocol state contract, which results in a `SnapshotFinalized` event being triggered.

-This helps us in building sophisticated aggregates, super-aggregates, filters and other forms of data composition on top of base snapshots.
-
 ```
 event SnapshotFinalized(uint256 indexed epochId, uint256 epochEnd, string projectId, string snapshotCid, uint256 timestamp);
 ```

-### Aggregation and data composition - snapshot generation of higher order datapoints on base snapshots
+### Aggregation and data composition - snapshot generation of higher-order data points on base snapshots

-Workers as defined in `config/aggregator.json` are triggered by the appropriate signals forwarded to [`Processor Distributor`](pooler/processor_distributor.py) corresponding to the project ID filters as explained in the [Configuration](#configuration) section.
+Workers as defined in `config/aggregator.json` are triggered by the appropriate signals forwarded to the [`Processor Distributor`](pooler/processor_distributor.py) corresponding to the project ID filters as explained in the [Configuration](#configuration) section. This is best seen in action in Pooler, the snapshotter implementation that serves multiple aggregated data points for Uniswap v2 trade information.
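The project ID filters mentioned here amount to matching a finalized project ID against the `project_type` prefix an aggregate worker is configured with (see the `filters` key in the [Configuration](#configuration) section). A rough sketch of the idea, using a hypothetical helper rather than the actual distributor logic:

```python
def matches_single_project_filter(project_id: str, project_type: str) -> bool:
    """Hypothetical helper: does a finalized base snapshot's project ID
    belong to the project type a SingleProject aggregate filters on?"""
    return project_id.startswith(f'{project_type}:')


# A trade volume snapshot finalized for a Uniswap v2 pair contract...
finalized_id = 'pairContract_trade_volume:0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc:UNISWAPV2'

# ...would trigger the aggregate worker whose filter names that project type.
assert matches_single_project_filter(finalized_id, 'pairContract_trade_volume')
```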
-![Data composition](pooler/static/docs/assets/DependencyDataComposition.png)
+In case of aggregation over multiple projects, their project IDs are generated from a combination of the hash of the dependee project IDs and the namespace:

-![Aggregation Workflow](pooler/static/docs/assets/AggregationWorkflow.png)
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/utils/aggregation_worker.py#L116-L124

-In case of aggregation over multiple projects, their project IDs are generated with a combination of the hash of the dependee project IDs along with the namespace
-https://github.com/PowerLoom/pooler/blob/e7a5bd62debfe726d1473f0dfb68856dab43ef25/pooler/utils/aggregation_worker.py#L116-L124
+
+## Major Components
+
+![Snapshotter Components](snapshotter/static/docs/assets/MajorComponents.png)
+
+### System Event Detector
+
+The system event detector tracks events being triggered on the protocol state contract running on the anchor chain and forwards them to a callback queue with the appropriate routing key, depending on the event signature and type among other information.
+
+Related information and other services depending on these can be found in previous sections: [State transitions](#state-transitions-and-data-composition), [Configuration](#configuration).
+
+### Process Hub Core
+
+The Process Hub Core, defined in [`process_hub_core.py`](snapshotter/process_hub_core.py), serves as the primary process manager in the snapshotter.
+* Operated by the CLI tool [`processhub_cmd.py`](snapshotter/processhub_cmd.py), it is responsible for starting and managing the `SystemEventDetector` and `ProcessorDistributor` processes.
+* Additionally, it spawns the base snapshot and aggregator workers required for processing tasks from the `powerloom-backend-callback` queue. The number of workers and their configuration path can be adjusted in `config/settings.json`.
+
+### Processor Distributor
+
+The Processor Distributor, defined in [`processor_distributor.py`](snapshotter/processor_distributor.py), is initiated using the `processhub_cmd.py` CLI.
+
+* It loads the preloader, base snapshotting, and aggregator config information from the settings file
+* It reads the events forwarded by the event detector to the `f'powerloom-event-detector:{settings.namespace}:{settings.instance_id}'` RabbitMQ queue bound to a topic exchange as configured in `settings.rabbitmq.setup.event_detector.exchange` ([code-ref: RabbitMQ exchanges and queue setup in pooler](snapshotter/init_rabbitmq.py))
+* It creates and distributes processing messages based on the preloader configuration present in `config/preloader.json`, the project configuration present in `config/projects.json` and `config/aggregator.json`, and the topic pattern used in the routing key received from the topic exchange
+  * For [`EpochReleased` events](#epoch-generation), it forwards such messages to base snapshot builders for data source contracts as configured in `config/projects.json` for the current epoch information contained in the event.
+  https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/processor_distributor.py#L125-L141
+  * For [`SnapshotSubmitted` events](#base-snapshot-generation), it forwards such messages to single and multi-project aggregate topic routing keys.
+  https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/processor_distributor.py#L228-L303
+
+### Delegation Workers for preloaders
+
+The preloaders often fetch and cache large volumes of data, e.g., all the transaction receipts for a block on the data source blockchain. In such cases, a single worker is never enough to fetch the data quickly enough for timely base snapshot generation, the aggregate snapshot generations that build on it, and the consensus that is finally reached on them.
+
+Hence such workers are defined as `delegate_tasks` in [`config/preloader.json`](config/preloader.json), and the [process hub core](#process-hub-core) launches a certain number of workers as defined in the primary settings file, `config/settings.json`, under the key `callback_worker_config.num_delegate_workers`.
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/config/preloader.json#L19-L25
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/config/settings.example.json#L86-L90
+
+### Callback Workers
+
+The callback workers are the ones that build the base snapshots and aggregation snapshots and, as explained above, are launched by the [process hub core](#process-hub-core) according to the configurations in `config/projects.json` and `config/aggregator.json`.
+
+They listen for new messages on the RabbitMQ topic exchange as described in the following configuration, and the topic queue's initialization is as follows:
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/config/settings.example.json#L42-L44
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/init_rabbitmq.py#L118-L140
+
+Upon receiving a message from the processor distributor after preloading is complete, the workers do most of the heavy lifting along with some sanity checks, and then call the actual `compute` function defined in the project configuration to transform the dependent data points cached by the preloaders into the final base snapshots.
+
+* [Base Snapshot builder](pooler/utils/snapshot_worker.py)
+* [Aggregation Snapshot builder](pooler/utils/aggregation_worker.py)
+
+### RPC Helper
+
+Extracting data from the blockchain state and generating the snapshot can be a complex task. The `RpcHelper`, defined in [`utils/rpc.py`](pooler/utils/rpc.py), has a bunch of helper functions to make this process easier. It handles all the `retry` and `caching` logic so that developers can focus on efficiently building their use cases.
+
+### Core API
+
+This component is one of the most important and allows you to access the finalized protocol state on the smart contract running on the anchor chain. Find it in [`core_api.py`](pooler/core_api.py).
+
+The [pooler-frontend](https://github.com/powerloom/pooler-frontend) that serves the Uniswap v2 dashboards hosted by the PowerLoom foundation on locations like https://uniswapv2.powerloom.io/ is a great example of a frontend-specific web application that makes use of this API service.
+
+Among many things, the core API allows you to **access the finalized CID as well as its contents at a given epoch ID for a project**.
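As a usage sketch, under the assumption of a locally reachable Core API instance (the base URL below is a placeholder, and the response key is an assumption paraphrased from the endpoints described next), fetching the latest finalized snapshot for a project could look like:

```python
import requests

# Assumptions: adjust the base URL to your core_api deployment; the project
# ID is the Uniswap v2 trade volume example used elsewhere in this README.
API_BASE = 'http://localhost:8002'
PROJECT_ID = 'pairContract_trade_volume:0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc:UNISWAPV2'

# 1. Look up the last finalized epoch for the project.
epoch_resp = requests.get(f'{API_BASE}/last_finalized_epoch/{PROJECT_ID}')
epoch_resp.raise_for_status()
epoch_id = epoch_resp.json()['epochId']  # assumed key carrying the epoch ID

# 2. Fetch the finalized snapshot contents for that epoch.
data_resp = requests.get(f'{API_BASE}/data/{epoch_id}/{PROJECT_ID}/')
data_resp.raise_for_status()
print(data_resp.json())
```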
+
+The main endpoint implementations can be found as follows:
+
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/core_api.py#L186-L268
+https://github.com/PowerLoom/pooler/blob/5e7cc3812074d91e8d7d85058554bb1175bf8070/snapshotter/core_api.py#L273-L324
+
+The first endpoint, `GET /last_finalized_epoch/{project_id}`, returns the last finalized epoch ID for a given project ID; the second, `GET /data/{epoch_id}/{project_id}/`, returns the actual snapshot data for a given epoch ID and project ID.
+
+These endpoints, along with a number of other helper endpoints present in the Core API, can be used to build powerful dapps and dashboards.
+
+You can observe the way it is [used in the `pooler-frontend` repo](https://github.com/PowerLoom/pooler-frontend/blob/361268d27584520450bf33353f7519982d638f8a/src/routes/index.svelte#L85) to fetch the dataset for the aggregate projects of top pairs trade volume and token reserves summary:
+
+```javascript
+try {
+    response = await axios.get(API_PREFIX+`/data/${epochInfo.epochId}/${top_pairs_7d_project_id}/`);
+    console.log('got 7d top pairs', response.data);
+    if (response.data) {
+      for (let pair of response.data.pairs) {
+        pairsData7d[pair.name] = pair;
+      }
+    } else {
+      throw new Error(JSON.stringify(response.data));
+    }
+  }
+  catch (e){
+    console.error('7d top pairs', e);
+  }
+```
+
 ## Development Instructions
+
 These instructions are needed if you're planning to run the system using `build-dev.sh` from [deploy](https://github.com/PowerLoom/deploy).

 ### Configuration
 Pooler needs the following config files to be present
 * **`settings.json` in `pooler/auth/settings`**: Changes are trivial. Copy [`config/auth_settings.example.json`](config/auth_settings.example.json) to `config/auth_settings.json`. This enables an authentication layer over the core API exposed by the pooler snapshotter.
 * settings files in `config/`
-    * **`config/projects.json`**: This lists out the smart contracts on the data source chain on which snapshots will be generated paired with the snapshot worker class. It's an array of objects with the following structure:
+    * **[`config/projects.json`](config/projects.example.json)**: Each entry in this configuration file defines the most fundamental unit of data representation in Powerloom Protocol, that is, a project. It is of the following schema:
     ```javascript
     {
         "project_type": "snapshot_project_name_prefix_",
         "projects": ["array of smart contract addresses"], // Uniswap v2 pair contract addresses in this implementation
+        "preload_tasks":[
+          "eth_price",
+          "block_details"
+        ],
         "processor":{
             "module": "snapshotter.modules.uniswapv2.pair_total_reserves",
-            "class_name": "PairTotalReservesProcessor" // class to be found in module pooler/modules/uniswapv2/pair_total_reserves.py
+            "class_name": "PairTotalReservesProcessor" // class to be found in module snapshotter/modules/pooler/uniswapv2/pair_total_reserves.py
         }
     }
     ```
-    Copy over [`config/projects.example.json`](config/projects.example.json) to `config/projects.json`. For more details, read on in the [section below on extending a use case](#extending-pooler-with-a-uniswap-v2-data-point).
-    * **`config/aggregator.json`** : This lists out different type of aggregation work to be performed over a span of snapshots. Copy over [`config/aggregator.example.json`](config/aggregator.example.json) to `config/aggregator.json`.
The span is usually calculated as a function of the epoch size and average block time on the data source network. For eg,
+    Copy over [`config/projects.example.json`](config/projects.example.json) to `config/projects.json`. For more details, read on in the [use case study](#pooler-case-study-and-extending-this-implementation) for this current implementation.
+
+    * **`config/aggregator.json`**: This lists out the different types of aggregation work to be performed over a span of snapshots. Copy over [`config/aggregator.example.json`](config/aggregator.example.json) to `config/aggregator.json`. The span is usually calculated as a function of the epoch size and the average block time on the data source network. For example,
         * the following configuration calculates a snapshot of total trade volume over a 24 hour time period, based on the [snapshot finalization](#snapshot-finalization) of a project ID corresponding to a pair contract. This can be seen by the `aggregate_on` key being set to `SingleProject`.
-        * This is specified by the `filters` key below. When a snapshot build is achieved for an epoch over a project ID [(ref:generation of project ID for snapshot building workers)](#epoch-generation). For eg, a snapshot build on `pairContract_trade_volume:0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc:UNISWAPV2` triggers the worker [`AggreagateTradeVolumeProcessor`](pooler/modules/uniswapv2/aggregate/single_uniswap_trade_volume_24h.py) as defined in the `processor` section of the config against the pair contract `0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc`.
-    ```javascript
+        * This is specified by the `filters` key below, which takes effect when a snapshot build is achieved for an epoch over a project ID [(ref: generation of project ID for snapshot building workers)](#epoch-generation). For example, a snapshot build on `pairContract_trade_volume:0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc:UNISWAPV2` triggers the worker [`AggreagateTradeVolumeProcessor`](snapshotter/modules/pooler/uniswapv2/aggregate/single_uniswap_trade_volume_24h.py), as defined in the `processor` section of the config, against the pair contract `0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc`.
+
+    ```javascript
     {
         "config": [
             {
@@ -161,9 +291,11 @@ Pooler needs the following config files to be present
             }
         ]
     }
-    ```
-    * The following configuration generates a collection of data sets of 24 hour trade volume as calculated by the worker above across multiple pair contracts. This can be seen by the `aggregate_on` key being set to `MultiProject`.
+    ```
+
+    * The following configuration generates a collection of data sets of 24 hour trade volume as calculated by the worker above across multiple pair contracts. This can be seen by the `aggregate_on` key being set to `MultiProject`.
         * `projects_to_wait_for` specifies the exact project IDs on which this collection will be generated once a snapshot build has been achieved for an [`epochId`](#epoch-generation).
+
     ```javascript
     {
         "config": [
@@ -192,20 +324,20 @@ Pooler needs the following config files to be present
         ]
     }
     ```
-    * To begin with, you can keep the workers and contracts as specified in the example files.
 * **`config/settings.json`**: This is the primary configuration. We've provided a settings template in `config/settings.example.json` to help you get started. Copy over [`config/settings.example.json`](config/settings.example.json) to `config/settings.json`. There can be a lot to fine tune but the following are essential.
     - `instance_id`: This is the unique public key for your node to participate in consensus.
It is currently registered on approval of an application (refer to the [deploy](https://github.com/PowerLoom/deploy) repo for more details on applying).
-    - `namespace`, it is the unique key used to identify your project namespace around which all consensus activity takes place.
+    - `namespace` is the unique key used to identify your project namespace around which all consensus activity takes place.
     - RPC service URL(s) and rate limit configurations. Rate limits are service provider specific, different RPC providers have different rate limits. Example rate limit config for a node looks something like this `"100000000/day;20000/minute;2500/second"`
-        - **`rpc.full_nodes`**: This will correspond to RPC nodes for the chain on which the data source smart contracts lives (for eg. Ethereum Mainnet, Polygon Mainnet etc).
+        - **`rpc.full_nodes`**: This will correspond to RPC nodes for the chain on which the data source smart contracts live (e.g., Ethereum Mainnet, Polygon Mainnet, etc.).
         - **`anchor_chain_rpc.full_nodes`**: This will correspond to RPC nodes for the anchor chain on which the protocol state smart contract lives (Prost Chain).
         - **`protocol_state.address`**: This will correspond to the address at which the protocol state smart contract is deployed on the anchor chain. **`protocol_state.abi`** is already filled in the example and already available at the static path specified: [`pooler/static/abis/ProtocolContract.json`](pooler/static/abis/ProtocolContract.json)

 ## Monitoring and Debugging

-Login to pooler docker container using `docker exec -it deploy-pooler-1 bash` (use `docker ps` to verify its presence in the list of running containers) and use the following commands for monitoring and debugging
+
+Log in to the pooler docker container using `docker exec -it deploy-boost-1 bash` (use `docker ps` to verify its presence in the list of running containers) and use the following commands for monitoring and debugging:
 - To monitor the status of running processes, you simply need to run `pm2 status`.
 - To see all logs you can run `pm2 logs`
 - To see logs for a specific process you can run `pm2 logs <process name>`

 ## For Contributors
 We use [pre-commit hooks](https://pre-commit.com/) to ensure our code quality is maintained over time. For this, contributors need to do a one-time setup by running the following commands:
-* Install the required dependencies using `pip install -r dev-requirements.txt`, this will setup everything needed for pre-commit checks.
+* Install the required dependencies using `pip install -r dev-requirements.txt`; this will set up everything needed for pre-commit checks.
 * Run `pre-commit install`

 Now, whenever you commit anything, it'll automatically check the files you've changed/edited for code quality issues and suggest improvements.

+## Pooler: Case study and extending this implementation
+
+Pooler is a Uniswap-specific implementation of what is known as a 'snapshotter' in the PowerLoom Protocol ecosystem. It synchronizes with other snapshotter peers over a smart contract running on the present version of the PowerLoom Protocol testnet. It follows an architecture that is driven by state transitions, which makes it easy to understand and modify.
This present release ultimately provides access to rich aggregates that can power a Uniswap v2 dashboard with the following data points:
+
+- Total Value Locked (TVL)
+- Trade Volume, Liquidity reserves, Fees earned
+  - grouped by
+    - Pair contracts
+    - Individual tokens participating in pair contract
+  - aggregated over time periods
+    - 24 hours
+    - 7 days
+- Transactions containing `Swap`, `Mint`, and `Burn` events

 ## Extending pooler with a Uniswap v2 data point

@@ -241,31 +386,29 @@ There's currently no limitation on the number or type of usecases you can build

 https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/config/projects.example.json#L1-L35

-If we take a look at the `TradeVolumeProcessor` class present at [`pooler/modules/uniswapv2/trade_volume.py`](pooler/modules/uniswapv2/trade_volume.py) it implements the interface of `GenericProcessorSnapshot` defined in [`pooler/utils/callback_helpers.py`](pooler/utils/callback_helpers.py).
+If we take a look at the `TradeVolumeProcessor` class present at [`snapshotter/modules/pooler/uniswapv2/trade_volume.py`](snapshotter/modules/pooler/uniswapv2/trade_volume.py), we see that it implements the interface of `GenericProcessorSnapshot` defined in [`pooler/utils/callback_helpers.py`](pooler/utils/callback_helpers.py).

-https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/modules/uniswapv2/trade_volume.py#L13-L86
+https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/snapshotter/modules/pooler/uniswapv2/trade_volume.py#L13-L86

 There are a couple of important concepts here necessary to write your extraction logic:

 * `compute` is the main function where most of the snapshot extraction and generation logic needs to be written. It receives the following inputs:
-- `max_chain_height` (epoch end block)
-- `min_chain_height` (epoch start block)
-- `address` (contract address to extract data from)
+- `epoch` (current epoch details)
 - `redis` (async redis connection)
 - `rpc_helper` ([`RpcHelper`](pooler/utils/rpc.py) instance to help with any calls to the data source contract's chain)

 * `transformation_lambdas` provide an additional layer of computation on top of the generated snapshot (if needed). If the `compute` function handles everything, you can set `transformation_lambdas` to `[]`; otherwise, pass the sequence of transformation functions. Each function referenced in `transformation_lambdas` must have the same input interface. It should receive the following inputs:
 - `snapshot` (the generated snapshot to apply the transformation on)
 - `address` (contract address to extract data from)
 - `epoch_begin` (epoch begin block)
 - `epoch_end` (epoch end block)

 The output format can be anything depending on the use case requirements, although it is recommended to use proper [`pydantic`](https://pypi.org/project/pydantic/) models to define the snapshot interface.

-The resultant output model in this specific example is `UniswapTradesSnapshot` as defined in the Uniswap v2 specific modules directory: [`utils/models/message_models.py`](pooler/modules/uniswapv2/utils/models/message_models.py). This encapsulates state information captured by `TradeVolumeProcessor` between the block heights of the epoch: `min_chain_height` and `max_chain_height`.
+The resultant output model in this specific example is `UniswapTradesSnapshot` as defined in the Uniswap v2 specific modules directory: [`utils/models/message_models.py`](snapshotter/modules/pooler/uniswapv2/utils/models/message_models.py). This encapsulates state information captured by `TradeVolumeProcessor` between the block heights of the epoch: `min_chain_height` and `max_chain_height`.
+The resultant output model in this specific example is `UniswapTradesSnapshot` as defined in the Uniswap v2 specific modules directory: [`utils/models/message_models.py`](snapshotter/modules/pooler/uniswapv2/utils/models/message_models.py). This encapsulates state information captured by `TradeVolumeProcessor` between the block heights of the epoch: `min_chain_height` and `max_chain_height`. -https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/modules/uniswapv2/utils/models/message_models.py#L37-L44 +https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/snapshotter/modules/pooler/uniswapv2/utils/models/message_models.py#L37-L44 ### Step 2. Review: 24 hour aggregate of trade volume snapshots over a single pair contract @@ -296,7 +439,7 @@ https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d722 * Each finalized `epochId` is registered with a snapshot commit against the aggregated data set generated by running summations on trade volumes on all the base snapshots contained within the span calculated above. -https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/modules/uniswapv2/aggregate/single_uniswap_trade_volume_24h.py#L84-L157 +https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/snapshotter/modules/pooler/uniswapv2/aggregate/single_uniswap_trade_volume_24h.py#L84-L157 ### Step 3. New Datapoint: 2 hours aggregate of only swap events @@ -307,85 +450,13 @@ From the information provided above, the following is left as an exercise for th * Add a new configuration entry in `config/aggregator.json` for this new aggregation worker class -* Define a new data model in [`utils/message_models.py`](pooler/modules/uniswapv2/utils/models/message_models.py) referring to +* Define a new data model in [`utils/message_models.py`](snapshotter/modules/pooler/uniswapv2/utils/models/message_models.py) referring to * `UniswapTradesAggregateSnapshot` as used in above example * `UniswapTradesSnapshot` used to capture each epoch's trade snapshots which includes the raw event logs as well -* Follow the example of the aggregator worker [as implemented for 24 hours aggregation calculation](pooler/modules/uniswapv2/aggregate/single_uniswap_trade_volume_24h.py) , and work on calculating an `epochId` span of 2 hours and filtering out only the `Swap` events and the trade volume contained within. - -## Major Components - -![Pooler Components](pooler/static/docs/assets/MajorComponents.png) - -### System Event Detector +* Follow the example of the aggregator worker [as implemented for 24 hours aggregation calculation](snapshotter/modules/pooler/uniswapv2/aggregate/single_uniswap_trade_volume_24h.py) , and work on calculating an `epochId` span of 2 hours and filtering out only the `Swap` events and the trade volume contained within. -The system event detector tracks events being triggered on the protocol state contract running on the anchor chain and forwards it to a callback queue with the appropriate routing key depending on the event signature and type among other information. -Related information and other services depending on these can be found in previous sections: [State Transitions](#state-transitions), [Configuration](#configuration). - - -### Process Hub Core - -The Process Hub Core, defined in [`process_hub_core.py`](pooler/process_hub_core.py), serves as the primary process manager in snapshotter. 
-* Operated by the CLI tol [`processhub_cmd.py`](pooler/processhub_cmd.py), it is responsible for starting and managing the `SystemEventDetector` and `ProcessorDistributor` processes. -* Additionally, it spawns the base snapshot and aggregator workers required for processing tasks from the `powerloom-backend-callback` queue. The number of workers and their configuration path can be adjusted in `config/settings.json`. - -### Processor Distributor -The Processor Distributor, defined in [`processor_distributor.py`](pooler/processor_distributor.py), is initiated using the `processhub_cmd.py` CLI. - -* It loads the base snapshotting and aggregator config information from settings -* It reads the events forwarded by the event detector to the `f'powerloom-event-detector:{settings.namespace}:{settings.instance_id}'` RabbitMQ queue bound to a topic exchange as configured in `settings.rabbitmq.setup.event_detector.exchange`([code-ref: RabbitMQ exchanges and queue setup in pooler](pooler/init_rabbitmq.py)) -* It creates and distributes processing messages based on project configuration present in `config/projects.json` and `config/aggregator.json`, and the topic pattern used in the routing key received from the topic exchange - * For [`EpochReleased` events](#epoch-generation), it forwards such messages to base snapshot builders for data source contracts as configured in `config/projects.json` for the current epoch information contained in the event. - https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/processor_distributor.py#L125-L141 - * For [`SnapshotFinalized` events](#snapshot-finalization), it forwards such messages to single and multi project aggregate topic routing keys. - https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/processor_distributor.py#L228-L303 - -### Callback Workers - -The callback workers are basically the ones that build the base snapshot and aggregation snapshots and as explained above, are launched by the [processor distributor](#processor-distributor) according to the configurations in `aggregator/projects.json` and `config/aggregator.json`. They listen to new messages on the `powerloom-backend-callback` topic queue. Upon receiving a message, the workers do most of the heavy lifting along with some sanity checks and then calls the actual `compute` function defined in the project configuration to read blockchain state and generate the snapshot. - -* [Base Snapshot builder](pooler/utils/snapshot_worker.py) -* [Aggregation Snapshot builder](pooler/utils/aggregation_worker.py) - -### RPC Helper - -Extracting data from blockchain state and generating the snapshot can be a complex task. The `RpcHelper`, defined in [`utils/rpc.py`](pooler/utils/rpc.py), has a bunch of helper functions to make this process easier. It handles all the `retry` and `caching` logic so that developers can focus on building their use cases in an efficient way. - - -### Core API - -This component is one of the most important and allows you to access the finalized protocol state on the smart contract running on the anchor chain. Find it in [`core_api.py`](pooler/core_api.py). - -The [pooler-frontend](https://github.com/powerloom/pooler-frontend) that serves the Uniswap v2 dashboards hosted by the PowerLoom foundation on locations like https://uniswapv2.powerloom.io/ . - -Among many thing, it allows you to **access the finalized CID as well as its contents at a given epoch ID for a project**. 
- -The endpoint implementations can be found as follows: - -https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/core_api.py#L305-L355 - -https://github.com/PowerLoom/pooler/blob/1452c166bef7534568a61b3a2ab0ff94535d7229/pooler/core_api.py#L250-L300 - -You can observe the way it is [used in `pooler-frontend` repo](https://github.com/PowerLoom/pooler-frontend/blob/361268d27584520450bf33353f7519982d638f8a/src/routes/index.svelte#L85) to fetch the dataset for the aggregate projects of top pairs trade volume and token reserves summary: - - -```javascript -try { - response = await axios.get(API_PREFIX+`/data/${epochInfo.epochId}/${top_pairs_7d_project_id}/`); - console.log('got 7d top pairs', response.data); - if (response.data) { - for (let pair of response.data.pairs) { - pairsData7d[pair.name] = pair; - } - } else { - throw new Error(JSON.stringify(response.data)); - } - } - catch (e){ - console.error('7d top pairs', e); - } -``` ## Find us