From 1ab8e207d2cddf76c455883c7fc4a4c550bc5db6 Mon Sep 17 00:00:00 2001 From: Mohamed Zahoor <940575+jmozah@users.noreply.github.com> Date: Thu, 16 May 2024 10:41:57 +0530 Subject: [PATCH 1/2] Add initial version of SWIP for Swarm Data Chain --- SWIPs/swip-draft_swarm_data_chain.md | 105 +++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) create mode 100644 SWIPs/swip-draft_swarm_data_chain.md diff --git a/SWIPs/swip-draft_swarm_data_chain.md b/SWIPs/swip-draft_swarm_data_chain.md new file mode 100644 index 0000000..602ef15 --- /dev/null +++ b/SWIPs/swip-draft_swarm_data_chain.md @@ -0,0 +1,105 @@ +--- +WIP: +title: Swarm Data Chain +author: Mohamed Zahoor (jmozah) +discussions-to: https://discord.com/channels/799027393297514537/1068161013934985287/1239528605013377085 +status: Draft +type: Standards Track +category: Layer 2 +created: 2024-05-15 +--- + + + + +`SWIP-draft_title_abbrev.md`. + + + +## Simple Summary + +One of the biggest issues that any blockchain face is to store and manage its ledger(data). The faster the transaction execution of a chain, the bigger the issue of making its data available and retrievabile to all its clients. This SWIP proposes to solve the data availabilty and retrievability problems of other blockchains by having a generic data chain to manage and store their data. This data chain will use Swarm as the storage layer to acheieve this goal. + + +## Abstract + +Blockchain scaling usually means to scale in the following dimensions +- Transaction Execution +- Data storage +- Bandwidth + +Lately, Layer2 networks have helped scale Ethereum to a degree by offloading the Transaction Execution and compressing block data (rollup). Data storage related issues like availability and retrievability are still open problems. The proposed solution is to solve the data availability and retrievability problems of blockchains (Especially Ethereum Layer2's). 
Chains will be able to store their data (blob, blocks, state, logs, receipts etc) and allow its clients to check for avilability and to retreive them later if needed. This will scale the respective chains by having more economicaland secure data storage and efficient use if p2p bandwidth when retreiving the data. + + +## Motivation + + +Modular blockchains are gaining popularity so that new chains can be build fast and with ease. Having a modular storage layer for blockchains will make the data storage of these chains more manageble. Storing a blockchain data in a decentralised storage will give rise to new dimensions for centralised applications like etherscan. + +Following are some of the motivations for creating a generic Data chain +- Solving Data Availability problems + - Light nodes need strong data availability assurances without the need to download the entire block. + - Ethereum Layer2 is another example where the data should be available to other nodes for liveness. + - This is also required to build future "stateless" clients where it need not be required to download and store the data. +- Solving Data Retrievability problems: + - Blockchains have special archive nodes to store the entire data. most of the other clients rely on them to get the full data. This is a problem especially if the number of archive nodes are small. + - Future chains can totally eliminate the storage of data in each client and instead support a stateless client model which will be light and thereby increase decentralisation. + +- Using Swarm as the base Layer: + - Highly distributed storage network + - Provable data storage (Merkle Proof) + - Censorship resistance + - Data redundancy (chunk is stored in all the neighbourhood) + - Efficient use of Bandwidth if a chunk is requested often. + + +## Specification + + +The design consists of a new "Data Chain" and the existing Swarm network. 
+ +- New Data Chain + + - This is a new blockchain that uses a Delegated Poof of Stake (DPoS) consensus with multiple validators which manage the network. Validators need to stake a certain amount of BZZ to become active. Other BZZ holders can delegate their BZZ tokens to any of the existing validators. The voting power of a validator will be proportional to the number of BZZ tokens it has staked. + + - Validators arrive at consensus about a new data (ex: block) that is created by the supported blockchain. Once greater then 2/3 majority if reached about the data, it is then permenantly stored in the Swarm network. The Validators will take care of all the pre-processing work like data sampling and organising the data before storing them in Swarm. Any request for the original data or a piece of data (sample) will get the necessary mapping from the validators and will get the respective Swarm hash to access the data from Swarm network. + + - New data sources (blockchains) and types (block, state, logs, receipts, blobs etc) can be added for ingestion over time using governance. + + - The state of the Data chain should be updated to a smart contract in Layer 1 (Ethereum) so that it inherits the same security gurantees as in Ethereum. + + +## Rationale + + +1) [SWIP-42](https://github.com/ethersphere/SWIPs/pull/42/files#diff-b0a6bcf1f6e706ea47edb89ad8b82c36c4ef6dee3576e1e91b2b0248fd31a5a8) proposes a similar design where every data that is ingested is updated to layer 1. This solution is prohibitively expensive and less in data capacity since it uses Layer 1. + +2) Using a seperate blockchain for managing the storage makes it much more data centric and helps to create a generic solution to add dataspaces on the fly. + +3) Using Swarm as the final resting place for data inherits all the goodies that has been built in Swarm over the years. + +4) Later this chain can be upgraded to have an EVM so that more programmable usage and control of data can be enabled. 
+ +5) Having a seperate Swarm chain and bringing in more chain data in to the network and will help Swarm operators directly. + +6) Future blockchains will be highly decentralised as the resource requirements of a client will come down drastically and at the same time they will have the same security as if they have stored all the chain data. + +## Backwards Compatibility + + +This is a new design for Data Chain so there is no specific backward compatibility requirement. Special care has to be taken to ensure that the data sampling algorithms are backward compatible. + +With r.to Swarm, the validators will use the Swarm API to push data and store their BZZ address as part of their state data. + +## Test Cases + + +AT first we should run a testnet that can capture some data from testnets of other blockchains. This will help in testing the working of the chain and the integration with Swarm network. + +## Implementation + + +To start with, the Data chain ca be built with CometBFT and Cosmos SDK on top of it. We can start with few validators and then increase the number as we test. The BZZ can be brough in using a bridge from Ethereum Layer1 so that validators can stake and other users can delegate them to run the network. + +## Copyright +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). 
From 6bdf758ccbeff1a0b876d3cc398c39b2c9deeb4f Mon Sep 17 00:00:00 2001 From: Zahoor Mohamed <940575+jmozah@users.noreply.github.com> Date: Fri, 17 May 2024 01:00:21 +0530 Subject: [PATCH 2/2] Correct spelling mistakes and grammatical errors --- SWIPs/swip-draft_swarm_data_chain.md | 54 ++++++++++++++-------------- 1 file changed, 27 insertions(+), 27 deletions(-) diff --git a/SWIPs/swip-draft_swarm_data_chain.md b/SWIPs/swip-draft_swarm_data_chain.md index 602ef15..0167540 100644 --- a/SWIPs/swip-draft_swarm_data_chain.md +++ b/SWIPs/swip-draft_swarm_data_chain.md @@ -2,7 +2,7 @@ WIP: title: Swarm Data Chain author: Mohamed Zahoor (jmozah) -discussions-to: https://discord.com/channels/799027393297514537/1068161013934985287/1239528605013377085 +discussions-to: https://rb.gy/g26g6q status: Draft type: Standards Track category: Layer 2 @@ -12,94 +12,94 @@ created: 2024-05-15 -`SWIP-draft_title_abbrev.md`. + ## Simple Summary -One of the biggest issues that any blockchain face is to store and manage its ledger(data). The faster the transaction execution of a chain, the bigger the issue of making its data available and retrievabile to all its clients. This SWIP proposes to solve the data availabilty and retrievability problems of other blockchains by having a generic data chain to manage and store their data. This data chain will use Swarm as the storage layer to acheieve this goal. +One of the biggest issues any blockchain faces is storing and managing its ledger (data). The faster a blockchain's transaction execution, the bigger the issue of making its data available and retrievable to all its clients. This SWIP proposes to solve the [Data Availability and Retrievability](https://ethereum.org/en/developers/docs/data-availability/) issues of other blockchains by having a generic data chain to manage and store their data. This data chain will use Swarm as the storage layer to achieve this goal.
## Abstract -Blockchain scaling usually means to scale in the following dimensions +Blockchain scaling usually means scaling in the following dimensions: - Transaction Execution - Data storage -- Bandwidth +- Bandwidth Utilization -Lately, Layer2 networks have helped scale Ethereum to a degree by offloading the Transaction Execution and compressing block data (rollup). Data storage related issues like availability and retrievability are still open problems. The proposed solution is to solve the data availability and retrievability problems of blockchains (Especially Ethereum Layer2's). Chains will be able to store their data (blob, blocks, state, logs, receipts etc) and allow its clients to check for avilability and to retreive them later if needed. This will scale the respective chains by having more economicaland secure data storage and efficient use if p2p bandwidth when retreiving the data. +Lately, Ethereum Layer2 networks have helped scale Ethereum to a degree by offloading the Transaction Execution and compressing block data (roll-up). Data storage-related issues like availability and retrievability are still open problems. The proposed solution is to solve blockchain data availability and retrievability problems (especially for Ethereum and its Layer2 networks). Chains can store their data (blobs, blocks, state, logs, receipts, etc.) and allow their clients to check for availability and retrieve them later if needed. This will scale the respective chains by providing more economical and secure data storage and efficient use of p2p bandwidth when retrieving the data. ## Motivation
+Modular blockchains are gaining popularity so that new chains can be built quickly and easily. Having a modular storage layer for blockchains will make the data storage of these chains more manageable. Storing blockchain data in decentralized storage will create new dimensions for centralized applications like Etherscan. Following are some of the motivations for creating a generic Data chain - Solving Data Availability problems - - Light nodes need strong data availability assurances without the need to download the entire block. + - Light nodes need strong data availability assurances without downloading the entire block. - Ethereum Layer2 is another example where the data should be available to other nodes for liveness. - - This is also required to build future "stateless" clients where it need not be required to download and store the data. + - This is also required to build future "stateless" clients where the data need not be downloaded and stored. - Solving Data Retrievability problems: - - Blockchains have special archive nodes to store the entire data. most of the other clients rely on them to get the full data. This is a problem especially if the number of archive nodes are small. + - Blockchains have special archive nodes to store the entire data. Most of the other clients rely on them to get the full data, which can be a problem, especially if the number of archive nodes is small. - - Future chains can totally eliminate the storage of data in each client and instead support a stateless client model which will be light and thereby increase decentralisation. + - Future chains can eliminate data storage in every client and instead support a stateless client model, which will be light and increase decentralization.
- Using Swarm as the base Layer: - Highly distributed storage network - Provable data storage (Merkle Proof) - Censorship resistance - - Data redundancy (chunk is stored in all the neighbourhood) + - Data redundancy (a chunk is stored by every node in its neighborhood) - Efficient use of Bandwidth if a chunk is requested often. ## Specification -The design consists of a new "Data Chain" and the existing Swarm network. +The design includes a new "Data Chain" and the Swarm network. - New Data Chain - - This is a new blockchain that uses a Delegated Poof of Stake (DPoS) consensus with multiple validators which manage the network. Validators need to stake a certain amount of BZZ to become active. Other BZZ holders can delegate their BZZ tokens to any of the existing validators. The voting power of a validator will be proportional to the number of BZZ tokens it has staked. + - This new blockchain uses a Delegated Proof of Stake (DPoS) consensus with multiple validators that manage the network. Validators need to stake a certain amount of BZZ to become active. Other BZZ holders can delegate their BZZ tokens to any of the existing validators. The voting power of a validator will be proportional to the number of BZZ tokens it has staked. - - Validators arrive at consensus about a new data (ex: block) that is created by the supported blockchain. Once greater then 2/3 majority if reached about the data, it is then permenantly stored in the Swarm network. The Validators will take care of all the pre-processing work like data sampling and organising the data before storing them in Swarm. Any request for the original data or a piece of data (sample) will get the necessary mapping from the validators and will get the respective Swarm hash to access the data from Swarm network. + - Validators arrive at a consensus about new data (e.g., a block) created by the supported blockchains. Once a greater-than-2/3 majority is reached on the data, it is then permanently stored in the Swarm network.
The Validators will handle all the pre-processing work, like data sampling and organizing the data, before storing it in Swarm. Any request for the original data or a piece of data (sample) will get the necessary mapping from the validators and the respective Swarm hash to access the data from the Swarm network. - - New data sources (blockchains) and types (block, state, logs, receipts, blobs etc) can be added for ingestion over time using governance. + - New data sources (blockchains) and types (block, state, logs, receipts, blobs, etc.) can be added for ingestion over time using on-chain governance. - - The state of the Data chain should be updated to a smart contract in Layer 1 (Ethereum) so that it inherits the same security gurantees as in Ethereum. + - The state of the data chain should be updated to a smart contract in Layer 1 (Ethereum) so that it inherits the same security guarantees as Ethereum. ## Rationale -1) [SWIP-42](https://github.com/ethersphere/SWIPs/pull/42/files#diff-b0a6bcf1f6e706ea47edb89ad8b82c36c4ef6dee3576e1e91b2b0248fd31a5a8) proposes a similar design where every data that is ingested is updated to layer 1. This solution is prohibitively expensive and less in data capacity since it uses Layer 1. +1) [SWIP-42](https://github.com/ethersphere/SWIPs/pull/42/files#diff-b0a6bcf1f6e706ea47edb89ad8b82c36c4ef6dee3576e1e91b2b0248fd31a5a8) proposes a similar design where every piece of ingested data is updated to Layer 1. This solution is prohibitively expensive and offers less data capacity since it uses Layer 1. -2) Using a seperate blockchain for managing the storage makes it much more data centric and helps to create a generic solution to add dataspaces on the fly. +2) Using a separate blockchain for managing the storage makes it much more data-centric and helps create a generic solution for adding dataspaces on the fly. -3) Using Swarm as the final resting place for data inherits all the goodies that has been built in Swarm over the years.
+3) Using Swarm as the final resting place for data inherits all the capabilities built into Swarm over the years. -4) Later this chain can be upgraded to have an EVM so that more programmable usage and control of data can be enabled. +4) Later, this chain can be upgraded to have an EVM to enable more programmable usage and data control. -5) Having a seperate Swarm chain and bringing in more chain data in to the network and will help Swarm operators directly. +5) Having a separate Swarm chain helps bring more data into the network, which will directly benefit Swarm operators. -6) Future blockchains will be highly decentralised as the resource requirements of a client will come down drastically and at the same time they will have the same security as if they have stored all the chain data. +6) Future blockchains will be highly decentralized as the resource requirements of a client will come down drastically, and at the same time, they will have the same security as if they had stored all the chain data. ## Backwards Compatibility -This is a new design for Data Chain so there is no specific backward compatibility requirement. Special care has to be taken to ensure that the data sampling algorithms are backward compatible. +This is a new design for Data Chain, so there is no specific backward compatibility requirement. However, special care must be taken to ensure that the data sampling algorithms are backward compatible. -With r.to Swarm, the validators will use the Swarm API to push data and store their BZZ address as part of their state data. +With respect to Swarm, the validators will use the Swarm API to push data and store their BZZ address as part of the data chain's state. ## Test Cases -AT first we should run a testnet that can capture some data from testnets of other blockchains. This will help in testing the working of the chain and the integration with Swarm network. +First, we should run a testnet that can capture data from testnets of other blockchains.
This will help us test the chain's workings and integration with the Swarm network. ## Implementation -To start with, the Data chain ca be built with CometBFT and Cosmos SDK on top of it. We can start with few validators and then increase the number as we test. The BZZ can be brough in using a bridge from Ethereum Layer1 so that validators can stake and other users can delegate them to run the network. +To start with, the Data chain can be built with CometBFT (previously Tendermint) and the Cosmos SDK on top of it. We can start with a few validators and then increase the number as we test. The BZZ can be brought into this new chain using a bridge from Ethereum Layer1 so that validators can stake and other users can delegate to them to run the network. ## Copyright Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
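As an illustrative addendum, not part of the patch itself: the validator flow the Specification describes (delegated stake determines voting power, a greater-than-2/3 majority triggers storage in Swarm, and validators keep a mapping from a data hash to its Swarm reference) could be sketched roughly as below. All names (`Validator`, `quorum_reached`, `store_if_agreed`) are hypothetical, and `sha256` stands in for both the chain's data hash and the real Swarm reference that the Swarm API would return.

```python
# Hypothetical sketch of the Data Chain validator flow: stake-weighted voting
# power, a greater-than-2/3 quorum, and a data-hash -> Swarm-reference mapping.
from dataclasses import dataclass
from hashlib import sha256

@dataclass
class Validator:
    name: str
    stake: int          # BZZ staked by the validator itself
    delegated: int = 0  # BZZ delegated to it by other holders

    @property
    def voting_power(self) -> int:
        # Voting power is proportional to the total BZZ backing the validator.
        return self.stake + self.delegated

def quorum_reached(voters: list, validators: list) -> bool:
    # The SWIP requires a greater-than-2/3 majority of total voting power.
    total = sum(v.voting_power for v in validators)
    voted = sum(v.voting_power for v in voters)
    return 3 * voted > 2 * total

def store_if_agreed(data: bytes, voters: list, validators: list, mapping: dict) -> bool:
    # On quorum, the data would be pushed to Swarm and the returned Swarm
    # reference recorded; sha256 stands in for both hashes in this sketch.
    if not quorum_reached(voters, validators):
        return False
    data_hash = sha256(data).hexdigest()
    swarm_ref = sha256(b"swarm:" + data).hexdigest()  # placeholder Swarm hash
    mapping[data_hash] = swarm_ref
    return True
```

For example, with validators backing 60, 30 (+10 delegated), and 20 BZZ, the first two together control 100 of 120 total voting power and pass the 2/3 threshold, while the third alone does not, so its proposal is never written to the mapping.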