From a5e86b456fc3edf98cf36aaf18badf03dc6e2e3d Mon Sep 17 00:00:00 2001 From: David Dias Date: Tue, 10 Sep 2019 10:52:47 +0300 Subject: [PATCH 01/19] Create ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 29 +++++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md new file mode 100644 index 0000000..255faef --- /dev/null +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -0,0 +1,29 @@ +# Enhanced Bitswap/GraphSync with more Network Smarts + +## Description + +## State of the Art + +> This survey on the State of the Art is not by any means complete, however, it should provide a good entry point to learn what are the existing work. If you have something that is fundamentally missing, please consider submitting a PR to augment this survey. + +### Within the IPFS Ecosystem +> Existing attempts and strategies + +### Within the broad Research Ecosystem +> How do people try to solve this problem? + +### Known shortcommins of existing solutions +> What are the limitations on those solutions? + +## Solving this Open Problem + +### What is the impact + +### What defines a complete solution? +> What hard constraints should it obey? Are there additional soft constraints that a solution would ideally obey? + +## Other + +### Existing Conversations/Threads + +### Extra notes From 88459f8b831de50b2a2313e5db40562ab4ab94e3 Mon Sep 17 00:00:00 2001 From: David Dias Date: Thu, 12 Sep 2019 12:10:03 +0300 Subject: [PATCH 02/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 255faef..65339dd 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -26,4 +26,14 @@ ### Existing Conversations/Threads +- [Batch gets](https://github.com/ipfs/notes/issues/285) +- [Move giant files/datasets at Level1 + Level2 ](https://github.com/ipfs/notes/issues/218) +- [Reed-Solomon layer over IPFS](https://github.com/ipfs/notes/issues/196) +- [Tracking progress and other data](https://github.com/ipfs/notes/issues/107) +- [Advanced Pinning](https://github.com/ipfs/notes/issues/49) +- [bandwidth estimators](https://github.com/ipfs/notes/issues/30) +- [Alternative BitSwap strategies](https://github.com/ipfs/notes/issues/20) +- [Bitswap Simulator](https://github.com/heems/bssim) +- [bsdash - dashboard for go-ipfs bitswap](https://www.npmjs.com/package/bsdash) + ### Extra notes From 3d5e62a94b09ef448bce6fa81e492d14f9099c49 Mon Sep 17 00:00:00 2001 From: David Dias Date: Wed, 18 Sep 2019 14:30:23 +0300 Subject: [PATCH 03/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 65339dd..ef4d028 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -1,6 +1,9 @@ # Enhanced Bitswap/GraphSync with more Network Smarts -## Description +## Short Description +> In one sentence or paragraph. 
+ +## Long Description ## State of the Art From acc6b6e6bcc24009952173bd690b9e684f7ddd4d Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 07:48:02 +0200 Subject: [PATCH 04/19] Bitswap/Graphsync Open Problem Description --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 88 ++++++++++++++++++++- 1 file changed, 87 insertions(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index ef4d028..32848b6 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -3,8 +3,28 @@ ## Short Description > In one sentence or paragraph. +Once IPFS has resolved the CID to a peerID and further to the exact location of the node that stores the requested content, it then uses a very simple protocol called bitswap to exchange content between the requestor and the server. + +Although bitswap is simple and generally works, *its performance is significantly suboptimal*. This is mainly due to the fact that a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG. The current operation of bitswap is also very often linked to duplicate transmission and receipt of content which overloads both the end nodes but also the network. + ## Long Description +In order to synchronise a DAG between untrusted nodes, bitswap is exploiting the content-addressability feature of IPFS. This means that bitswap starts with the root hash of the DAG and once it has the data, it rehashes the data to verify that it results to the same hash. The node can now trust this block and hence, can continue with the rest of the blocks in the DAG (aka walk the DAG), until it gets the whole DAG and the data associated with it. + +In the current implementation of bitswap, requesting nodes send their WANT lists to all the peers they are directly connected to. This inevitably results in potentially duplicate traffic travelling back to the receiving node. When the receiving node has received the block(s) it asked for, it sends a CANCEL message to its peers to let them know that the block(s) is not in its WANT list anymore. + +If none of the directly connected peers have any of the WANT list blocks, bitswap falls back to the DHT to find the requested content. Obviously, this results in long delays to get to the actual peer that stores the requested content. + +Furthermore, once the recipient node starts receiving content from multiple peer nodes, it prunes down the long-latency peers and keeps the one to which the RTT is the shortest. Instead, current proposals within the IPFS ecosystem are considering optimising for throughput. Although this has its advantages, which one is the ideal to follow is not clear at this point. + +It is important to highlight that bitswap is a message-oriented protocol and _not_ a request-response protocol. That is, it sends messages including WANT lists and is then waiting for the blocks in the WANT list to be delivered back, as and when the discovery protocol manages to find the blocks in the WANT list. + + +This process of bitswap results in the following major problems: + +- **Many Roundtrips:** The most pressing issue is that requesting nodes have to make several roundtrips to walk the DAG, especially given that IPFS is a network run by untrusted nodes. +- **Duplicate Data:** as mentioned, bitswap is operating at the block level. 
In turn, this means that every block has to be requested from each peer that the node is connected to. This process results in duplicate traffic towards the data requestor unnecessary load for the peers and high overhead for the network. + ## State of the Art > This survey on the State of the Art is not by any means complete, however, it should provide a good entry point to learn what are the existing work. If you have something that is fundamentally missing, please consider submitting a PR to augment this survey. @@ -12,19 +32,80 @@ ### Within the IPFS Ecosystem > Existing attempts and strategies +The IPFS community has long been aware of the issues of bitswap and has experimented (at least conceptually) with potential solutions. We are discussing some of them here: + +- *Parallelise the DAG walk.* Inherently, DAGs have one root node and then split into binary trees. Doing one walk for each of the branches results in the same number of round trips, but reduces the actual delay in half. Clearly, this approach does not apply for single-branch graphs, such as blockchains. + +- *DAG Block Interconnection.* Although bitswap does not/cannot recognise any relationship between different blocks of the same DAG, a requesting node can artificially continuously ask for subsequent blocks of the same block the node that it found the previous block. This approach intuitively assumes that a node that had a block of a DAG is very likely to have subsequent ones. This is often referred to as “session” between the peers that have provided some part of the DAG. + +- *Latency vs Throughput.* Bitswap is currently sorting peers by latency, i.e., it is pruning down the connections that incur higher latency. It has been suggested that this is changed to maximise throughput (i.e., keep the pipe full). + +- *Coding and Reed Solomon Codes.* This is a very efficient way to reduce storage space in general (trusted or untrusted) caching systems. The area of coded caching has seen significant development lately in the research community, hence, it is discussed further down. + +- *GraphSync through IPLD selectors:* GraphSync is a specification of a protocol to synchronise graphs across peers. Graphsync answers the question of “how do we succinctly identify selectors connected to the root”. In doing that, it develops WANT lists of selectors instead of blocks and requests have selectors instead of hashes. Furthermore, in contrast to bitswap which is message-oriented, graph sync includes responses and hence, it is closer to a request-response protocol. If designed carefully, Graphsync can be very effective, but there are challenges that are still unresolved: i) it is more difficult to parallelise (see above), given that GraphSync operates in terms of requests and not blocks, ii) it can potentially be much easier to carry DDoS attacks with queries than with requests for blocks. + +- *WANT Potential and HAVE message.* There have been several optimisation proposals, such as the implementation of a HAVE message before actually sending the requested block back. Although this immediately reduces duplicate traffic, it introduces one extra RTT, which might not be desirable in some cases, but might not have any negative impact in many others. 
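To make the *Parallelise the DAG walk* idea above more concrete, here is a minimal Go sketch that walks every branch hanging off the root in its own goroutine. All types (`CID`, `Block`, `Fetcher`) are hypothetical stand-ins introduced only for illustration — this is not the go-bitswap API — and a real implementation would bound the number of concurrent branches.

```go
// Hypothetical sketch: fetch a DAG by walking each branch under the root in
// parallel, rather than strictly block-by-block.  Types are illustrative,
// not the go-bitswap API.
package main

import (
	"context"
	"fmt"
	"sync"
)

type CID string

// Block is a content-addressed node: opaque data plus links to child blocks.
type Block struct {
	Data  []byte
	Links []CID
}

// Fetcher stands in for whatever exchange (bitswap, GraphSync, ...) returns a
// verified block for a CID.
type Fetcher interface {
	GetBlock(ctx context.Context, c CID) (Block, error)
}

// WalkParallel fetches the whole DAG under root.  The root itself costs one
// round-trip; after that, every subtree hanging off the root is walked in its
// own goroutine, so sibling branches no longer wait for each other.
// The caller is expected to drain out concurrently.
func WalkParallel(ctx context.Context, f Fetcher, root CID, out chan<- Block) error {
	blk, err := f.GetBlock(ctx, root) // first round-trip: the root block
	if err != nil {
		return err
	}
	out <- blk

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, link := range blk.Links {
		link := link
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each branch is itself walked in parallel, recursively.
			if err := WalkParallel(ctx, f, link, out); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return firstErr
}

func main() {
	fmt.Println("WalkParallel: one goroutine per branch of the DAG")
}
```

As the bullet above notes, this buys nothing for single-branch graphs such as blockchains, where each block can only be requested after its parent has been fetched and verified.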
+ +- Golang Implementation of Bitswap: https://github.com/ipfs/go-bitswap +- GraphSync: https://github.com/ipld/specs/blob/master/block-layer/graphsync/graphsync.md +- Bitswap Latency Measurement: https://github.com/ipfs/go-bitswap/issues/166 +- Bitswap Sessions Exploration: https://github.com/ipfs/go-bitswap/issues/165 +- Bitswap Protocol Extensions: https://github.com/ipfs/go-bitswap/issues/186 +- Reed-Solomon layer over IPFS: https://github.com/ipfs/notes/issues/196 +- Bitswap Request Sharding: https://github.com/ipfs/go-bitswap/issues/167 + ### Within the broad Research Ecosystem > How do people try to solve this problem? -### Known shortcommins of existing solutions +*Bittorrent* + +The bittorrent protocol has a different architectural design, which includes the “tracker” that keeps track of chunks per node and coordinates what is downloaded from which node. The design is substantially different to the IPFS, but included some interesting policies, such as fetch “Rarest First”, “Random First” and “Endgame Mode”, which is a greedy mode to collect all the chunks that a node has not managed to find yet. + +- Peer-to-Peer Networking with Bittorrent, 2005, http://web.cs.ucla.edu/classes/cs217/05BitTorrent.pdf +- THE BITTORRENT P2P FILE-SHARING SYSTEM: MEASUREMENTS AND ANALYSIS https://pdos.csail.mit.edu/archive/6.824-2007/papers/pouwelse-btmeasure.pdf +- Do incentives build robustness in BitTorrent?, NSDI 2007, https://www.usenix.org/legacy/events/nsdi07/tech/piatek/piatek.pdf +- Modeling and performance analysis of BitTorrent-like peer-to-peer networks, ACM CCR 2004 http://conferences.sigcomm.org/sigcomm/2004/papers/p444-qiu1.pdf +- Analyzing and Improving a BitTorrent Network’s Performance Mechanisms, Infocom 2006, http://jmvidal.cse.sc.edu/library/bharambe06a.pdf + +*Coded Caching* + +There have been significant research efforts lately in the area of coded caching. The main concept has been proposed in 1960s in the form of error correction and targeted the area of content delivery over wireless, lossy channels. It has been known as Reed-Solomon error correction. Lately, with seminal works such as “Fundamental Limits of Caching”, Niesen et. al. have proposed the use of coding to improve caching performance. In a summary, the technique works as follows: if we have a file that consists of 10 chunks and we store all 10 chunks in the same or different memories/nodes, then we need to retrieve those exact 10 chunks in order to reconstruct the file. + +In contrast, according to the coded caching theory, before storing the 10 chunks we encode the file using erasure codes. This results in some number of chunks x>10, say 13, for the sake of illustration. This clearly results in more data produced after adding codes to the original data. However, when attempting to retrieve the original file, a user needs to collect *any 10 of those 13 chunks*. By doing so, the user will be able to reconstruct the original file, without needing to get all 13 chunks. Although such approach does not save bandwidth (we still need to reconstruct 10 chunks of equal size to the original one), it makes the network more resilient to nodes being unavailable. In other words, in order to reconstruct the original file without coding, all 10 of the original peers that store a file have to be online and ready to deliver the chunks, whereas in the coded caching case, any 10 out of the 13 peers need to be available and ready to provide the chunks. 
Blind replication of the original chunks will not provide the same benefit, as the number of peers will need to be much higher (at least 20 as compared to 13) in order to operate with the same satisfaction ratio. + +- Reed-Solomon Error Correction: https://en.wikipedia.org/wiki/Reed–Solomon_error_correction +- Maddah-Ali, M.A., Niesen, U. Fundamental limits of caching. IEEE Transactions on Information Theory,60(5):2856–2867, 2014 +- Fundamental Limits of Caching (slides): http://archive.dimacs.rutgers.edu/Workshops/Green/Slides/niesen.pdf +- Shanmugam, K., Golrezaei, N., Dimakis, A.G., Molisch, A.F., Caire, G. Femtocaching: Wire- +less content delivery through distributed caching helpers. IEEE Transactions on Information Theory, 59(12):8402–8413, 2013 +- Amiri, M.M., Gündüz, D. Fundamental limits of coded caching: Improved delivery rate-cache capacity +tradeoff. IEEE Transactions on Communications, 65(2):806–815, 2017 +- Decentralized Coded Caching Attains Order-Optimal Memory-Rate Tradeoff, IEEE Transactions on Networking, 2015, https://ieeexplore.ieee.org/document/6807823, https://arxiv.org/pdf/1301.5848.pdf +- Coded Caching With Nonuniform Demands, IEEE Transactions on Information Theory, 2017, https://ieeexplore.ieee.org/document/7782760 + + +### Known shortcomings of existing solutions > What are the limitations on those solutions? +The shortcomings of the current bitswap protocol have been discussed above in the State of the Art section. GraphSync is a very promising next step, which however, comes with its own challenges, also discussed above. + +The approaches that bittorrent has used need to be evaluated, but always keeping in mind that its architecture is different and based on a (logically) centralised tracker. + +Coded caching on the other hand incurs some overhead, which might not be ideal for several use-cases of the IPFS system (e.g., popular content), but very convenient for others (cold storage). + ## Solving this Open Problem ### What is the impact +Bitswap is a central component of the IPFS system and a protocol that generally influences the performance of IPFS as a whole. Designing and integrating smart algorithmic solutions to improve its performance can be instrumental for the performance and adoption of IPFS. It is worth highlighting that bitswap allows ample space for improvement, hence, several different solutions are encouraged and will be considered. Addressing sub-graphs of the DAG through IPLD as GraphSync is attempting to do in a secure manner is a first necessary step. + ### What defines a complete solution? > What hard constraints should it obey? Are there additional soft constraints that a solution would ideally obey? +First and foremost, any complete solution should account for extensibility as the IPFS system needs to scale up and more applications are implemented on top. The active number of users of IPFS is increasing exponentially and the requests submitted to the network are following accordingly. That said, a complete solution should account for those numbers. + +There are several desirable features discussed within IPFS that should be implemented in a complete solution for bitswap and its successor GraphSync - see list below in Existing Conversations/Threads section. Some of them target clean protocol extensions, while some others target higher-level measurement and UI/UX issues. 
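Returning to the coded-caching idea in the State of the Art section above (reconstructing a 10-chunk file from *any* 10 of 13 coded chunks): the sketch below demonstrates exactly that with an off-the-shelf erasure-coding library, `github.com/klauspost/reedsolomon`. The library choice and the 10+3 parameters are assumptions made for this illustration, not part of the problem statement.

```go
// Illustrative only: 10 data shards + 3 parity shards, as in the 10-vs-13
// example above.  Any 3 shards may be lost and the original data can still
// be rebuilt.  The library used here is an assumption for this sketch.
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/klauspost/reedsolomon"
)

func main() {
	const dataShards, parityShards = 10, 3

	enc, err := reedsolomon.New(dataShards, parityShards)
	if err != nil {
		log.Fatal(err)
	}

	// A toy "file" to be split into 10 equal-sized data shards.
	file := bytes.Repeat([]byte("ipfs-block-data-"), 1000)

	// Split returns 13 shards: 10 data shards plus 3 (still empty) parity shards.
	shards, err := enc.Split(file)
	if err != nil {
		log.Fatal(err)
	}
	// Encode fills in the 3 parity shards.
	if err := enc.Encode(shards); err != nil {
		log.Fatal(err)
	}

	// Simulate three peers being offline: drop any 3 of the 13 shards.
	shards[0], shards[5], shards[12] = nil, nil, nil

	// Any 10 remaining shards are enough to rebuild the missing ones.
	if err := enc.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}
	ok, err := enc.Verify(shards)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("reconstructed from 10 of 13 shards:", ok)
}
```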
+ ## Other ### Existing Conversations/Threads @@ -36,7 +117,12 @@ - [Advanced Pinning](https://github.com/ipfs/notes/issues/49) - [bandwidth estimators](https://github.com/ipfs/notes/issues/30) - [Alternative BitSwap strategies](https://github.com/ipfs/notes/issues/20) +- [2-Hop Bitswap](https://github.com/ipfs/notes/issues/386) +- [IPFS Spy](https://github.com/jimpick/go-ipfs/blob/jim/network-logging/README.ipfs-spy.md) - [Bitswap Simulator](https://github.com/heems/bssim) - [bsdash - dashboard for go-ipfs bitswap](https://www.npmjs.com/package/bsdash) +- Presentation of bitswap in Sept 2019 IPFS Weekly Call: https://www.youtube.com/watch?v=G_Q7iTpwYQU&list=PLuhRWgmPaHtSGRSHdU9dbsukHKlihZZAe&index=4 + + ### Extra notes From b80b4c0d3926c8127f98655aefe9e10274c7ff9f Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:03:15 +0200 Subject: [PATCH 05/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 32848b6..6928e3c 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -5,7 +5,7 @@ Once IPFS has resolved the CID to a peerID and further to the exact location of the node that stores the requested content, it then uses a very simple protocol called bitswap to exchange content between the requestor and the server. -Although bitswap is simple and generally works, *its performance is significantly suboptimal*. This is mainly due to the fact that a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG. The current operation of bitswap is also very often linked to duplicate transmission and receipt of content which overloads both the end nodes but also the network. +Although bitswap is simple and generally works, *its performance is suboptimal*. This is mainly due to the fact that a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG. The current operation of bitswap is also very often linked to duplicate transmission and receipt of content which overloads both the end nodes and the network. ## Long Description From d9287d7bfc6062a584eb502fba2163f6c0ca1797 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:04:08 +0200 Subject: [PATCH 06/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 6928e3c..baff700 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -9,7 +9,7 @@ Although bitswap is simple and generally works, *its performance is suboptimal*. ## Long Description -In order to synchronise a DAG between untrusted nodes, bitswap is exploiting the content-addressability feature of IPFS. This means that bitswap starts with the root hash of the DAG and once it has the data, it rehashes the data to verify that it results to the same hash. 
The node can now trust this block and hence, can continue with the rest of the blocks in the DAG (aka walk the DAG), until it gets the whole DAG and the data associated with it. +In order to synchronise a DAG between untrusted nodes, bitswap is exploiting the content-addressability feature of IPFS. This means that bitswap starts with the root hash of the DAG. Once it fetches the data, it verifies that its hash matches. The node can now trust this block and can thus continue with the rest of the blocks in the DAG (aka walk the DAG) until it gets the complete DAG and the data associated with it. In the current implementation of bitswap, requesting nodes send their WANT lists to all the peers they are directly connected to. This inevitably results in potentially duplicate traffic travelling back to the receiving node. When the receiving node has received the block(s) it asked for, it sends a CANCEL message to its peers to let them know that the block(s) is not in its WANT list anymore. From cf2b0194db9c0280bdb96748c482c7428e3f9f63 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:06:07 +0200 Subject: [PATCH 07/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index baff700..652cff4 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -11,7 +11,7 @@ Although bitswap is simple and generally works, *its performance is suboptimal*. In order to synchronise a DAG between untrusted nodes, bitswap is exploiting the content-addressability feature of IPFS. This means that bitswap starts with the root hash of the DAG. Once it fetches the data, it verifies that its hash matches. The node can now trust this block and can thus continue with the rest of the blocks in the DAG (aka walk the DAG) until it gets the complete DAG and the data associated with it. -In the current implementation of bitswap, requesting nodes send their WANT lists to all the peers they are directly connected to. This inevitably results in potentially duplicate traffic travelling back to the receiving node. When the receiving node has received the block(s) it asked for, it sends a CANCEL message to its peers to let them know that the block(s) is not in its WANT list anymore. +In the current implementation of bitswap, requesting nodes send their WANT lists to all the peers they are directly connected to. This results in potentially duplicate data being sent back to the receiving node. When the receiving node has received the block(s) it asked for, it sends a CANCEL message to its peers to let them know that the block(s) is not in its WANT list anymore. If none of the directly connected peers have any of the WANT list blocks, bitswap falls back to the DHT to find the requested content. Obviously, this results in long delays to get to the actual peer that stores the requested content. 
From 5f8aeb230fdf7743daf584051d9a2a89ea109b99 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:11:06 +0200 Subject: [PATCH 08/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 652cff4..12cc475 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -13,7 +13,7 @@ In order to synchronise a DAG between untrusted nodes, bitswap is exploiting the In the current implementation of bitswap, requesting nodes send their WANT lists to all the peers they are directly connected to. This results in potentially duplicate data being sent back to the receiving node. When the receiving node has received the block(s) it asked for, it sends a CANCEL message to its peers to let them know that the block(s) is not in its WANT list anymore. -If none of the directly connected peers have any of the WANT list blocks, bitswap falls back to the DHT to find the requested content. Obviously, this results in long delays to get to the actual peer that stores the requested content. +If none of the directly connected peers have any of the WANT list blocks, bitswap falls back to the DHT to find the requested content. This results in long delays to get to a peer that stores the requested content. Furthermore, once the recipient node starts receiving content from multiple peer nodes, it prunes down the long-latency peers and keeps the one to which the RTT is the shortest. Instead, current proposals within the IPFS ecosystem are considering optimising for throughput. Although this has its advantages, which one is the ideal to follow is not clear at this point. From 0500bf17f31ab426295edfe8eee18e93bd829082 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:13:17 +0200 Subject: [PATCH 09/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 12cc475..ae2c27a 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -15,7 +15,7 @@ In the current implementation of bitswap, requesting nodes send their WANT lists If none of the directly connected peers have any of the WANT list blocks, bitswap falls back to the DHT to find the requested content. This results in long delays to get to a peer that stores the requested content. -Furthermore, once the recipient node starts receiving content from multiple peer nodes, it prunes down the long-latency peers and keeps the one to which the RTT is the shortest. Instead, current proposals within the IPFS ecosystem are considering optimising for throughput. Although this has its advantages, which one is the ideal to follow is not clear at this point. +Once the recipient node starts receiving content from multiple peer nodes, it prunes down the long-latency peers and keeps the one to which the RTT is the shortest. Current proposals within the IPFS ecosystem are considering keeping the node with the highest throughput instead. It is not clear at this point which is the best approach. 
It is important to highlight that bitswap is a message-oriented protocol and _not_ a request-response protocol. That is, it sends messages including WANT lists and is then waiting for the blocks in the WANT list to be delivered back, as and when the discovery protocol manages to find the blocks in the WANT list. From cf51f318332f28232eeffdc4d0c23c7033c10096 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:14:23 +0200 Subject: [PATCH 10/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index ae2c27a..e33e4fa 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -42,7 +42,7 @@ The IPFS community has long been aware of the issues of bitswap and has experime - *Coding and Reed Solomon Codes.* This is a very efficient way to reduce storage space in general (trusted or untrusted) caching systems. The area of coded caching has seen significant development lately in the research community, hence, it is discussed further down. -- *GraphSync through IPLD selectors:* GraphSync is a specification of a protocol to synchronise graphs across peers. Graphsync answers the question of “how do we succinctly identify selectors connected to the root”. In doing that, it develops WANT lists of selectors instead of blocks and requests have selectors instead of hashes. Furthermore, in contrast to bitswap which is message-oriented, graph sync includes responses and hence, it is closer to a request-response protocol. If designed carefully, Graphsync can be very effective, but there are challenges that are still unresolved: i) it is more difficult to parallelise (see above), given that GraphSync operates in terms of requests and not blocks, ii) it can potentially be much easier to carry DDoS attacks with queries than with requests for blocks. +- *GraphSync through IPLD selectors:* GraphSync is a specification of a protocol to synchronise graphs across peers. Graphsync answers the question of “how do we succinctly identify selectors connected to the root”. In doing that, it develops WANT lists of selectors instead of blocks. Furthermore, in contrast to bitswap, which is message-oriented, GraphSync includes responses and is thus closer to a request-response protocol. If designed carefully, GraphSync can be very effective but there are still unresolved challenges: i) it is more difficult to parallelise (see above), given that GraphSync operates in terms of requests and not blocks, ii) it can potentially be much easier to carry DDoS attacks with queries than with requests for blocks. - *WANT Potential and HAVE message.* There have been several optimisation proposals, such as the implementation of a HAVE message before actually sending the requested block back. Although this immediately reduces duplicate traffic, it introduces one extra RTT, which might not be desirable in some cases, but might not have any negative impact in many others. 
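To illustrate what "requests carry selectors over the DAG rather than individual block hashes" looks like in practice, the sketch below builds the canonical "explore everything under the root" selector with the go-ipld-prime selector builder, which is what a GraphSync request would carry. Package paths and constructor names have shifted between releases, so treat the exact identifiers as illustrative rather than authoritative.

```go
// Sketch of the "ask for a whole subgraph in one request" idea behind
// GraphSync/IPLD selectors.  Assumes the go-ipld-prime selector builder;
// exact import paths and names vary across versions.
package main

import (
	"fmt"

	"github.com/ipld/go-ipld-prime/node/basicnode"
	"github.com/ipld/go-ipld-prime/traversal/selector"
	"github.com/ipld/go-ipld-prime/traversal/selector/builder"
)

func main() {
	ssb := builder.NewSelectorSpecBuilder(basicnode.Prototype.Any)

	// "Explore every field of every node, recursively, with no depth limit" --
	// i.e. the whole DAG under whatever root the request names.  One request
	// carrying this selector stands in for the many per-block round-trips
	// described earlier.
	all := ssb.ExploreRecursive(
		selector.RecursionLimitNone(),
		ssb.ExploreAll(ssb.ExploreRecursiveEdge()),
	).Node()

	fmt.Println("constructed an explore-all-recursively selector:", all != nil)
}
```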
From 9e1a51ba9e379258524ded25aa672fe218ce8206 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 16:18:34 +0200 Subject: [PATCH 11/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index e33e4fa..9ef3ac9 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -23,7 +23,7 @@ It is important to highlight that bitswap is a message-oriented protocol and _no This process of bitswap results in the following major problems: - **Many Roundtrips:** The most pressing issue is that requesting nodes have to make several roundtrips to walk the DAG, especially given that IPFS is a network run by untrusted nodes. -- **Duplicate Data:** as mentioned, bitswap is operating at the block level. In turn, this means that every block has to be requested from each peer that the node is connected to. This process results in duplicate traffic towards the data requestor unnecessary load for the peers and high overhead for the network. +- **Duplicate Data:** Bitswap is operating at the block level. In turn, this means that every block has to be requested from each peer that the node is connected to. This process results in duplicate traffic towards the data requestor, unnecessary load for the peers, and high overhead for the network. ## State of the Art From 9516af0995453f93d61c5a6906b671fcbcb0c262 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Thu, 7 Nov 2019 19:10:59 +0200 Subject: [PATCH 12/19] Update OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md Co-Authored-By: Jorge Soares --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 9ef3ac9..c0baa61 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -36,7 +36,7 @@ The IPFS community has long been aware of the issues of bitswap and has experime - *Parallelise the DAG walk.* Inherently, DAGs have one root node and then split into binary trees. Doing one walk for each of the branches results in the same number of round trips, but reduces the actual delay in half. Clearly, this approach does not apply for single-branch graphs, such as blockchains. -- *DAG Block Interconnection.* Although bitswap does not/cannot recognise any relationship between different blocks of the same DAG, a requesting node can artificially continuously ask for subsequent blocks of the same block the node that it found the previous block. This approach intuitively assumes that a node that had a block of a DAG is very likely to have subsequent ones. This is often referred to as “session” between the peers that have provided some part of the DAG. +- *DAG Block Interconnection.* Although bitswap does not/cannot recognise any relationship between different blocks of the same DAG, a requesting node can ask a node that provided a previous block for subsequent blocks of the same DAG. This approach intuitively assumes that a node that has a block of a DAG is very likely to have others. This is often referred to as “session” between the peers that have provided some part of the DAG. 
- *Latency vs Throughput.* Bitswap is currently sorting peers by latency, i.e., it is pruning down the connections that incur higher latency. It has been suggested that this is changed to maximise throughput (i.e., keep the pipe full). From bcdca81f80b67e4d1c2d4394eab57deb9b60f7ff Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Tue, 12 Nov 2019 11:51:40 +0000 Subject: [PATCH 13/19] Integration of Dirk's feedback --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 27 ++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index c0baa61..5f82ec7 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -5,7 +5,9 @@ Once IPFS has resolved the CID to a peerID and further to the exact location of the node that stores the requested content, it then uses a very simple protocol called bitswap to exchange content between the requestor and the server. -Although bitswap is simple and generally works, *its performance is suboptimal*. This is mainly due to the fact that a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG. The current operation of bitswap is also very often linked to duplicate transmission and receipt of content which overloads both the end nodes and the network. +In particular, IPFS asks Bitswap for the blocks corresponding to a set of CIDs. Upon this request, Bitswap sends a request for the CIDs to all of its directly connected peers. If none of the node's directly connected peers have one or more of the requested blocks, Bitswap falls back to querying the DHT for the root node of the DAG. That said, Bitswap also participates in the discovery phase and not only in the actual block exchange. + +Although bitswap is simple and generally works, we feel that *its performance can be substantially improved*. One of the main factors that hold performance back is the fact that a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG. The current operation of bitswap is also very often linked to duplicate transmission and receipt of content which overloads both the end nodes and the network. ## Long Description @@ -17,7 +19,7 @@ If none of the directly connected peers have any of the WANT list blocks, bitswa Once the recipient node starts receiving content from multiple peer nodes, it prunes down the long-latency peers and keeps the one to which the RTT is the shortest. Current proposals within the IPFS ecosystem are considering keeping the node with the highest throughput instead. It is not clear at this point which is the best approach. -It is important to highlight that bitswap is a message-oriented protocol and _not_ a request-response protocol. That is, it sends messages including WANT lists and is then waiting for the blocks in the WANT list to be delivered back, as and when the discovery protocol manages to find the blocks in the WANT list. +It is important to highlight that bitswap is a message-oriented protocol and _not_ a request-response protocol. That is, it sends messages including WANT lists and is then waiting for the blocks in the WANT list to be delivered back, as and when the discovery protocol manages to find the blocks in the WANT list. 
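A minimal sketch of this message-oriented flow — broadcast a WANT list, accept blocks as they happen to arrive, broadcast a CANCEL once a wanted block has been received — is shown below. The `Network` interface and all other types are hypothetical stand-ins for illustration, not the actual go-bitswap interfaces.

```go
// Minimal sketch of the message-oriented exchange described above: a node
// broadcasts a WANT list, blocks arrive asynchronously whenever some peer can
// supply them, and a CANCEL is broadcast once a wanted block has been
// received.  All types here are hypothetical stand-ins.
package main

import (
	"context"
	"fmt"
)

type CID string

type Block struct {
	CID  CID
	Data []byte
}

// Network abstracts "send a message to every directly connected peer" and
// "stream of blocks coming back from any of them".
type Network interface {
	BroadcastWant(ctx context.Context, wants []CID)
	BroadcastCancel(ctx context.Context, c CID)
	Incoming() <-chan Block
}

// Fetch sends one WANT list and then simply waits: there is no per-request
// response to pair up, only messages flowing in both directions.
func Fetch(ctx context.Context, net Network, wants []CID) (map[CID][]byte, error) {
	pending := make(map[CID]bool, len(wants))
	for _, c := range wants {
		pending[c] = true
	}
	got := make(map[CID][]byte, len(wants))

	net.BroadcastWant(ctx, wants)

	for len(pending) > 0 {
		select {
		case blk := <-net.Incoming():
			if !pending[blk.CID] {
				continue // duplicate or unrequested block: the wasted traffic the text mentions
			}
			got[blk.CID] = blk.Data
			delete(pending, blk.CID)
			// Tell everyone else to stop sending this block.
			net.BroadcastCancel(ctx, blk.CID)
		case <-ctx.Done():
			return got, ctx.Err()
		}
	}
	return got, nil
}

func main() {
	fmt.Println("Fetch: WANT broadcast, asynchronous block delivery, CANCEL on receipt")
}
```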
This process of bitswap results in the following major problems: @@ -25,6 +27,14 @@ This process of bitswap results in the following major problems: - **Many Roundtrips:** The most pressing issue is that requesting nodes have to make several roundtrips to walk the DAG, especially given that IPFS is a network run by untrusted nodes. - **Duplicate Data:** Bitswap is operating at the block level. In turn, this means that every block has to be requested from each peer that the node is connected to. This process results in duplicate traffic towards the data requestor, unnecessary load for the peers, and high overhead for the network. +In the latest release of Bitswap there is an optimisation of the above procedure that incorporates the concept of Sessions. Sessions are used to keep track of peers that have some of the blocks that the Session is interested in. In particular: + +- When IPFS asks Bitswap for the blocks corresponding to a set of CIDs, Bitswap: + - creates a new session + - starts a discovery phase to find the peers with a subset of the blocks. +- As peers are discovered they are remembered by the Session. Subsequent requests are sent only to those peers. +- If there is a timeout, the Session tries to discover more peers through the DHT. + ## State of the Art > This survey on the State of the Art is not by any means complete, however, it should provide a good entry point to learn what are the existing work. If you have something that is fundamentally missing, please consider submitting a PR to augment this survey. @@ -44,7 +54,17 @@ The IPFS community has long been aware of the issues of bitswap and has experime - *GraphSync through IPLD selectors:* GraphSync is a specification of a protocol to synchronise graphs across peers. Graphsync answers the question of “how do we succinctly identify selectors connected to the root”. In doing that, it develops WANT lists of selectors instead of blocks. Furthermore, in contrast to bitswap, which is message-oriented, GraphSync includes responses and is thus closer to a request-response protocol. If designed carefully, GraphSync can be very effective but there are still unresolved challenges: i) it is more difficult to parallelise (see above), given that GraphSync operates in terms of requests and not blocks, ii) it can potentially be much easier to carry DDoS attacks with queries than with requests for blocks. -- *WANT Potential and HAVE message.* There have been several optimisation proposals, such as the implementation of a HAVE message before actually sending the requested block back. Although this immediately reduces duplicate traffic, it introduces one extra RTT, which might not be desirable in some cases, but might not have any negative impact in many others. +- *WANT Potential and HAVE message.* There have been several optimisation proposals, such as the implementation of a HAVE message before actually sending the requested block back. Although this immediately reduces duplicate traffic, it introduces one extra RTT, which might not be desirable in some cases, but might not have any negative impact in many others. In particular, the new message types are: + + - HAVE: a node can indicate that it has a block, which helps with reducing duplication, particularly during the discovery phase. + - DONT_HAVE: a node can immediately indicate that it does not have a block. This is in contrast to the current implementation, where the requesting node has to wait for a timeout. 
+ +In the latest proof-of-concept implementation, the following message combinations are being implemented as optimisation steps: + + - WANT-BLOCK: *"send the block if you have it"*. This is an attempt to avoid the extra RTT. + - WANT-HAVE: *"let me know if you have the block"*. This is to avoid duplication and help disseminate knowledge of block distribution in the network. + +This way, the WANT-BLOCK messages are directed to the peers that are most likely to have those blocks. - Golang Implementation of Bitswap: https://github.com/ipfs/go-bitswap - GraphSync: https://github.com/ipld/specs/blob/master/block-layer/graphsync/graphsync.md @@ -53,6 +73,7 @@ The IPFS community has long been aware of the issues of bitswap and has experime - Bitswap Protocol Extensions: https://github.com/ipfs/go-bitswap/issues/186 - Reed-Solomon layer over IPFS: https://github.com/ipfs/notes/issues/196 - Bitswap Request Sharding: https://github.com/ipfs/go-bitswap/issues/167 +- Bitswap POC with HAVE / DONT_HAVE: ipfs/go-bitswap#189 ### Within the broad Research Ecosystem > How do people try to solve this problem? From b1d134dfe43f97fbb958783a6ac9eea2b17a1517 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Tue, 12 Nov 2019 12:44:02 +0000 Subject: [PATCH 14/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 5f82ec7..80fd0c7 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -123,7 +123,7 @@ Bitswap is a central component of the IPFS system and a protocol that generally ### What defines a complete solution? > What hard constraints should it obey? Are there additional soft constraints that a solution would ideally obey? -First and foremost, any complete solution should account for extensibility as the IPFS system needs to scale up and more applications are implemented on top. The active number of users of IPFS is increasing exponentially and the requests submitted to the network are following accordingly. That said, a complete solution should account for those numbers. +First and foremost, any complete solution should account for extensibility as the IPFS system needs to scale up and more applications are implemented on top. The active number of users of IPFS is increasing exponentially and the requests submitted to the network are following accordingly. That being said and acknowledging the fact that it is often difficult to find "one size fits all" solutions, tunable versions of bitswap that can dynamically adapt according to network and environment conditions is an avenue that needs to be explored. There are several desirable features discussed within IPFS that should be implemented in a complete solution for bitswap and its successor GraphSync - see list below in Existing Conversations/Threads section. Some of them target clean protocol extensions, while some others target higher-level measurement and UI/UX issues. 
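One of the protocol extensions discussed above — the HAVE / DONT_HAVE responses and the WANT-BLOCK / WANT-HAVE request types — can be summarised with a few illustrative Go types. The field and type names below are ours; the real go-bitswap wire format is a protobuf and differs in detail.

```go
// Simplified model of the extended message vocabulary discussed above
// (WANT-HAVE / WANT-BLOCK requests, HAVE / DONT_HAVE responses).  Names are
// illustrative, not the exact go-bitswap protobuf schema.
package main

import "fmt"

type CID string

// WantType distinguishes "send me the block" from "just tell me whether you
// have it".
type WantType int

const (
	WantBlock WantType = iota // "send the block if you have it"
	WantHave                  // "let me know if you have the block"
)

// WantlistEntry is one item a requestor puts on the wire.
type WantlistEntry struct {
	CID          CID
	Priority     int
	Type         WantType
	SendDontHave bool // ask for an explicit DONT_HAVE instead of silence + timeout
	Cancel       bool // retract a previous want
}

// BlockPresence is the lightweight answer a peer can give without shipping
// the block itself: one extra round-trip, but no duplicate block data.
type BlockPresence struct {
	CID  CID
	Have bool // true => HAVE, false => DONT_HAVE
}

// Message is what travels between two peers: wants in one direction, blocks
// and presence information in the other.
type Message struct {
	Wantlist  []WantlistEntry
	Blocks    map[CID][]byte
	Presences []BlockPresence
}

func main() {
	m := Message{
		Wantlist: []WantlistEntry{
			{CID: "cid-of-root", Type: WantBlock, Priority: 10, SendDontHave: true},
			{CID: "cid-of-leaf", Type: WantHave, Priority: 5, SendDontHave: true},
		},
	}
	fmt.Printf("outgoing message with %d wants\n", len(m.Wantlist))
}
```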
From 6fb63c435aaf8b5416e64cd605409788d87fe6d2 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Tue, 12 Nov 2019 12:48:03 +0000 Subject: [PATCH 15/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 80fd0c7..da5bd0b 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -73,7 +73,7 @@ This way, the WANT-BLOCK messages are directed to the peers that are most likely - Bitswap Protocol Extensions: https://github.com/ipfs/go-bitswap/issues/186 - Reed-Solomon layer over IPFS: https://github.com/ipfs/notes/issues/196 - Bitswap Request Sharding: https://github.com/ipfs/go-bitswap/issues/167 -- Bitswap POC with HAVE / DONT_HAVE: ipfs/go-bitswap#189 +- Bitswap POC with HAVE / DONT_HAVE: https://github.com/ipfs/go-bitswap#189 ### Within the broad Research Ecosystem > How do people try to solve this problem? From 7565ff64aa086e25a4ff1ae3742d21ac3777888f Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Tue, 12 Nov 2019 12:52:21 +0000 Subject: [PATCH 16/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index da5bd0b..78bd787 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -7,7 +7,7 @@ Once IPFS has resolved the CID to a peerID and further to the exact location of In particular, IPFS asks Bitswap for the blocks corresponding to a set of CIDs. Upon this request, Bitswap sends a request for the CIDs to all of its directly connected peers. If none of the node's directly connected peers have one or more of the requested blocks, Bitswap falls back to querying the DHT for the root node of the DAG. That said, Bitswap also participates in the discovery phase and not only in the actual block exchange. -Although bitswap is simple and generally works, we feel that *its performance can be substantially improved*. One of the main factors that hold performance back is the fact that a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG. The current operation of bitswap is also very often linked to duplicate transmission and receipt of content which overloads both the end nodes and the network. +Although bitswap is simple and generally works, we feel that *its performance can be substantially improved*. One of the main factors that hold performance back is the fact that *a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG*. The current operation of bitswap is also very often linked to *duplicate transmission and receipt of content which overloads both the end nodes and the network*. 
## Long Description From be3ab0b61855222bbfb9f50c6ae5d2407ad321de Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Tue, 12 Nov 2019 15:22:25 +0000 Subject: [PATCH 17/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 78bd787..b718d4a 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -147,3 +147,5 @@ There are several desirable features discussed within IPFS that should be implem ### Extra notes + +[qri.io](https://qri.io/): a data transfer mechanism using IPFS components to keep track of blocks in a DAG using Manifest files (similar to bittorrent magnet files) - https://github.com/qri-io/dag From b9e91b43c92758e53cf10f4d620ac53d078e3116 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Tue, 12 Nov 2019 15:34:48 +0000 Subject: [PATCH 18/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index b718d4a..07b2122 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -73,7 +73,8 @@ This way, the WANT-BLOCK messages are directed to the peers that are most likely - Bitswap Protocol Extensions: https://github.com/ipfs/go-bitswap/issues/186 - Reed-Solomon layer over IPFS: https://github.com/ipfs/notes/issues/196 - Bitswap Request Sharding: https://github.com/ipfs/go-bitswap/issues/167 -- Bitswap POC with HAVE / DONT_HAVE: https://github.com/ipfs/go-bitswap#189 +- Bitswap POC with HAVE / DONT_HAVE: https://github.com/ipfs/go-bitswap/pull/189 + ### Within the broad Research Ecosystem > How do people try to solve this problem? From 631cc94377f0e5ad2d65840b134cac54efc3d319 Mon Sep 17 00:00:00 2001 From: Yiannis Psaras <52073247+yiannisbot@users.noreply.github.com> Date: Wed, 20 Nov 2019 18:54:02 +0000 Subject: [PATCH 19/19] Update ENHANCED_BITSWAP_GRAPHSYNC.md --- OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md | 22 ++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md index 07b2122..0604d6d 100644 --- a/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md +++ b/OPEN_PROBLEMS/ENHANCED_BITSWAP_GRAPHSYNC.md @@ -3,13 +3,13 @@ ## Short Description > In one sentence or paragraph. -Once IPFS has resolved the CID to a peerID and further to the exact location of the node that stores the requested content, it then uses a very simple protocol called bitswap to exchange content between the requestor and the server. +Bitswap is a simple protocol and it generally works. However, we feel that *its performance can be substantially improved*. One of the main factors that hold performance back is the fact that *a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG*. The current operation of bitswap is also very often linked to *duplicate transmission and receipt of content which overloads both the end nodes and the network*. -In particular, IPFS asks Bitswap for the blocks corresponding to a set of CIDs. 
Upon this request, Bitswap sends a request for the CIDs to all of its directly connected peers. If none of the node's directly connected peers have one or more of the requested blocks, Bitswap falls back to querying the DHT for the root node of the DAG. That said, Bitswap also participates in the discovery phase and not only in the actual block exchange.
+## Long Description
 
-Although bitswap is simple and generally works, we feel that *its performance can be substantially improved*. One of the main factors that hold performance back is the fact that *a node cannot request a subgraph of the DAG and results in many round-trips in order to “walk down” the DAG*. The current operation of bitswap is also very often linked to *duplicate transmission and receipt of content which overloads both the end nodes and the network*.
+IPFS starts by resolving the CID to a peerID and further to the exact location of the node that stores the requested content. It then uses a very simple protocol called bitswap to exchange content between the requestor and the server.
 
-## Long Description
+In particular, IPFS asks Bitswap for the blocks corresponding to a set of CIDs. Upon this request, Bitswap sends a request for the CIDs to all of its directly connected peers. If none of the node's directly connected peers have one or more of the requested blocks, Bitswap falls back to querying the DHT for the root node of the DAG. That said, Bitswap also participates in the discovery phase and not only in the actual block exchange.
 
 In order to synchronise a DAG between untrusted nodes, bitswap is exploiting the content-addressability feature of IPFS. This means that bitswap starts with the root hash of the DAG. Once it fetches the data, it verifies that its hash matches. The node can now trust this block and can thus continue with the rest of the blocks in the DAG (aka walk the DAG) until it gets the complete DAG and the data associated with it.
 
@@ -17,7 +17,7 @@ In the current implementation of bitswap, requesting nodes send their WANT lists
 
 If none of the directly connected peers have any of the WANT list blocks, bitswap falls back to the DHT to find the requested content. This results in long delays to get to a peer that stores the requested content.
 
-Once the recipient node starts receiving content from multiple peer nodes, it prunes down the long-latency peers and keeps the one to which the RTT is the shortest. Current proposals within the IPFS ecosystem are considering keeping the node with the highest throughput instead. It is not clear at this point which is the best approach.
+Once the recipient node starts receiving content from multiple peer nodes, bitswap prioritises sending WANT lists to the peers with lower latencies. Current proposals within the IPFS ecosystem are considering prioritising sending the WANT lists to those nodes with the least amount of queued work, in the hope that this will result in higher throughput. It is not clear at this point which approach will lead to the highest performance benefit.
 
 It is important to highlight that bitswap is a message-oriented protocol and _not_ a request-response protocol. That is, it sends messages including WANT lists and is then waiting for the blocks in the WANT list to be delivered back, as and when the discovery protocol manages to find the blocks in the WANT list.
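The latency-versus-queued-work question raised above can be made concrete with a small peer-ordering sketch. The `Peer` fields and the two orderings below are hypothetical; go-bitswap's actual peer manager is considerably more involved.

```go
// Sketch of the trade-off discussed above: order candidate peers either by
// measured latency or by how much work is already queued for them.  Types and
// scoring are hypothetical, not the go-bitswap implementation.
package main

import (
	"fmt"
	"sort"
	"time"
)

type Peer struct {
	ID          string
	RTT         time.Duration // smoothed round-trip time to this peer
	QueuedWants int           // wants already sent to this peer and not yet answered
}

// byLatency favours the closest peers: sensible for narrow, deep DAGs
// (e.g. blockchains) where each level depends on the previous one.
func byLatency(peers []Peer) {
	sort.Slice(peers, func(i, j int) bool { return peers[i].RTT < peers[j].RTT })
}

// byQueuedWork favours the least-loaded peers: sensible for wide DAGs fetched
// in parallel, where keeping every peer's pipe full matters more than one RTT.
func byQueuedWork(peers []Peer) {
	sort.Slice(peers, func(i, j int) bool { return peers[i].QueuedWants < peers[j].QueuedWants })
}

func main() {
	peers := []Peer{
		{ID: "peerA", RTT: 40 * time.Millisecond, QueuedWants: 12},
		{ID: "peerB", RTT: 120 * time.Millisecond, QueuedWants: 1},
		{ID: "peerC", RTT: 75 * time.Millisecond, QueuedWants: 4},
	}

	byLatency(peers)
	fmt.Println("latency-first order:    ", peers[0].ID, peers[1].ID, peers[2].ID)

	byQueuedWork(peers)
	fmt.Println("queued-work-first order:", peers[0].ID, peers[1].ID, peers[2].ID)
}
```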
@@ -30,9 +30,9 @@ This process of bitswap results in the following major problems: In the latest release of Bitswap there is an optimisation of the above procedure that incorporates the concept of Sessions. Sessions are used to keep track of peers that have some of the blocks that the Session is interested in. In particular: - When IPFS asks Bitswap for the blocks corresponding to a set of CIDs, Bitswap: - - creates a new session + - creates a new sessions - starts a discovery phase to find the peers with a subset of the blocks. -- As peers are discovered they are remembered by the Session. Subsequent requests are sent only to those peers. +- As peers are discovered they are remembered by the Session. Bitswap _divides_ subsequent requests _between peers_ in a session. That is, it _does not_ ask every peer for every block, it asks a subset of the peers in the session (usually 1-2) for each block - If there is a timeout, the Session tries to discover more peers through the DHT. ## State of the Art @@ -48,7 +48,7 @@ The IPFS community has long been aware of the issues of bitswap and has experime - *DAG Block Interconnection.* Although bitswap does not/cannot recognise any relationship between different blocks of the same DAG, a requesting node can ask a node that provided a previous block for subsequent blocks of the same DAG. This approach intuitively assumes that a node that has a block of a DAG is very likely to have others. This is often referred to as “session” between the peers that have provided some part of the DAG. -- *Latency vs Throughput.* Bitswap is currently sorting peers by latency, i.e., it is pruning down the connections that incur higher latency. It has been suggested that this is changed to maximise throughput (i.e., keep the pipe full). +- *Latency & Throughput Optimisation.* Bitswap is currently sorting peers by latency, i.e., it is prioritising the connections that incur lower latency. Ideally, a hybrid approach needs to be implemented, where prioritisation based on latency applies to narrow but deep DAGs (e.g., blockchains), whereas prioritisation based on higher throughput applies to wide DAGs that are being traversed in parallel. - *Coding and Reed Solomon Codes.* This is a very efficient way to reduce storage space in general (trusted or untrusted) caching systems. The area of coded caching has seen significant development lately in the research community, hence, it is discussed further down. @@ -93,7 +93,11 @@ The bittorrent protocol has a different architectural design, which includes the There have been significant research efforts lately in the area of coded caching. The main concept has been proposed in 1960s in the form of error correction and targeted the area of content delivery over wireless, lossy channels. It has been known as Reed-Solomon error correction. Lately, with seminal works such as “Fundamental Limits of Caching”, Niesen et. al. have proposed the use of coding to improve caching performance. In a summary, the technique works as follows: if we have a file that consists of 10 chunks and we store all 10 chunks in the same or different memories/nodes, then we need to retrieve those exact 10 chunks in order to reconstruct the file. -In contrast, according to the coded caching theory, before storing the 10 chunks we encode the file using erasure codes. This results in some number of chunks x>10, say 13, for the sake of illustration. This clearly results in more data produced after adding codes to the original data. 
However, when attempting to retrieve the original file, a user needs to collect *any 10 of those 13 chunks*. By doing so, the user will be able to reconstruct the original file, without needing to get all 13 chunks.
+
+Although such an approach does not save bandwidth (we still need to reconstruct 10 chunks of equal size to the original one), it makes the network more resilient to nodes being unavailable. In other words, in order to reconstruct the original file without coding, all 10 of the original peers that store a file have to be online and ready to deliver the chunks, whereas in the coded caching case, any 10 out of the 13 peers need to be available and ready to provide the chunks. Possibly more importantly, coded caching can provide significant performance benefits in terms of load-balancing. In the case of large files, the upload link of nodes will be saturated for long periods of time. In contrast, delivering a few coded chunks and gathering the rest from other nodes will load-balance between nodes that store the requested content.
+
+Blind replication of the original object will not provide the same benefit, as the number of peers will need to be much higher (at least 20 as compared to 13) in order to operate with the same satisfaction ratio.
+
 - Reed-Solomon Error Correction: https://en.wikipedia.org/wiki/Reed–Solomon_error_correction
 - Maddah-Ali, M.A., Niesen, U. Fundamental limits of caching. IEEE Transactions on Information Theory, 60(5):2856–2867, 2014