This repository has been archived by the owner on Apr 29, 2020. It is now read-only.

Open Problem: Enhanced Bitswap/GraphSync with more Network Smarts #9

Merged
merged 19 commits into master on Nov 26, 2019

Conversation

daviddias (Contributor)

No description provided.

@daviddias marked this pull request as ready for review on November 7, 2019 08:43
### What defines a complete solution?
> What hard constraints should it obey? Are there additional soft constraints that a solution would ideally obey?

First and foremost, any complete solution should account for extensibility, as the IPFS system needs to scale up while more applications are implemented on top of it. The number of active IPFS users is increasing exponentially and the requests submitted to the network are growing accordingly; a complete solution should therefore account for those numbers.
@jsoares (Contributor) commented on Nov 7, 2019

I'm not sure unbounded exponential scaling is a realistic goal. Would be good to put some order of magnitude here, especially given the reference to "those numbers".

@yiannisbot (Collaborator)

Good point, need to clarify.

@dirkmc

I would add that, ideally, IPFS should dynamically adapt to different environments, analogously to how TCP works both within a data center and across the broader internet.

yiannisbot and others added 8 commits November 7, 2019 16:03
Co-Authored-By: Jorge Soares <mail@jorgesoares.org>
@dirkmc left a comment

Overall LGTM 👍

I left a couple of comments with some more detail in case you need to incorporate that background info anywhere


@daviddias (Contributor, Author)

@yiannisbot can you take in @dirkmc's review before I do the final review for the merge? Thank you!

@yiannisbot (Collaborator)

Yup, it's on the to-do list for this week as I prepare the RFPs.

@yiannisbot (Collaborator)

> Overall LGTM 👍
>
> I left a couple of comments with some more detail in case you need to incorporate that background info anywhere

Thanks a lot @dirkmc! Very useful feedback. Most of it now integrated in the main text.

@dirkmc commented Nov 12, 2019

I'm not sure if we want to include it in this document, but I just want to make sure people are aware that the folks at qri.io have implemented a data transfer mechanism using some IPFS components that keeps track of blocks in a DAG using Manifest files, analogous to BitTorrent magnet files.

@yiannisbot (Collaborator)

> I'm not sure if we want to include it in this document, but I just want to make sure people are aware that the folks at qri.io have implemented a data transfer mechanism using some IPFS components that keeps track of blocks in a DAG using Manifest files, analogous to BitTorrent magnet files.

Added in the "Extra notes" section.


If none of the directly connected peers have any of the blocks on the WANT list, bitswap falls back to the DHT to find the requested content. This results in long delays before reaching a peer that stores the requested content.

Once the recipient node starts receiving content from multiple peer nodes, it prunes the long-latency peers and keeps the ones with the shortest RTT. Current proposals within the IPFS ecosystem are considering keeping the nodes with the highest throughput instead. It is not clear at this point which approach is best.

@Stebalien

Not exactly.

  • We currently prune to peers that have the content, then prioritize sending wants to peers with lower latencies. We still send wants to all peers (IIRC).
  • The plan is to change that second part to: prioritize sending wants to peers with the least amount of queued work (sketched below).
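
A minimal sketch of that planned behaviour, in Go, with hypothetical names (this is not the actual go-bitswap code): among the peers known to have a given block, the next want goes to the peer with the least queued work.

```go
package main

import "fmt"

// peer is a toy stand-in for a bitswap peer entry (hypothetical type).
type peer struct {
	id         string
	hasBlock   bool // whether the peer is known to have the wanted block
	queuedWork int  // e.g. outstanding wants / bytes already queued for this peer
}

// pickPeerForWant prunes to peers that have the content, then picks the one
// with the least amount of queued work.
func pickPeerForWant(peers []peer) (string, bool) {
	bestID, bestWork, found := "", 0, false
	for _, p := range peers {
		if !p.hasBlock {
			continue
		}
		if !found || p.queuedWork < bestWork {
			bestID, bestWork, found = p.id, p.queuedWork, true
		}
	}
	return bestID, found
}

func main() {
	peers := []peer{
		{id: "QmPeerA", hasBlock: true, queuedWork: 12},
		{id: "QmPeerB", hasBlock: false, queuedWork: 0},
		{id: "QmPeerC", hasBlock: true, queuedWork: 3},
	}
	if id, ok := pickPeerForWant(peers); ok {
		fmt.Println("send want to:", id) // prints "send want to: QmPeerC"
	}
}
```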

@daviddias (Contributor, Author)

@yiannisbot did you take @Stebalien's review in this comment?


- *DAG Block Interconnection.* Although bitswap does not/cannot recognise any relationship between different blocks of the same DAG, a requesting node can ask a node that provided a previous block for subsequent blocks of the same DAG. This approach intuitively assumes that a node that has one block of a DAG is very likely to have others. This is often referred to as a “session” between the peers that have provided some part of the DAG (see the sketch after this list).

- *Latency vs Throughput.* Bitswap currently sorts peers by latency, i.e., it prunes the connections that incur higher latency. It has been suggested that this be changed to maximise throughput (i.e., keep the pipe full).
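
As a rough illustration of the “session” idea above (invented names; not the actual go-bitswap session implementation), a session can simply remember which peers already delivered blocks of a DAG and ask the most productive ones first for subsequent blocks, falling back to the DHT only when none of them respond:

```go
package main

import (
	"fmt"
	"sort"
)

// Session tracks peers that have already provided blocks of a given DAG.
type Session struct {
	provided map[string]int // peer ID -> number of blocks delivered so far
}

func NewSession() *Session {
	return &Session{provided: make(map[string]int)}
}

// BlockReceived records that a peer delivered a block belonging to this DAG.
func (s *Session) BlockReceived(peerID string) {
	s.provided[peerID]++
}

// Candidates returns the peers to ask first for further blocks of the same DAG,
// most productive first; an empty result means falling back to the DHT.
func (s *Session) Candidates() []string {
	out := make([]string, 0, len(s.provided))
	for p := range s.provided {
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool { return s.provided[out[i]] > s.provided[out[j]] })
	return out
}

func main() {
	s := NewSession()
	s.BlockReceived("QmPeerA")
	s.BlockReceived("QmPeerA")
	s.BlockReceived("QmPeerB")
	fmt.Println(s.Candidates()) // [QmPeerA QmPeerB]
}
```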

@Stebalien

It's not really either/or. Really, we should:

  • Optimize for latency when traversing deep/narrow DAGs (e.g., a blockchain/path). Lower latency means we learn about the next node faster.
  • Optimize for throughput when traversing a wide DAG in parallel (a toy sketch follows).
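
A toy illustration of that distinction (hypothetical names, just to make it concrete): when only one branch of the traversal is outstanding, the next CID is learned only after the current block arrives, so latency dominates; when many branches are outstanding, the fetch is bandwidth-bound.

```go
package main

import "fmt"

// rankBy is a hypothetical peer-ranking criterion.
type rankBy int

const (
	rankByLatency    rankBy = iota // deep/narrow DAG: the next link is known only once this block arrives
	rankByThroughput               // wide DAG fetched in parallel: keep the pipe full
)

// chooseRanking picks a criterion from the shape of the current traversal.
func chooseRanking(outstandingBranches int) rankBy {
	if outstandingBranches <= 1 {
		return rankByLatency
	}
	return rankByThroughput
}

func main() {
	fmt.Println(chooseRanking(1) == rankByLatency)     // blockchain-/path-like traversal
	fmt.Println(chooseRanking(32) == rankByThroughput) // wide, parallel traversal
}
```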

@yiannisbot (Collaborator)

That's great! I was not aware this was the intention.


There have been significant research efforts lately in the area of coded caching. The main concept was proposed in the 1960s in the form of error correction, targeting content delivery over wireless, lossy channels, and is known as Reed-Solomon error correction. More recently, with seminal works such as “Fundamental Limits of Caching”, Niesen et al. have proposed the use of coding to improve caching performance. In summary, the technique works as follows: if a file consists of 10 chunks and we store all 10 chunks on the same or different memories/nodes, then we need to retrieve those exact 10 chunks in order to reconstruct the file.

In contrast, according to coded caching theory, before storing the 10 chunks we encode the file using erasure codes. This results in some number of chunks x > 10, say 13 for the sake of illustration, so adding the codes produces more data than the original. However, when attempting to retrieve the original file, a user only needs to collect *any 10 of those 13 chunks* to be able to reconstruct it, without needing to get all 13. Although such an approach does not save bandwidth (we still need to retrieve 10 chunks whose total size equals that of the original file), it makes the network more resilient to nodes being unavailable. In other words, without coding, all 10 of the original peers that store the file's chunks have to be online and ready to deliver them in order to reconstruct the file, whereas with coded caching, any 10 out of the 13 peers need to be available and ready to provide their chunks. Blind replication of the original chunks will not provide the same benefit, as the number of peers would need to be much higher (at least 20 compared to 13) in order to operate with the same satisfaction ratio.
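
To make the 10-of-13 example concrete, here is a small sketch using the third-party klauspost/reedsolomon Go library (the library choice, shard counts, and file contents are illustrative assumptions, not something the text above prescribes): the file is split into 10 data shards, 3 parity shards are computed, any 3 shards are then dropped, and the original is still reconstructed.

```go
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/klauspost/reedsolomon"
)

func main() {
	// 10 data shards + 3 parity shards: the "13 chunks, any 10 suffice" example.
	enc, err := reedsolomon.New(10, 3)
	if err != nil {
		log.Fatal(err)
	}

	original := bytes.Repeat([]byte("some file content "), 1000)

	// Split the file into 10 equally sized data shards, then compute the 3 parity shards.
	shards, err := enc.Split(original)
	if err != nil {
		log.Fatal(err)
	}
	if err := enc.Encode(shards); err != nil {
		log.Fatal(err)
	}

	// Simulate 3 of the 13 peers being unavailable: drop any 3 shards.
	shards[1], shards[6], shards[12] = nil, nil, nil

	// Any 10 remaining shards are enough to rebuild the missing ones...
	if err := enc.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}

	// ...and to reassemble the original file (padding is trimmed by passing the original length).
	var out bytes.Buffer
	if err := enc.Join(&out, shards, len(original)); err != nil {
		log.Fatal(err)
	}
	fmt.Println("reconstructed OK:", bytes.Equal(out.Bytes(), original))
}
```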

@Stebalien

IIRC, if I erasure encode my file, I can't reconstruct a part of the file without having the minimum number of chunks. Is this correct?

If so, I'm not sure if this buys us anything. Peers will likely either have or not have all the chunks necessary to reconstruct a file; having one chunk will be highly correlated with having the rest.

TL;DR: chunk deletion is not random. Yes, disks can fail but that should be handled at a lower layer.

@yiannisbot (Collaborator)

> IIRC, if I erasure encode my file, I can't reconstruct a part of the file without having the minimum number of chunks. Is this correct?

Yes, it is.

> If so, I'm not sure if this buys us anything. Peers will likely either have or not have all the chunks necessary to reconstruct a file; having one chunk will be highly correlated with having the rest.

That's correct for the case of small files. But in the case of very large files, coded caching provides nice load-balancing properties, i.e., you don't keep someone's uplink saturated for hours to get some GBs worth of data. The replication you would need in order to achieve equal load-balancing without coding would be much higher, therefore resulting in inefficient use of (storage) resources.
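
(For rough, purely illustrative numbers: a 100 GB file split into ten 10 GB chunks can be erasure-coded into thirteen coded chunks held by thirteen different peers; a requester then fetches any ten of them in parallel, so no single peer uploads more than 10 GB and up to three peers can be offline or saturated at any time. The point above is that matching this with plain replication would require at least two copies of each original chunk, i.e. twenty stored chunks rather than thirteen.)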

@yiannisbot (Collaborator) left a comment

I've addressed @Stebalien's comments and committed a new version.

@daviddias (Contributor, Author) left a comment

Some additional comments. This is looking really solid; we can merge it once the last comments are addressed.


### Extra notes

[qri.io](https://qri.io/): a data transfer mechanism using IPFS components to keep track of blocks in a DAG using Manifest files (similar to BitTorrent magnet files) - https://github.com/qri-io/dag
@daviddias (Contributor, Author)

This should be listed as "one of the experiments within the IPFS Ecosystem"; it is a tool that uses IPFS and its APIs for faster syncs.

@daviddias merged commit 2a78af6 into master on Nov 26, 2019