Leverage Content Path Affinity in routing #10251
Labels
effort/weeks: Estimated to take multiple weeks
exp/expert: Having worked on the specific codebase is important
kind/enhancement: A net-new feature or improvement to an existing feature
P2: Medium. Good to have, but can wait until someone steps up
topic/routing: Topic routing
topic/sharding: Topic about Sharding (HAMT etc)
Problem
Right now, we support three values in `Reprovider.Strategy`, which tells the reprovider what should be announced. Valid strategies are `all`, `pinned`, and `roots`.

If the repository gets too big, `all` and `pinned` are too expensive, and folks are forced to use `roots`, which is codec-agnostic and will only announce the root block of a UnixFS DAG.

But `roots` comes with a big downside: a download that was interrupted partway through a file cannot be resumed, because the blocks inside the file were never announced.

This is not an inherent limitation of IPFS as a whole; it is only a limitation of how things are implemented in Kubo:
- `/ipfs/cid/sub/dir/file` is resolved first, into `/ipfs/file-CID`.
- Retrieval of `/ipfs/file-CID` starts.
- `/ipfs/cid`, `/ipfs/cid/sub`, and `/ipfs/cid/sub/dir` are already cached in the local store, so Kubo does no network lookup for providers of these. It will ask for providers of the first missing block within `/ipfs/file-CID`, and if these internal blocks are not announced (e.g. due to `Reprovider.Strategy` set to `roots`), Kubo won't be able to resume the download.

Solution ideas
- `/ipfs/CID` (direct block get)
- `/ipfs/cid/sub/dir/file` (resuming retrieval from the middle of the file)
- If no providers for `/ipfs/file-CID` can be found, look for providers of the parent entity (directory) CIDs (`dir`, `sub`, and finally `cid`), with each step growing the probability of finding one. Or we could always ask for the leaf and the most distant ones in parallel. It depends on whether we expect the majority of data to be announced as `roots` or `entities` (Improved Reprovider.Strategy for entity DAGs (HAMT/UnixFS dirs, big files) #8676 (comment)).