-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Analyzing the DAG reader #5192
Comments
Forgot to mention that |
I'm unclear in what case that |
The DAG reader has been completely reworked in ipfs/go-unixfs#60. |
It is comprised of the group of structures in the
unixfs.io
package capable of doing a DFS graph traversal of a file DAG. It reads (and performs similar operations) in the same sequence the DAG was built, so theimporter
is a good reference (once #5118 gets merged) to better understand it. I leave some notes here on what could be improved to make it more accessible to the new reader.The recursive approach is hidden behind the
PBDagReader
(parent) and itsReadSeekCloser
(child) attribute. The first doesn't call the second, instead it interacts with the interface which the second implements. The interface, although it provides a nice generalization, it obscures the very simple algorithm to traverse a file DAG that only has internalProtoNode
s and leafRawNode
s (represented at the UnixFS layer asFile
andRaw
types). This could be made more explicit to leverage the concepts that the reader could already have of what a file DAG is (e.g., if it's already familiarized with theimporter
). Could an iterative algorithm help clarify how the reader operates? Instead of the implicit call stack with thePBDagReader
s at different depths of the tree (making up the path to the current leaf being read) we could have an explicit stack with the information of each edge of the path at the different depths (this stack would pretty much be the reader itself with a few other attributes of control).Node promises. Many nodes are being requested at the same time (preloaded) to streamline the graph traversal, this mechanism is abstracted through the
ipld.NodePromise
structure and it should be made very clear to the reader what a promise is, why is it used (see #5162 (comment)). The name of thepromises
attribute of thePBDagReader
should be changed to reflect that those are actually the child nodes (that will be visited in the future), to focus on what they represent instead of how those nodes are fetched (through promises).The direct manipulation of the protobuf
unixfs.pb.Data
structure inPBDagReader.pbdata
should be replaced with the more refinedunixfs.FSNode
structure (which containsunixfs.pb.Data
).Remove unused
node
attribute (*mdag.ProtoNode
) of thePBDagReader
.As said before the role of the buffer (
buf
attribute) may be misleading, it should be clear that this is the child node one level below being accessed by anotherPBDagReader
(orBufDagReader
in case we've reached the leaf). Using theReadSeekCloser
interface instead ofDagReader
(which contains it) is correct but may be confusing if not clarified well enough. (Revisit this point.)The
WriteTo
method has almost an identical logic asCtxReadFull
but its code is structured differently (especially the error handling part) which makes it harder to assimilate what it does (or more precisely, how it does it).Read
andSeek
(fromio.SeekStart
) do very similar traversal operations, the first one counting sizes and the second reading file content (actually both are performing reading operations on UnixFS nodes), could that part of the code be merged? Maybe a new structure could be introduced (at theunixfs
level) that would handle how to traverse a DAG, this structure could receive a method of what to do in each node it visits during the DFS (something likeDFSTraversalFileDAG
orDFSFileDAG
).The name
io
of the package that contains it is a bit misleading, the DAG modifier that would normally be also considered as standard I/O has its ownmod
package, maybe the DAG reader could be moved to a separatereader
package and rethink what theio
package should contain (if anything).The text was updated successfully, but these errors were encountered: