Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publish UnixFS specifications at specs.ipfs.tech #331

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

Jorropo
Copy link
Contributor

@Jorropo Jorropo commented Oct 10, 2022

This need some touch up, (Table of Content, fixtures, ...) but I would to gather feedback first.

cc #162 #316

Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Jorropo! I know how time-consuming spelunking code and writing specs is, this is effort is very appreciated ❤️

Did a first pass read and dropped some comments – mostly questions about things I did not know and trying to confirm I understood them right + flagging sections we should research or reorder / rephrase.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

- Node, Block
A node is a word from graph theory, this is the smallest unit present in the graph.
Due to how unixfs work, there is a 1 to 1 mapping between nodes and blocks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HAMT node will be backed by more than one block, maybe rephrase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't a HAMT a concatenation of directories ?

I don't see why you couldn't reuse lower parts of the HAMTs (*assuming you find colliding hashes).

UNIXFSv1.md Outdated
Comment on lines 307 to 315
####### Link ordering

The cannonical sorting order is lexicographical over the names.

In theory there is no reason an encoder couldn't use an other ordering, however this lose some of it's meaning when mapped into most file systems today (most file systems consider directories are unordered-key-value objects).

A decoder SHOULD if it can, preserve the order of the original files in however it consume thoses names.

However when some implementation decode, modify then reencode some, the orignal links order fully lose it's meaning. (given that there is no way to indicate which sorting was used originally)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rephrase this as clear MUST (always creating sorted data) and SHOULD (try parsing data in original order, if possible) for implementers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean, @rvagg fixed the go implementations after the linux2ipfs incident.

They support reading "unsorted" data, however will resort them lexicographically if you modify it.

UNIXFSv1.md Outdated

####### Path Resolution

Pop the left most component of the path, and try to match it to one of the Name in Links.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Pop the left most component of the path, and try to match it to one of the Name in Links.
Pop the left most component of the path after the current root, and try to match it to one of the Name in Links.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my internal representation the root have already been poped when downloading the root.

UNIXFSv1.md Outdated

Pop the left most component of the path, and try to match it to one of the Name in Links.

<!--TODO: check Kubo does this-->If you find a match you can then remember the CID. You MUST continue your search, however if you find a match again you MUST error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is search strategy left to implementers? I feel we should suggest something other than iterating over Links list that is "kinda expected to be sorted lexicographically, but may not be".

Knowing what Kubo and JS do will be useful.

UNIXFSv1.md Outdated
optional string Name = 2;

// cumulative size of target object
optional uint64 Tsize = 3; // also known as dagsize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is dagsize a thing, or just more friendly label created for these specs?
Perhaps replacing dagsize and DagSize with "Tsize (DAG size)" will be more clear?

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated
```

The two different schemas plays together and it is important to understand their different effect,
- `dag-pb` also named `PBNode` is the "outside" protobuf message, it is the first one you decode. It contain the list of links and some "opaque user data".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `dag-pb` also named `PBNode` is the "outside" protobuf message, it is the first one you decode. It contain the list of links and some "opaque user data".
- `dag-pb` also named `PBNode` is the "outside" protobuf message, it is the first one you decode. It contains the list of links and some "opaque user data".

Copy link
Contributor

@ElPaisano ElPaisano Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Piggybacking off of this, first bullet point suggestion:

- The `dag-pb` protobuf is the "outside" protobuf message; in other words, it is the first message decoded. This protobuf contains the list of links and some "opaque user data".

Also, as a noob reader, I wouldn't be clear what you mean by "opaque user data". Might be good to clarify this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'd suggest moving this callout dag-pb also named PBNode up to line 79, since that's the first dag-pb is mentioned.

UNIXFSv1.md Outdated
- Symlink
This represent a POSIX Symlink.<!--TODO: Add link to POSIX spec.-->

### Paths
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this paragraph explain how this maps onto the protobufs?

Copy link
Contributor Author

@Jorropo Jorropo Dec 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't. (I attempted to convey this because of the level 2 ## Paths)

This attempt to explain how you should read this/is/a/path as []string{"this", "is", "a", "path"}.

UNIXFSv1.md Outdated
// binary CID (with no multibase prefix) of the target object
optional bytes Hash = 1;

// UTF-8 string name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iiuc

Suggested change
// UTF-8 string name
// UTF-8 string name, used for pathing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, but is it really UTF-8 ? Some current mainstream implementations does not prevent users from using arbitrary bytes in their file names (as long as they don't contain 0x2f)

(see utf8 war in dag-cbor ...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this is copied straight from dag-pb spec is not an authoritative section, I don't think I should update this.

UNIXFSv1.md Outdated
}

message PBNode {
// refs to other objects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what links are used for in UnixFS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kinda what the whole dag-pb section bellow is dedicated to.
This is just some protobuf definition, not meant to self documenting code.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated
`node.Data.Data` is some bitfield, ones indicates weather or not the links are part of this HAMT or leaves of the HAMT.
The usage of this field is unknown given you can deduce the same information from the links names.

###### Path resolution on HAMTs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my very limited knowledge of HAMTs, I'm having trouble understanding the problem and how it is solved.

UNIXFSv1.md Outdated
- Implementations encoding or decoding wire-representations must observe the following:
- An `mtime` structure with `FractionalNanoseconds` outside of the on-wire range `[1, 999999999]` is **not** valid. This includes a fractional value of `0`. Implementations encountering such values should consider the entire enclosing metadata block malformed and abort processing the corresponding DAG.
- The `mtime` structure is optional - its absence implies `unspecified`, rather than `0`
- For ergonomic reasons a surface API of an encoder must allow fractional 0 as input, while at the same time must ensure it is stripped from the final structure before encoding, satisfying the above constraints.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you specifying the API here? This sounds like it makes in Go, but this might not apply to other languages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont i copied the metadata spec over, someone else already spécified that.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
Copy link
Contributor

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad this is moving along 🎉.

Spec documents where things unrelated to the spec such as implementation details, alternatives considered, etc. are intertwined with the spec such that they're hard to distinguish is painful. The UnixFS spec is confusing enough without these extra distractions, many of which came from the previous version of the spec, so let's drop them. If people want to keep them around then moving them somewhere separate (e.g. to an appendix) would be great.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

A so called "block limit" is in place, we do not allow any single block to be bigger than 2MiB.

Implementation SHOULD try to not emit 1MiB bigger blocks, but MUST decode blocks <= 2MiB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably not the right place to discuss this

Agreed, as mentioned above I don't think the UnixFS spec is the place to discuss the block limit. We can add an IPLD/block limit spec though and discuss there.

but 1 MB seems really small.

I have no preference for what the magic number is. However, every number is too small or too big for someone.

I've seen a crew of people who believe having a 100MB (or a non-existent) block limit would make data transfer in IPFS magically fast and that it's a major problem in IPFS data transfer today. This opinion is only valid if you're willing to concede that BitTorrent-v2 is slow (16kib chunks) and standard BitTorrent-v1 settings (256kib) are also slow. I have yet to meet someone who holds both views, but maybe I haven't talked to enough people 🤷.

Can talk about this in another issue. Interested parties may also want to read this thread https://discuss.ipfs.tech/t/supporting-large-ipld-blocks/15093.

cc @aschmahmann do you know where this come from 🙃

@Stebalien would probably know more, but I think the TLDR is people like round numbers and are bad at math.

Longer version: kubo (formerly go-ipfs) had a 1MiB max size for UnixFS chunking.... but you could use 1MiB chunks with the extraneous original protobuf wrapping which bumps it over the limit so you need a new limit. People like round numbers so 2MiB.

Generally speaking people will be bad at math (off by one errors, forgetting about protobuf wrapping, miscounting link sizes/names for directories, ....) if you tell people to go for 1MiB and they mess up and go over a bit things will be fine. If you tell people 2MiB and they go over the limit things get tricky. People will say things like "Bitswap enforce slightly bellow 4MiB" (https://github.com/ipfs/specs/pull/331/files#r1038743285) which might convince someone their 2.1MiB block is fine... but that's only go-bitswap today, but some other services (e.g. web3.storage) hard limit you at 2MiB blocks and go-bitswap could reasonably make that change as well.

For those who insist that the block limit is a critical performance issue, see above.


That being said if people feel the SHOULD is unnecessary and just want the MUST, sure 🤷.

SHOULD in https://www.ietf.org/rfc/rfc2119.txt

This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

That sounds close to accurate to me, given that if you choose to start using the full 2MiB limit without being careful to not exceed the limit you'll run into interoperability problems.

UNIXFSv1.md Outdated
optional uint64 filesize = 3;
repeated uint64 blocksizes = 4;
optional uint64 hashType = 5;
optional uint64 fanout = 6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the word fanout mentioned in the rest of this document

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not used in kubo (kubo does emit it but IMO that is chunker implementation details,
I think we should remove this field and add a comment // field 6 is reserved for backward compatibility and SHOULD NOT be emited by implementations):

$ rgrep -I fanout | grep "\.go:" | grep -v libp2p
vendor/github.com/ipfs/go-unixfsnode/data/unmarshal.go:			fanout, n := protowire.ConsumeVarint(remaining)
vendor/github.com/ipfs/go-unixfsnode/data/unmarshal.go:			qp.MapEntry(ma, Field__Fanout, qp.Int(int64(fanout)))
vendor/github.com/ipfs/go-unixfsnode/hamt/errors.go:	// ErrNoFanoutField indicates the HAMT node's UnixFS structure lacked a fanout field, which is required
vendor/github.com/ipfs/go-unixfs/unixfs.go:func HAMTShardData(data []byte, fanout uint64, hashType uint64) ([]byte, error) {
vendor/github.com/ipfs/go-unixfs/unixfs.go:	pbdata.Fanout = proto.Uint64(fanout)
vendor/github.com/ipfs/go-unixfs/unixfs.go:// Fanout gets fanout of format
vendor/github.com/ipfs/go-unixfs/pb/unixfs.pb.go:	Fanout               *uint64        `protobuf:"varint,6,opt,name=fanout" json:"fanout,omitempty"`

I know TSize is also never red however I still talk about it, but it's tricky and easy to get wrong and important to mention that it MUST NOT be used in offset computation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not used in kubo

You sure? https://github.com/ipfs/go-unixfs/blob/707110f05dac4309bdcf581450881fb00f5bc578/hamt/hamt.go#L147-L149

Also https://github.com/ipfs/go-unixfsnode/blob/475ed658c35e67af7793da5a9dc86d57bde24fe3/hamt/util.go#L73-L74


You have to be more careful in the areas of the spec you assume are unused. This one is used in existing unixfs implementations and could be discovered with some cursory looking around in those repos. Your grepping even identified some lines to look at so I'm not sure how you reached the conclusion that the lines were unused and should not be emitted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann ah mb, I should have done -iI not -I I'll check it out, I guess everything that I assume to be 256 wide in the HAMT are actually variable sized ... Great the hamt is more complex than I thought!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know TSize is also never red

I think you need to do more poking around at these areas of the spec where people are asking for clarification. This is incorrect as well. A bit of looking using the GitHub UI shows at the very least:

https://github.com/ipfs/kubo/blob/4d4841f41cdc2d797e87f7b62c230ee957513f94/core/commands/files.go

There may be more examples as well, but saying it's never read is incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann I don't see TSize nor DagSize nor Fanout in the last file you linked it's using Filesize.

Copy link
Contributor

@aschmahmann aschmahmann Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann I don't see TSize nor DagSize

@Jorropo https://github.com/ipfs/kubo/blob/4d4841f41cdc2d797e87f7b62c230ee957513f94/core/commands/files.go#L249

If you'd like to see it in action (commands using pwsh, but translate to whatever shell you want):

❯ ipfs name resolve /ipns/ipfs.io | %{ipfs files stat $_}
QmegA7HiEvLmyJgVcBxgZ2hjEp5YZ4aVxcjBdHcKvD2f73
Size: 0
CumulativeSize: 10776742
ChildBlocks: 16
Type: directory
❯ ipfs name resolve /ipns/ipfs.io | %{ipfs files stat $_/index.html}
Qmf5nTcgHNZ4jB29d4XN7JhPykZMpCGNQysEcXjtRYguwW
Size: 190713
CumulativeSize: 190727
ChildBlocks: 0
Type: file

You can take a look in go-merkledag (where dagpb nodes are defined) if you're trying to follow the code paths without using any tooling like an IDE or tracing execution paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, I can't use my IDE effectively because everything is an interface.

UNIXFSv1.md Outdated

### SHOULD NOT names

Thoses names SHOULD NOT<!--MUST NOT ? in future revisions--> be used:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a bad idea/overkill, but it seems like a lot of the characters that are unfriendly could be marked as SHOULD NOT even if some implementations will let many of them through.

There's a whole slew of bad characters/path components mentioned in ipfs/kubo#1710, but basically we might be doing people some favors by gathering that information so that implementers don't have to go figure it out themselves until they get pressed to by their users. For example, most implementations should probably just error on path components with newline characters until their users start asking for support with some non-troll dataset.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

A directory node is a named collection of file.

The minimum valid `PBNode.Data` field for a directory is (pseudo-json): `{"Type":"Directory"}`, other values are covered in Metadata.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, directories seem generally to have some sizing information in their PBNode.Data - maybe worth a SHOULD section of recommended Data for these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example CID where it's the case ?
AFAIT Kubo always just set PBNode.Data to unixfsEncode({"Type": "Directory"})

UNIXFSv1.md Outdated

Currently symlinks are not followable, that mean implementations needs to return symlinks objects and fail if a consumer tries to follow it through.

This is a SHOULD level, you probably wont break much things if you start following them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear how to me how the unixfs format rather than the consumer would be able to follow symlinks, since the path doesn't provide a CID destination and the destination context will not be reliably clear at the point of link decoding (e.g. across mount, etc.)

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

### SHOULD NOT names

Thoses names SHOULD NOT<!--MUST NOT ? in future revisions--> be used:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another consideration is to avoid using / in node names, as some tooling (gateway, fs) considers this as a directory separator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I was about to say that it's actually fine because we could use \/ or %2F but \/ actually don't work on tmpfs and ext4 on linux so let's not allow that one, nice catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is already covered with a MUST NOT before:

Components MUST NOT contain / unicode codepoints because else it would break the path into two components.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

They never have any childs, and thus are also known as single block files.

Their size (both `dagsize` and `blocksize`) is the length of the block body.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is previously refered to as TSize

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated
```

3. Profit
Assuming we stored this block in some implementation of our choice which makes it accessible to our client, we can try to decode it:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some implementation of our choice of what?
you might want to refer to kubo directly, as an IPFS implementation with a datastore that persists data locally, and implements UnixFS.

Copy link
Contributor Author

@Jorropo Jorropo Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't wanted to include poking into flatfs or creating a car file as part of the example and used Assuming we stored this block in some implementation of our choice as a magic exercise left to the reader sentence.

I can add echo -n "test" | ipfs block put if you want.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
Copy link
Contributor

@ElPaisano ElPaisano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some initial writing suggestions here, please lmk if it's helpful @Jorropo. I didn't get to everything. I can make another pass shortly. I know this is a work-in-progress.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

### IPLD `dag-pb`

A very important other spec for unixfs is the [`dag-pb`](https://ipld.io/specs/codecs/dag-pb/spec/) IPLD spec:
Copy link
Contributor

@ElPaisano ElPaisano Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phrasing suggestion:

The IPLD [`dag-pb`](https://ipld.io/specs/codecs/dag-pb/spec/) spec (also known as `PBNode`) is also used by unixfs, and is represented by the following protobuf:

I suggested moving the callout that dag-pb is also called PBNode up here, since it's the first time the reader encounters the term dab-pb in this doc

UNIXFSv1.md Outdated
}
```

The two different schemas plays together and it is important to understand their different effect,
Copy link
Contributor

@ElPaisano ElPaisano Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phrasing suggestion:

Each protobuf schema plays a different role in unixfs. These differences are described below:

UNIXFSv1.md Outdated
```

The two different schemas plays together and it is important to understand their different effect,
- `dag-pb` also named `PBNode` is the "outside" protobuf message, it is the first one you decode. It contain the list of links and some "opaque user data".
Copy link
Contributor

@ElPaisano ElPaisano Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Piggybacking off of this, first bullet point suggestion:

- The `dag-pb` protobuf is the "outside" protobuf message; in other words, it is the first message decoded. This protobuf contains the list of links and some "opaque user data".

Also, as a noob reader, I wouldn't be clear what you mean by "opaque user data". Might be good to clarify this

UNIXFSv1.md Outdated
```

The two different schemas plays together and it is important to understand their different effect,
- `dag-pb` also named `PBNode` is the "outside" protobuf message, it is the first one you decode. It contain the list of links and some "opaque user data".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'd suggest moving this callout dag-pb also named PBNode up to line 79, since that's the first dag-pb is mentioned.

UNIXFSv1.md Outdated
Comment on lines 178 to 186
They are always of type file.

They can be recognised because their CIDs have `Raw` codec.

The file content is purely the block body.

They never have any childs, and thus are also known as single block files.

Their size (both `dagsize` and `blocksize`) is the length of the block body.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: use bullet points, phrasing and grammar

- They are always of type `file`.
- Their CIDs have a `Raw` codec.
- The file content is the block body.
- They never have any children nodes, and thus are also known as single block files.
- Both the `dagsize` and `blocksize` fields specify the length of the block body.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Their CIDs have a Raw codec.

I don't agree with this sentence, here I understand Raw codecs in CIDs as a property of Raw nodes while it's an implication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the dagsize and blocksize fields specify the length of the block body.

I'm not native english I need a check on this, when I read this I understand this as the wrong way around, I understand that dagsize and blocksize → length of the block body.
While in reality it is dagsize and blocksize ← length of the block body.

UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated Show resolved Hide resolved
UNIXFSv1.md Outdated

####### The sister-lists `PBNode.Links` and `decodeMessage(PBNode.Data).blocksizes`

The sister-lists are the key point of why `dag-pb` is important for files.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts here:

  • format sister-list using italics since it's a term i.e.
_sister-list_
_Sister-lists_ are a key reason  why `dag-pb` is important for files. In the following example, the `PBNode.Links` and `PBNode.Data.blocksizes` slices are sisters. This means that they must have the same length and each map to the same entity at the same index.
In other words, instead of having two lists of properties, we have a single list of properties where some of the properties are stored in `PBNode.Links[n]`, and other properties of the same object are stored in PNode.Data.blocksizes[n]
type PBNode struct {
  Links []struct{
    tsize uint64
    hash cid.Cid
  }
  Data struct{
    blocksizes []uint64
  }
}

UNIXFSv1.md Outdated
Comment on lines 225 to 228
This allows us to concatenate smaller files together.

Linked files would be loaded recursively with the same process following a DFS (Depth-First-Search) order.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

_Siter-lists_ allow us to concatenate smaller files together.
Otherwise, linked files would be loaded recursively during concatenation following Depth-First-Search order.

I might not be understanding what you're trying to convey here, lmk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sister lists is relevant to ranging. the DFS is the default mode when you don't do ranging (such as while fetching the complete file or if your range cover a multiple blocks in the file)

UNIXFS.md Outdated
- JavaScript
- Data Formats - [unixfs](https://github.com/ipfs/js-ipfs-unixfs)
- Importer - [unixfs-importer](https://github.com/ipfs/js-ipfs-unixfs-importer)
- Exporter - [unixfs-exporter](https://github.com/ipfs/js-ipfs-unixfs-exporter)
Copy link
Member

@2color 2color Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add
https://github.com/ipld/js-unixfs

What about https://github.com/web3-storage/fast-unixfs-exporter

I just learned of these in the last couple of days and from what I understand they are actively maintained and used by DAGHouse

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/ipld/js-unixfs ✅ - yes we use this for encoding UnixFS DAGs now
https://github.com/web3-storage/fast-unixfs-exporter ❌ - temporary fork now deprecated

@BigLep
Copy link
Contributor

BigLep commented Jan 21, 2023

@ElPaisano : are you able to review, particularly for a grammar/organization regard?

UNIXFS.md Outdated
It MUST be murmur3-x64-64 (multihash `0x22`).
- `node.Data.Data` is some bitfield, ones indicates whether or not the links are part of this HAMT or leaves of the HAMT.
The usage of this field is unknown given you can deduce the same information from the links names.
- `node.Data.fanout` MUST be a power of two. This encode the number of hash permutations that will be used on each resolution step.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason someone would choose a power of 2 that's not a power of 4? Just thinking about encoding into hex, where there's 4 bits to a digit.

UNIXFS.md Outdated
Thoses nodes are also sometimes called sharded directories, they allow to split directories into many blocks when they are so big that they don't fit into one single block anymore.

- `node.Data.hashType` indicates a multihash function to use to digest path components used for sharding.
It MUST be murmur3-x64-64 (multihash `0x22`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have backward compatibility? It would seem the most prominent implementations are only taking the first 64 bits of the digest into account.
Maybe replace "lowest" in step 2 down on line 259 with a detailed description of how to pull out the correct bits from the middle of the 128-bit digest, in which case you could be backward-compatible with existing data and still allow the longer digest going forward (for large fanout and/or many, many, many entries).

@bumblefudge
Copy link
Collaborator

2023-10-17 maintainer conversation: this needs someone to comb through the comments. This is a nice-to-have before Instanbul and a must-have before end of year nucleation.

Invite me to the next maintainer meeting and I can maybe scrum it a tiny bit? feels like a PR that's blocking other also-important PRs

The field `Name` of an element of `PBNode.Links` for a HAMT starts with an
uppercase hex-encoded prefix, which is `log2(fanout)` bits wide.

##### Path Resolution
Copy link
Member

@lidel lidel Feb 15, 2024

Comment on lines 55 to 56
Thus, the block must be retrieved; that is, the bytes which ,when hashed using the
hash function specified in the multihash, gives us the same multihash value back.
Copy link

@gammazero gammazero Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Thus, the block must be retrieved; that is, the bytes which ,when hashed using the
hash function specified in the multihash, gives us the same multihash value back.
Thus, when a block is retrieved and its bytes are hashed using the
hash function specified in the multihash, this gives the same multihash value contained in the CID.

optional uint64 hashType = 5;
optional uint64 fanout = 6;
optional uint32 mode = 7;
optional UnixTime mtime = 8;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this compatible with proto3 which removes optional and required

Comment on lines 182 to 185
This allows for fast indexing into the file. For example, if someone is trying
to read bytes 25 to 35, we can compute an offset list by summing all previous
indexes in `blocksizes`, then do a search to find which indexes contain the
range we are interested in.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the example be made more clear by talking about reading 10 bytes starting at byte-offset 25? In "reading bytes 25 to 35", it might not be clear if byte 35 is included or not (ordinal vs offset)?

What is described is close enough to an implementation that people may attempt to implement as per the description, which is probably not what they really want. I think there should be a more precise algorithm description or a more general summarization. Choose one:

Here is a more precise algorithm description:

This allows for fast indexing into the file. For example, if trying to read 10 bytes starting at
byte-offset 25, iterate the `blocksizes` list until the sum of elements is greater the starting
byte-offset. The index of the current element is that of the first block. Continue iterating
and summing `blocksizes` until the sum is greater than or equal to byte-offset + size. The
index of the current element is that of the last block, which may be the same as the first
block if the byte range is contained within one block. The first block and all blocks up to and
including the last block are the blocks that need to be fetched.

Example implementation [here](https://go.dev/play/p/hHy0xy9j2Qz)

Here is a more general description:

This allows for fast indexing into the file. When trying to read some range of bytes starting at a
specified byte-offset, a cumulative sum can be calculated by iterating the block sizes to determine
which blocks are within the specified byte range.

indexes in `blocksizes`, then do a search to find which indexes contain the
range we are interested in.

In the example above, the offset list would be `[0, 20]`. Thus, we know we only need to download `Qmbar` to get the range we are interested in.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not refer to an "offset list" as that seems like a suggestion of something that needs to be constructed. We really on want to end up with an index list, or just the list of blocks to download.

Suggested change
In the example above, the offset list would be `[0, 20]`. Thus, we know we only need to download `Qmbar` to get the range we are interested in.
In the example above, if trying to read 10 bytes starting at byte-offset 25, the `blocksizes` sum at index 0 is 20, sum at index 1 is 50. So, the byte ranges starts within index 1 (50>25) and ends within index 1 (50>=35). This means we only need `Links[1]`, and only need to download `Qmbar` to get the range we are interested in.

Comment on lines 200 to 201
This field makes sense only in :ref[Directories] contexts and MUST be absent
when creating a new file. For historical reasons, implementations parsing
Copy link

@gammazero gammazero Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that a one large file spans multiple blocks? In the "sister-lists" section above it said that a file could be the concatenation of multiple files. Doesn't that mean it is a single file that spans multiple blocks? If that is the case, then is seems there could be a file, that is not a directory, that has multiple links. Or, does it mean that a large file that spans multiple blocks is transformed into a directory that holds the pieces of that file. I think some clarification would be helpful, or if this is stated make it stand out more.

- Any string containing a `NULL` (`0x00`) byte, as this is often used to signify string
terminations in some systems, such as C-compatible systems. Many unix
file systems do not accept this character in path components.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want to restrict certain characters that may act as display control characters in some shells, such as backspace? While it would be somewhat of a departure from POSIX, it could help eliminate a class of security issues where portions of a path are effectively hidden when displayed in a shell.

metadata becomes part of the content.
2. Performance may be impacted as well as if we don't inline UnixFS root nodes
into [CID]s, so additional fetches will be required to load a given UnixFS entry.

Copy link

@gammazero gammazero Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has "Metadata in the File" been rejected? Seems like downside 1 is significant.

Also, it would be more expected IMO if CID calculation was similar to file hash calculation, in that the same hash of the file data is calculated regardless of the file's more or mtime.

Comment on lines +24 to +28
- name: Peter Rabbitson
github: ribasushi
affiliation:
name: Protocol Labs
url: https://protocol.ai/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this stage I would prefer to be removed. I no longer agree with a lot of what is here, especially the 1.5 update (compared to ~4 years ago when we started these)

Also the affiliation part should be removed from all lines.

Suggested change
- name: Peter Rabbitson
github: ribasushi
affiliation:
name: Protocol Labs
url: https://protocol.ai/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏃 In Progress
Status: 🔎 In Review
Development

Successfully merging this pull request may close these issues.