fix!: keep ProtoNode deserialised state stable until explicit mutation #87

Merged
merged 1 commit into from
Aug 25, 2022


Member

@rvagg rvagg commented Aug 16, 2022

When decoding a badly serialised block with Links out of order, don't sort
the list until we receive an explicit mutation operation. This ensures stable
DAG traversal ordering based on the links as they appear in the serialised
form and removes surprise-sorting when performing certain operations that
wouldn't be expected to mutate.

The pre-v0.4.0 behaviour was to always sort, but this behaviour wasn't baked
in to the dag-pb spec and wasn't built into go-codec-dagpb which now forms the
backend of ProtoNode, although remnants of sorting remain in some operations.
Almost all CAR-from-DAG creation code in Go uses go-codec-dagpb and
go-ipld-prime's traversal engine. However this can result in a different order
when encountering badly encoded blocks (unsorted Links) where certain
intermediate operations are performed on the ProtoNode prior to obtaining the
Links() list (Links() itself doesn't sort, but e.g. RawData() does).

The included "TestLinkSorting/decode" test is the only case that passes without
this patch.

Ref: ipld/ipld#233
Ref: filecoin-project/boost#673
Ref: filecoin-project/boost#675


Further notes:

  • Yes, in an ideal world we could rewind a couple of years and bake sort-on-decode into the dag-pb spec and go-codec-dagpb, but that's not the case and our most important stable-DAG-traversals are happening in CARs now which are almost exclusively using the new stack.
  • Our JavaScript codecs don't sort-on-decode.
  • All old and new Go and JavaScript codecs in the ipfs and ipld orgs properly sort on encode. However, there now exist multiple implementations of dag-pb that do not follow this part of the spec, and we're now encountering them in places where stable-DAG-traversals really matter.
  • This is really a continuation of the ipld-in-ipfs work that led to v0.4.0; we're just fine-tuning a weird corner case.
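
As an illustration of the sort-on-encode rule mentioned in the notes above, here is a minimal sketch. The `link` and `linkSlice` types are hypothetical stand-ins (loosely modelled on `format.Link` and `LinkSlice`), not the go-merkledag types, and the sketch assumes links are ordered by comparing Name bytes, with a stable sort so that links sharing a name keep their relative order.

```go
package main

import (
	"fmt"
	"sort"
)

// link is a hypothetical, simplified stand-in for a dag-pb link.
type link struct {
	Name string
	Size uint64
}

// linkSlice implements sort.Interface, ordering links by Name.
// Go compares strings byte-wise, which is what this sketch relies on.
type linkSlice []link

func (ls linkSlice) Len() int           { return len(ls) }
func (ls linkSlice) Swap(a, b int)      { ls[a], ls[b] = ls[b], ls[a] }
func (ls linkSlice) Less(a, b int) bool { return ls[a].Name < ls[b].Name }

// sortedForEncode returns a sorted copy and leaves the input untouched.
func sortedForEncode(in []link) []link {
	out := make([]link, len(in))
	copy(out, in)
	sort.Stable(linkSlice(out))
	return out
}

// describe renders links as "name:size " pairs for easy comparison.
func describe(ls []link) string {
	s := ""
	for _, l := range ls {
		s += fmt.Sprintf("%s:%d ", l.Name, l.Size)
	}
	return s
}

func main() {
	in := []link{{"b", 1}, {"a", 2}, {"a", 3}}
	fmt.Println(describe(sortedForEncode(in)))
	fmt.Println(describe(in)) // the input keeps its original order
}
```

Sorting a copy rather than the original slice is the point of the sketch: the encoder gets a spec-ordered list while the in-memory order stays as decoded.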

Member Author

rvagg commented Aug 16, 2022

ipld/ipld#233 has spec updates that address this, and refer to a go-merkledag@v0.6.0 which should result from a release with this commit.

@rvagg rvagg self-assigned this Aug 16, 2022

@aschmahmann aschmahmann left a comment


I would also bubble this change up using a commit hash into kubo. There are a bunch of tests and history there such that if a test breaks it should link to what the original concern/issue was.

code archaeology tip: If you're curious why something is how it is you can use git blame and when you land on a commit for something that seems pretty old copy-paste the commit title (or hash) and search for it in kubo to find the related PR.

node.go Outdated
Comment on lines 388 to 415
 // Links returns a copy of the node's links.
 func (n *ProtoNode) Links() []*format.Link {
-	return n.links
+	links := make([]*format.Link, len(n.links))
+	copy(links, n.links)
+	return links
 }

@aschmahmann aschmahmann Aug 16, 2022


This seems more reasonable. Hopefully no one is relying on mutation here. Did you check for usages of this function, or are you mostly assuming folks haven't been doing anything nuts?

Member Author


No, I haven't checked very broadly, and I haven't seen anyone doing anything weird, but I don't want to expose the internal slice if we're taking so much care with the statefulness of sorting internally. I did have to do this in filecoin-project/boost#675 to deal with the inverse: go-merkledag was messing with it.
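
To make the concern in this thread concrete, here is a small sketch of why returning the internal slice leaks ordering state while returning a copy does not. The `node` and `link` types and the `linksShared`/`linksCopy` methods are hypothetical names for illustration, not the go-merkledag API.

```go
package main

import "fmt"

// link and node are hypothetical stand-ins for format.Link and ProtoNode.
type link struct{ Name string }

type node struct{ links []*link }

// linksShared hands out the internal slice, so callers can reorder it.
func (n *node) linksShared() []*link { return n.links }

// linksCopy hands out a fresh slice holding the same pointers.
func (n *node) linksCopy() []*link {
	out := make([]*link, len(n.links))
	copy(out, n.links)
	return out
}

// firstNameAfterSwap swaps the first two entries of the returned slice
// and reports which link the node itself now sees first.
func firstNameAfterSwap(useCopy bool) string {
	n := &node{links: []*link{{Name: "b"}, {Name: "a"}}}
	var ls []*link
	if useCopy {
		ls = n.linksCopy()
	} else {
		ls = n.linksShared()
	}
	ls[0], ls[1] = ls[1], ls[0]
	return n.links[0].Name
}

func main() {
	fmt.Println(firstNameAfterSwap(false)) // caller reordered the node's links
	fmt.Println(firstNameAfterSwap(true))  // node's order is unaffected
}
```

Note that the copy is shallow: the slice is new, but the pointed-to links are shared, so a caller can still modify the fields of an individual link.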

node.go Outdated
func (n *ProtoNode) AddRawLink(name string, l *format.Link) error {
	n.encoded = nil
	n.links = append(n.links, &format.Link{
		Name: name,
		Size: l.Size,
		Cid:  l.Cid,
	})

	sort.Stable(LinkSlice(n.links))


Are we concerned about performance hits here during node creation since we're sorting after every link addition?

Would it make more sense to basically mark the node as dirty and then sort + clear the dirty bit the first time anyone encodes or tries to inspect the links?

Member Author


No, I wasn't too concerned about performance, because this is mostly an insertion-into-sorted operation where we're just shuffling pointers down the slice after the insertion point. But I've added a new commit that adds a linksDirty state to help defer the costs here. Note that it's a bit subtle: we can't call the list dirty just because it came deserialized unsorted, since we want to capture that state. So we have to decide when to sort: linksDirty is set whenever you mutate the Links list, and we then sort whenever we perform an operation that needs the links to be in a sorted state (if they should be sorted!). But there are also operations where we have to re-sort regardless of linksDirty, like cloning the node or re-serializing, which brings up some additional questions that I've been meaning to add to the thread. Will comment separately on these.
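
The deferred-sort idea described in this reply can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the commit: `node`, `decodeNode`, `addLink`, and `linkNames` are hypothetical names, and only the `linksDirty` flag name comes from the discussion.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// link and node are hypothetical stand-ins for format.Link and ProtoNode.
type link struct{ Name string }

type node struct {
	links      []link
	linksDirty bool // set only by explicit mutation, never by decoding
}

// decodeNode keeps the links in the order they arrived in and leaves
// linksDirty false, even if that order is not sorted.
func decodeNode(names ...string) *node {
	n := &node{}
	for _, name := range names {
		n.links = append(n.links, link{Name: name})
	}
	return n
}

// addLink is an explicit mutation: append now, sort later.
func (n *node) addLink(name string) {
	n.links = append(n.links, link{Name: name})
	n.linksDirty = true
}

// linkNames sorts only if a mutation happened since the last sort.
func (n *node) linkNames() string {
	if n.linksDirty {
		sort.SliceStable(n.links, func(i, j int) bool {
			return n.links[i].Name < n.links[j].Name
		})
		n.linksDirty = false
	}
	names := make([]string, len(n.links))
	for i, l := range n.links {
		names[i] = l.Name
	}
	return strings.Join(names, ",")
}

func main() {
	n := decodeNode("b", "a")
	fmt.Println(n.linkNames()) // decoded order is preserved
	n.addLink("c")
	fmt.Println(n.linkNames()) // mutation triggers a sort on next read
}
```

Adding many links in a row costs one sort on the next read instead of one sort per addition, while a decoded but unmutated node is never reordered.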

@Jorropo Jorropo self-requested a review August 16, 2022 17:11
Member Author

rvagg commented Aug 17, 2022

There are some additional options I wanted to flag here, if anyone feels strongly enough to engage on them. IMO the current approach is the right one, but it's not the only one. Consider the main case we're dealing with here, a deserialized block that came with Links not properly sorted as per the spec:

  1. When we re-serialize the badly serialized node using this code (and all other dag-pb codecs in the ipfs and ipld orgs, past and present) we end up with a properly sorted Links list, and therefore entirely new bytes (and a new CID). There are some cases where this can happen in the normal flow of operating with a ProtoNode; one of them was demonstrated in filecoin-project/boost#675 ("fix: don't allow go-merkledag to reorder loaded links"), although that really is a bug, and I'll comment inline here about it. We have the option of preserving the badly serialized form and saying: if you want this re-serialized, you'll get back what came in. I don't think this is a good option; our codecs should behave, even with bad but acceptable data.
  2. There is a subtle case where you're operating with a node in memory that you've built, but not serialized. If you use that node as part of a traversal and it has a form in memory that it won't have after a round-trip through a codec, then you might get unexpected results. (One example of this is demonstrated in ipld/go-ipld-prime@47336e2.) What we do here is ensure that the in-memory form is always consistent with what will be serialized, unless it comes deserialized from a badly serialized form and hasn't been mutated. But we have the option of leaving it in the state it was constructed in until we serialize it. So, for example, if you added links in a random order to a ProtoNode you're building, and then used it as part of a traversal (somehow), you'd traverse it in the order you added them. We currently do this in go-ipld-prime, which we kind of have to, because until a Node is serialized it's not tied to a particular codec, and codecs all have different rules for sorting things (Eric often makes the argument that codecs shouldn't sort and that this should be a higher-level concern, but that's another topic). I think the appropriate choice here, given the history and usage of go-merkledag, is to just keep it consistent when you're building a ProtoNode so there are no surprises. It's always going to be dag-pb; there's no ambiguity except in the badly serialized form case.

Basically, sort your Links properly to avoid all of this.
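
To illustrate option 1 above, here is a toy sketch of why re-serializing an unsorted block produces different bytes and therefore a different hash. The `encode` function is a made-up stand-in for a dag-pb encoder (it just joins link names), and a SHA-256 digest stands in for the CID; both are assumptions for illustration only.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// encode is a toy encoder: it writes link names in the order given.
func encode(names []string) []byte {
	return []byte(strings.Join(names, "\n"))
}

// reencode mimics a well-behaved codec: it sorts a copy of the links
// before encoding, leaving the caller's slice untouched.
func reencode(names []string) []byte {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return encode(sorted)
}

// roundTripStable reports whether re-encoding reproduces the same
// digest as the bytes the block originally arrived with.
func roundTripStable(names []string) bool {
	return sha256.Sum256(encode(names)) == sha256.Sum256(reencode(names))
}

func main() {
	fmt.Println(roundTripStable([]string{"a", "b"})) // sorted input round-trips
	fmt.Println(roundTripStable([]string{"b", "a"})) // unsorted input does not
}
```

A producer can run an equivalent check, or simply sort before encoding, to make sure the blocks it emits survive a round-trip through a spec-compliant codec unchanged.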

@rvagg rvagg force-pushed the rvagg/stable-decoded-form branch from 5c480ad to 31cdb6f Compare August 17, 2022 07:01
@@ -132,7 +154,6 @@ func (n *ProtoNode) GetPBNode() *pb.PBNode {
 // EncodeProtobuf returns the encoded raw data version of a Node instance.
 // It may use a cached encoded version, unless the force flag is given.
 func (n *ProtoNode) EncodeProtobuf(force bool) ([]byte, error) {
-	sort.Stable(LinkSlice(n.links)) // keep links sorted
Member Author


Best to highlight the biggest culprit of the problems we're having right now: because links are re-sorted here before checking for a cached encoded form, just calling RawData() will re-sort the links even when we still have the original raw data.
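
The effect described in this comment can be reproduced with a small model. This is a sketch under assumptions: `node`, `rawDataOld`, and `rawDataNew` are hypothetical names that imitate the before and after behaviour; they are not the go-merkledag functions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// link and node are hypothetical stand-ins for format.Link and ProtoNode.
type link struct{ Name string }

type node struct {
	links   []link
	encoded []byte // bytes cached at decode time
}

// rawDataOld imitates the old behaviour: sort the links in place first,
// even though the cached bytes are returned unchanged afterwards.
func (n *node) rawDataOld() []byte {
	sort.SliceStable(n.links, func(i, j int) bool {
		return n.links[i].Name < n.links[j].Name
	})
	return n.encoded
}

// rawDataNew imitates the patched behaviour: return the cached bytes
// without touching the links.
func (n *node) rawDataNew() []byte { return n.encoded }

// orderAfter decodes a block with links "b","a", fetches its raw data,
// and reports the link order a traversal would then observe.
func orderAfter(old bool) string {
	n := &node{links: []link{{"b"}, {"a"}}, encoded: []byte("b,a")}
	if old {
		n.rawDataOld()
	} else {
		n.rawDataNew()
	}
	names := make([]string, len(n.links))
	for i, l := range n.links {
		names[i] = l.Name
	}
	return strings.Join(names, ",")
}

func main() {
	fmt.Println(orderAfter(true))  // read-only call silently reordered links
	fmt.Println(orderAfter(false)) // order matches the serialized form
}
```

In the old model, a call that looks read-only changes what a later traversal sees, which is the surprise this PR removes.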

rvagg added a commit to ipfs/kubo that referenced this pull request Aug 17, 2022
includes ipfs/go-merkledag#87 to keep deserialized
state of dag-pb blocks stable for stable traversals
Member Author

rvagg commented Aug 17, 2022

ipfs/kubo#9202 👍 A dag export sharness test with a DAG made of blocks with unsorted links might be a good idea at some point, but since that entire path should be using the go-ipld-prime stack, I don't think it's going to add much immediate value to this PR, especially since we're trying to migrate away from this. If it works as is now, with all tests and sharness passing, then this should be pretty good.

Contributor

@Jorropo Jorropo left a comment


I think it's good. TL;DR: can you please use slices.Clone for the slice copy?
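
For reference, a minimal sketch of the suggested helper. It uses the standard-library `slices` package, which is available from Go 1.21; at the time of this review the equivalent function lived in golang.org/x/exp/slices. The `cloneNames` and `cloneIsIndependent` helpers are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"slices"
)

// cloneNames returns a shallow copy, replacing the make+copy pattern.
func cloneNames(in []string) []string {
	return slices.Clone(in)
}

// cloneIsIndependent checks that writing to the clone leaves the
// original slice untouched.
func cloneIsIndependent() bool {
	orig := []string{"b", "a"}
	c := cloneNames(orig)
	c[0] = "z"
	return orig[0] == "b" && c[0] == "z"
}

func main() {
	fmt.Println(cloneIsIndependent())
}
```

Like make plus copy, slices.Clone is shallow: for a slice of pointers the clone holds the same pointers as the original.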

@rvagg rvagg force-pushed the rvagg/stable-decoded-form branch from 22e403f to 0a25b5f Compare August 22, 2022 06:07
Contributor

@Jorropo Jorropo left a comment


LGTM

@rvagg rvagg merged commit 48c7202 into master Aug 25, 2022
@rvagg rvagg deleted the rvagg/stable-decoded-form branch August 25, 2022 05:44
rvagg added a commit to ipld/ipld that referenced this pull request Aug 29, 2022
rvagg added a commit to ipfs/kubo that referenced this pull request Aug 29, 2022
includes ipfs/go-merkledag#87 to keep deserialized
state of dag-pb blocks stable for stable traversals
Jorropo pushed a commit to Jorropo/go-ipfs that referenced this pull request Dec 12, 2022
includes ipfs/go-merkledag#87 to keep deserialized
state of dag-pb blocks stable for stable traversals