
Don't abort block processing when encountering SkipMe #251

Merged: 9 commits into master from rvagg/skipme, Sep 30, 2021
Conversation


@rvagg rvagg commented Sep 14, 2021

Loading a link from within a block and getting a SkipMe back from the loader
will return out of the node iteration. If using SkipMe to deal with block
load memoisation, this can lead to incomplete traversal by avoiding any
potential non-skippable links that follow a skipped one.

(still needs a test or two)
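For a bit of context, here's a minimal sketch of the kind of loader this is aimed at. It isn't the test code from this PR, and the in-memory `blocks` map is just a stand-in for whatever block store an application actually uses:

```go
package example

import (
	"bytes"
	"fmt"
	"io"

	"github.com/ipld/go-ipld-prime"
	cidlink "github.com/ipld/go-ipld-prime/linking/cid"
	"github.com/ipld/go-ipld-prime/traversal"
)

// memoisingLinkSystem returns a LinkSystem whose read opener loads each link
// at most once; repeat encounters return traversal.SkipMe, which (with this
// PR) skips just that link rather than aborting iteration of the whole node.
func memoisingLinkSystem(blocks map[string][]byte) ipld.LinkSystem {
	seen := map[string]struct{}{}
	lsys := cidlink.DefaultLinkSystem()
	lsys.StorageReadOpener = func(_ ipld.LinkContext, lnk ipld.Link) (io.Reader, error) {
		key := lnk.String()
		if _, ok := seen[key]; ok {
			return nil, traversal.SkipMe{} // already visited: skip, don't abort
		}
		seen[key] = struct{}{}
		data, ok := blocks[key]
		if !ok {
			return nil, fmt.Errorf("block not found: %s", key)
		}
		return bytes.NewReader(data), nil
	}
	return lsys
}
```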

@warpfork warpfork left a comment


Yep, that looks right


rvagg commented Sep 16, 2021

@warpfork I'm cringing at how much this is going to grind your gears, but I've pushed a couple more commits, including tests for SkipMe. Along the way I found a codec-related inconsistency that I needed to fix in order to build a more complex DAG to walk.

The fixtures from focus_test.go use blocks that have been encode()d with dag-json, so they have the lexical sorting applied. But rootNode is used in its original assembled form in most places, even though we encode() it and use the link that results from that. So when I try to use rootNodeLnk in a new parent Node, I get inconsistent results walking through its contents compared to walking through the contents of rootNode directly.

There are a few ways to deal with this, obviously; what I went for was:

  1. do a full round-trip with encode() to ensure stability (sketched below)
  2. do it with a custom JSON encoder that removes the lexical sorting, so we don't have to touch many of the existing tests, and the field ordering, where it's checked, matches the ordering used during construction
  3. rework the newish chain walk tests (which were added after the dag-json lexical sorting fix) to have more explicit visit order checking

Another option was to just go with dag-json, rewrite the order-dependent assertions, and add some docs. I can back up and do that if you think that's preferable; the custom encoder is already pretty weird, so maybe it'll be less weird to just fix up the ordering and explain why the fields are not in the original construction order.
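For illustration, the round-trip in option 1 amounts to something like the sketch below. It's not the code from this PR, and `basicnode.Prototype.Any` is just the obvious prototype choice for these fixtures:

```go
package example

import (
	"bytes"

	"github.com/ipld/go-ipld-prime"
	"github.com/ipld/go-ipld-prime/codec/dagjson"
	"github.com/ipld/go-ipld-prime/node/basicnode"
)

// roundTrip pushes a node through dag-json encode/decode so that the
// in-memory form picks up the same lexically sorted map ordering as the
// block that its link actually points at.
func roundTrip(n ipld.Node) (ipld.Node, error) {
	var buf bytes.Buffer
	if err := dagjson.Encode(n, &buf); err != nil {
		return nil, err
	}
	nb := basicnode.Prototype.Any.NewBuilder()
	if err := dagjson.Decode(nb, &buf); err != nil {
		return nil, err
	}
	return nb.Build(), nil
}
```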

Added walk tests for a few different selectors, including the manually constructed one used commonly in Filecoin and a few other places.
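For reference, that manually constructed selector is the "explore everything, recursively" one; built with the selector builder it looks roughly like this (a sketch, the exact fixture used in the test may differ):

```go
package example

import (
	"github.com/ipld/go-ipld-prime/node/basicnode"
	"github.com/ipld/go-ipld-prime/traversal/selector"
	"github.com/ipld/go-ipld-prime/traversal/selector/builder"
)

// exploreAllRecursively builds the "walk the whole DAG" selector commonly
// constructed by hand in Filecoin: recurse without limit, explore every
// field/element, and re-enter the recursion at each link boundary.
func exploreAllRecursively() (selector.Selector, error) {
	ssb := builder.NewSelectorSpecBuilder(basicnode.Prototype.Any)
	spec := ssb.ExploreRecursive(
		selector.RecursionLimitNone(),
		ssb.ExploreAll(ssb.ExploreRecursiveEdge()),
	)
	return spec.Selector()
}
```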

@rvagg rvagg marked this pull request as ready for review September 16, 2021 08:33

warpfork commented Sep 18, 2021

Yeah, this does grind my gears. Sorry. Can't bring myself to beat around the bush on this one.

We should not delete tests that test that order is maintained in the data model. The reasoning for this is straightforward:

  1. Selectors should work over the Data Model; and, as another way of saying the same thing, they should work over all codecs.
  2. Not all codecs, and certainly not the Data Model itself, have this sorted-order idea.
  3. Therefore it would be incorrect to delete tests just because they aren't representable in, for example, round-tripped dag-json (which is an "incomplete" codec, in that it does not successfully round-trip the total set of information the Data Model can describe).

This means I don't think we should do the encoding round-tripping, either. That would effectively mean we're not exercising some otherwise very-reachable behavior, and that would be a mistake.

(This is a problem we could've continued to just not have. Y'all wanted to walk face-first into it... ┐_(ツ)_┌ This is the kind of stuff we have to deal with, all the dang time now, if dag-json is going to be incomplete on ordering grounds. We'll probably eventually want to make another codec that doesn't have that problem, to avoid the sheer irritation of it all, I suspect. Meanwhile, we also have to support all this stuff unambiguously, if only for the sheer reason that we still decode un-sorted dag-json data in the wild without complaint. The worst of all worlds! Who could've possibly seen this coming!)


rvagg commented Sep 20, 2021

k, I've reverted the special codec and encode round-trip, leaving testing of a non-encoded node intact.

I accept your disgruntlement with the codec shuffling, but this is also the world of codecs we swim in. It's going to be the case with all current codecs we deal with, especially the ones with wide real-world use: dag-pb, git, bitcoin, eth, etc. They all have special rules about how the bytes should be laid out, some stricter than others, so creating Node forms of them without passing through an encoder is going to hit this same ordering mismatch. It's just a reality we have to accept and make sure we (a) test for and (b) educate users about in cases like traversal, where the rubber hits the road in surprising ways.

One of the things I experimented with in JS, for dag-pb at least, was a prepare() function that would take your in-memory form and ensure that it matches what it'll look like after a round-trip, so if that's important then you have the ability to do it.

dag-jose has a really weird version of this, where it'll accept a "compact" form, which is just a single colon-delimited string that can be expanded out into the full form. The codec takes both the compact and the full form but only encodes the full form, so a round-trip can get really surprising in that case.

This is the world of messy data and opinionated layers.

If you want a codec that preserves order then we should just do that. Or, just lean on lists to make the problem disappear entirely; listpairs is quite a nice abstraction for this.


rvagg commented Sep 29, 2021

Includes a working version of #252, with tests for it that double up on what I already have for SkipMe.

rvagg and others added 9 commits September 30, 2021 16:33
Loading a link from within a block and getting a SkipMe back from the loader
will `return` out of the node iteration. If using SkipMe to deal with block
load memoisation, this can lead to incomplete traversal by avoiding any
potential non-skippable links that follow a skipped one.
dag-json has lexical sorting of map fields, so encode() here will be reordering
some of the fields. Since we are doing a traversal from rootNode for most tests
and it doesn't get loaded (since it's the start Node), it doesn't get the same
reordering treatment as the other blocks.

So, we round-trip all blocks through the codec to ensure that the Node that
we have matches the Link and is what it would be regardless of where it comes
from.

Additionally, we're using a custom codec here for the tests that _removes_ the
map sorting and leaves the blocks unmolested, so we don't have to rewrite all
of the parts of the tests where ordering matters, and the fields match up
more nicely with the fixtures we've constructed.

For the chain test, the ordering is now different because it depended on
dag-json reordering, so instead of hard-wiring expectations in a switch() we
do it in an array of expected paths.
As per review, remove the special re-encode and codec for test fixtures so
we have test coverage of the case where we're traversing a node that's not been
through a codec that messes with field ordering.
Normally, it's possible to visit the same blocks of data repeatedly during a traversal -- which, if you look at the world with sufficiently strict definitions, "makes sense", because traversals are really visiting `(path, node)` tuples, not just the nodes themselves.

But often, we find people really do only care about visiting nodes and don't particularly care to have the traversal inform them if it reaches that same data again by a different path.

With this patch, we track where we've been, at the granularity of links.  If we've encountered a link before, we won't load it again.

There's a configuration option for this because, on rare occasions, I could imagine not wanting this dedup; also, if one knows one doesn't need it, it may be valuable to turn it off, because it does of course consume a small amount of memory to track where we've been.
…ather than defaulting to on.

This makes it a much less breaking change.
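As a rough sketch of how the opt-in from these commits gets wired up; the LinkVisitOnlyOnce field name is my assumption of what this landed as, so check the traversal package docs:

```go
package example

import (
	"github.com/ipld/go-ipld-prime"
	"github.com/ipld/go-ipld-prime/traversal"
)

// dedupingProgress configures a traversal that loads each link at most once.
// LinkVisitOnlyOnce is assumed to be the opt-in flag described above; it is
// off by default so existing traversal behaviour is unchanged.
func dedupingProgress(lsys ipld.LinkSystem, chooser traversal.LinkTargetNodePrototypeChooser) traversal.Progress {
	return traversal.Progress{
		Cfg: &traversal.Config{
			LinkSystem:                     lsys,
			LinkTargetNodePrototypeChooser: chooser,
			LinkVisitOnlyOnce:              true,
		},
	}
}
```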
@warpfork warpfork left a comment


Nice tests. ✔️

@rvagg rvagg merged commit 8182b92 into master Sep 30, 2021
@rvagg rvagg deleted the rvagg/skipme branch September 30, 2021 10:37