Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add partial-match traversal of large bytes #375

Merged
merged 14 commits into from
Mar 7, 2022
Merged

Conversation

willscott
Copy link
Member

  • add a streamBytes parallel to plainBytes for byte nodes backed by io.ReadSeekers rather than []byte
  • Implement Slice for subset matches
  • Extend the Decide(node) bool method for matching to Match(node) (node, error), which allows a match to be done on only a subset of a node.
  • Update walk_adv to make use of the Slice selector by way of the Match method.

Copy link
Contributor

@mvdan mvdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code SGTM with the review, so I think this is fine to merge as long as the spec changes are also approved and merged

traversal/selector/matcher.go Outdated Show resolved Hide resolved
traversal/selector/matcher.go Outdated Show resolved Hide resolved
traversal/selector/matcher.go Outdated Show resolved Hide resolved
traversal/selector/matcher_util.go Show resolved Hide resolved
traversal/selector/matcher_util.go Show resolved Hide resolved
@willscott willscott changed the title Add partial-match travesal of large bytes Add partial-match traversal of large bytes Mar 3, 2022
Copy link
Collaborator

@hannahhoward hannahhoward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking comments but otherwise LGTM

}

// ReadAt provides the io.ReadAt method over a ReadSeeker.
// This implementation does not support concurrent calls to `ReadAt`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my one worry here is: what are the odds the consumer of this code reads this comment, and not the comment on golang's ReaderAt interface -- is there a chance for misuse through the chain of readerAt -> SectionReader -> NewBytesFromReader? SectionReader certainly seems to use ReadAt to implement seek+read for example.

A seperate question: should AsLargeBytes return a ReadSeeker that is also a ReaderAt? os.File satisfies both. new section reader satisfies both.

Non blocking, but we should look at figuring this out when we can.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mvdan argued that ReadSeeker is a more common and lower effort to ask implementations of node to implement. This wrapper is internal (since the struct type is not exported) and so is only for use by the Slice matcher to wrap a SectionReader around a node's ReadSeeker. As such, I believe there's minimal potential for misuse.

@@ -106,6 +106,12 @@ type Selector interface {
// Only "Matcher" clauses actually implement this in a way that ever returns "true".
// See the Selector specs for discussion on "matched" vs "reached"/"visited" nodes.
Decide(node datamodel.Node) bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using decide any more? Should we just delete it? No one uses compile selectors outside of go-ipld-prime I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see any uses of Decide in the go-ipld-prime directly before this PR either. I think the thought is to transition to Match as the signature that we actually need for these walks.

My preference would be to mark this as deprecated rather than turn this into a backwards-incompatible change.

@hannahhoward
Copy link
Collaborator

Also I rebased this for you since I was the one who created your merge conflict :)

@willscott willscott merged commit 78b188a into master Mar 7, 2022
@willscott willscott deleted the feat/largeByteTraversal branch March 7, 2022 11:42
@warpfork
Copy link
Collaborator

warpfork commented Mar 9, 2022

Crosslinking (since I just had to go re-find it): spec happened in: ipld/ipld#184

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants