Validate size in the DagReaders #4680
Conversation
No blocksizes. File size bigger than all data. All links write to file.
Nice! As @ivan386 says, we should check if …

Edit: Alternatively, we could do the same zero-extend/truncate thing.
I wish there were a simpler way to do this but I'm pretty sure there isn't.
	r.offset += int64(n)
	return n, err
}
for ; n < len(p) && r.offset+int64(n) < r.size; n++ { // pad
Nit: This may confuse the optimizer. I'd compute the bounds up-front. Possibly something like:
remaining := int(r.size - (r.offset + int64(n)))
if remaining > len(p)-n {
	remaining = len(p) - n
}
for i := range p[n : n+remaining] {
	p[n+i] = 0
}
n += remaining
Ok.
I can still do this, but I am reluctant to due to the increase in code size.
Entirely up to you. I seriously doubt this will be a bottleneck.
unixfs/io/sizeadj.go
Outdated
}
n, err := r.base.Read(p)
if err == nil {
	_, err = r.base.Read(nil)
I'd just allow the short read and force the outer reader to loop. Unfortunately, a lot of readers don't like read attempts with empty buffers (and I'd rather not do two reads per outer read).
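For context, a caller that needs a full buffer can simply wrap the reader with io.ReadFull, which loops over short reads; a minimal sketch using only the standard library:

package main

import (
	"fmt"
	"io"
	"strings"
)

func main() {
	// A reader may legally return fewer bytes than requested; io.ReadFull
	// keeps calling Read until the buffer is full or an error occurs, so
	// short reads from an inner reader are absorbed by the outer loop.
	r := strings.NewReader("hello world")
	buf := make([]byte, 11)
	n, err := io.ReadFull(r, buf)
	fmt.Println(n, err, string(buf)) // 11 <nil> hello world
}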
I did try that initially and it broke some code; I will look into it more closely.
That's not good...
unixfs/io/sizeadj.go
Outdated
}
// Its easier just to always use io.SeekStart rather than
// correctly adjust offset for io.SeekCurrent and io.SeekEnd.
return r.base.Seek(r.offset, io.SeekStart)
Shouldn't this return r.offset? base.Seek could return a short offset.
Yes.
unixfs/io/sizeadj.go
Outdated
	r.offset = r.size + offset
}
if r.offset < 0 {
	return -1, errors.New("Seek will result in negative position")
Shouldn't this rollback the offset (technically)?
Yes, will fix.
@Stebalien, two questions …

IMO, yes.

Yes. I consider this situation to be the same as (1).
For test: QmRqh1ckCLG6nXoE1aJQT2CHrByk5DWji8v3s4bHcmN975
Here zero size raw link: zb2rhmy65F3REf8SZp7De11gxtECBGgUKaLdiDj7MCGCHxbDW

I think that aligning blocks to the end of the file is the best solution. Less data in block.
(force-pushed from 85203e7 to 055026c)
test/sharness/t0110-gateway.sh
Outdated
@@ -188,7 +188,7 @@ test_expect_success "Add compact blocks" '
	printf "foofoo" > expected
'

-test_expect_success "GET compact blocks succeeds" '
+test_expect_failure "GET compact blocks succeeds" '
These sorts of changes make me nervous. What's the technical reason for the swap?
Because the test uses a unixfs node that does not define any blocksizes. Per earlier discussion:
@Stebalien, two questions
- If there are links defined but no corresponding blocksizes, should we consider the block ill-formed and fail with an error? After all, seeking is impossible. (This came up in one of the test cases on a dag that was likely constructed manually, QmTTDNxNJUqF5PZicJqjDjsUed1bHaPWwFdEMvb1iY93Ur in the t0110-gateway.sh test "Add compact blocks".)
IMO, yes.
That's not good... I assumed that all of the files we produced had blocksizes listed. @whyrusleeping did we change this at some point?
@Stebalien I think this test is testing a specifically constructed block.
I think we should add a decent number of unit tests for stuff like this. Testing this sort of specific detailed logic via sharness feels wrong.
@whyrusleeping thoughts?
@kevina here is the background on this testcase: #4286 (comment)
Specifically:
Additionally - the way my project chunks data, my average blocksize is around 4096 bytes ( with a hard-limit on block sizes at 64k ), so forgoing any optional metadata is critical for me in order to reduce the size of my objects ( as the overhead adds up incredibly fast when one deals with hundreds of millions of objects in a DAG ).
The sizes are optional
@kevina please do not break this relied-upon feature ( neither in PB- nor in the future CBOR-UnixFS )
So, this is a problem. Unix programs expect files to be seekable and will break horribly if they aren't (and I'm not going to break every single unix program ever written). However, I also don't want to break existing uses if I can help it and this problem doesn't seem insurmountable.
We should be able to fall back on dumb seeking (although we may want to disable this from the gateway). Basically, the logic would be as follows:
- If blocksizes are defined, use them.
- If a block with children has no blocksizes defined (none at all, not some, not half, not incorrect ones, none), dumb seek through that portion of the file.
This actually gives us the best of both worlds. Files with lots of tiny blocks can put blocksizes at the top layers of the DAG and omit them in the lower levels.
In terms of implementation, we can punt on dumb seeking for now. Simply validate blocksizes if present, and error on seeking when they're not.
@kevina how hard would this be?
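A minimal sketch of that interim rule (seek only when blocksizes are available, refuse otherwise); the type and field names here are assumptions for illustration, not the PR's actual reader:

package main

import (
	"errors"
	"fmt"
	"io"
)

// seekableDagReader is a sketch: seeking uses blocksizes when they exist and
// is refused, for now, when they don't (rather than "dumb seeking").
type seekableDagReader struct {
	blocksizes []uint64
	offset     int64
}

func (r *seekableDagReader) Seek(offset int64, whence int) (int64, error) {
	if whence != io.SeekStart {
		return 0, errors.New("sketch supports io.SeekStart only")
	}
	if len(r.blocksizes) == 0 {
		// Punt on dumb seeking: without blocksizes a byte offset cannot be
		// mapped onto a child link, so report an error instead.
		return 0, errors.New("unixfs: cannot seek, node has no blocksizes")
	}
	r.offset = offset
	return offset, nil
}

func main() {
	r := &seekableDagReader{}
	_, err := r.Seek(42, io.SeekStart)
	fmt.Println(err) // unixfs: cannot seek, node has no blocksizes
}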
I will see what I can do.
Done.
unixfs/io/pbdagreader.go
Outdated
curLinks := getLinkCids(n)
data := pb.GetData()

// validate
I'd like to have this logic broken out separately and have a few unit tests for it
Can you clarify what you mean here?
The validate logic that has been newly added: it should live in its own function, and then we should add a few unit tests for that function.
I'll see what I can do.
Okay this is done now. Sorry for the delay.
It adds complexity. I don't want to have one case where we zero-pad the end and another where we zero-pad the beginning.
That's a bug. Would you mind reporting it? Also, we technically have an "identity" hash function so, eventually, we should be able to encode this empty block as 4 bytes.

Note: The empty block is zb2rhWm2M1wXXqtqU6pHfovz3DZQ7D54ZD2xN3ynwankHCBCn (can add it with …).
Reported: #4688
Real zero length raw block is:
Convert it to base58: z2oTmZ
test_dag_identity.json:
Report new bug? PS:
PPS: Same error
Good catch.
The fact that we choke on a zero-length identity multihash is a bug in the multihash library. The fact that we don't currently support "inline blocks" is more of a missing feature (we only recently added the identity multihash and haven't added support for extracting blocks from it yet). However, feel free to report both (we want inline blocks eventually anyways).

0x55 versus 0x37: You're right, the code is 0x55. I converted 55 into hex (but it's already hex).
progress here?
Assuming the tests pass, I think this should be good to go. We can clean up the failed test case later.
(force-pushed from b1c5ffb to bf51ef3)
As @Stebalien has done most of the review here, I'd like to get his 👍 on merging.
@whyrusleeping see #4680 (comment) ~Kubuxu
@whyrusleeping, @kevina see my comment here (unless I missed some change and this is no longer an issue). (Also, @kevina, sorry for giving you bad information. I was unaware of @whyrusleeping's promise.)
@Stebalien I pushed a commit which I think should address your concern.
(force-pushed from 8e412b3 to 1c63543)
@Stebalien I fixed the test case, in case that was what you were waiting for. @magik6k a review from you would also be helpful.
}
// Its easier just to always use io.SeekStart rather than
// correctly adjust offset for io.SeekCurrent and io.SeekEnd.
_, err := r.base.Seek(newOffset, io.SeekStart)
What happens if r.size == 1000, but r.base.Size() == 500, and we try to call r.Seek(700)?
It seems like this would error out, when it should succeed and subsequent reads should return up to 300 bytes of zeros. I guess I'm not sure what errors Seek returns when you try to seek past its bounds.
Actually it won't error out, from the documentation (https://golang.org/pkg/io/#Seeker):
Seeking to an offset before the start of the file is an error. Seeking to any positive offset is legal, but the behavior of subsequent I/O operations on the underlying object is implementation-dependent.
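To make the 1000/500/700 scenario concrete, here is a rough standalone sketch of a reader that records seeks past the end of its base and serves zeros from the padded region. It illustrates the behaviour discussed above and is not the actual sizeadj.go code:

package main

import (
	"bytes"
	"fmt"
	"io"
)

// zeroExtendReader is an illustrative reader (not the real implementation)
// that presents base as if it were exactly size bytes, zero-filling past its end.
type zeroExtendReader struct {
	base   *bytes.Reader // underlying data, possibly shorter than size
	size   int64         // logical size reported to callers
	offset int64         // current logical position
}

func (r *zeroExtendReader) Seek(offset int64, whence int) (int64, error) {
	if whence != io.SeekStart {
		return 0, fmt.Errorf("sketch supports io.SeekStart only")
	}
	if offset < 0 {
		return 0, fmt.Errorf("seek to negative position")
	}
	// Per io.Seeker, seeking past the end is legal: record the logical
	// position and clamp the underlying seek to the base's real size.
	r.offset = offset
	baseOff := offset
	if baseOff > r.base.Size() {
		baseOff = r.base.Size()
	}
	_, err := r.base.Seek(baseOff, io.SeekStart)
	return r.offset, err
}

func (r *zeroExtendReader) Read(p []byte) (int, error) {
	if r.offset >= r.size {
		return 0, io.EOF
	}
	if remaining := r.size - r.offset; int64(len(p)) > remaining {
		p = p[:remaining]
	}
	n, err := r.base.Read(p)
	if err == io.EOF {
		err = nil // anything past the base is part of the zero padding
	}
	for i := n; i < len(p); i++ { // zero-fill the padded region
		p[i] = 0
	}
	r.offset += int64(len(p))
	return len(p), err
}

func main() {
	r := &zeroExtendReader{base: bytes.NewReader(make([]byte, 500)), size: 1000}
	pos, err := r.Seek(700, io.SeekStart)
	buf := make([]byte, 400)
	n, _ := r.Read(buf)
	fmt.Println(pos, err, n) // 700 <nil> 300 (the 300 bytes read are zeros)
}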
unixfs/io/sizeadj.go
Outdated
func (w *truncWriter) Write(p []byte) (int, error) {
	truncC := 0
	if int64(len(p)) > w.size-w.offset {
		truncC = int(int64(len(p)) - w.size - w.offset)
I think your order of operations here is wrong.
Let's say we try to write 10 bytes to a thing that has a size of 15 and a current offset of 12:
10 - 15 - 12 == -17
You want: int64(len(p)) - (w.size - w.offset)
unixfs/io/sizeadj.go
Outdated
	truncC := 0
	if int64(len(p)) > w.size-w.offset {
		truncC = int(int64(len(p)) - w.size - w.offset)
		p = p[:w.size]
I think you mean p = p[:w.size-w.offset] here.
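Folding both of the fixes above into one place, a rough sketch of what the corrected Write could look like; the struct fields are assumed from the diff context rather than copied from the final code:

package main

import (
	"bytes"
	"fmt"
	"io"
)

// truncWriter mirrors the writer in the diff: it drops anything written past
// size while reporting the full length back to the caller (sketch only).
type truncWriter struct {
	base   io.Writer
	size   int64
	offset int64
}

func (w *truncWriter) Write(p []byte) (int, error) {
	truncC := 0
	if int64(len(p)) > w.size-w.offset {
		// Parenthesize the remaining space before subtracting it.
		truncC = int(int64(len(p)) - (w.size - w.offset))
		p = p[:w.size-w.offset]
	}
	n, err := w.base.Write(p)
	w.offset += int64(n)
	if err != nil {
		return n, err
	}
	return n + truncC, nil
}

func main() {
	var buf bytes.Buffer
	w := &truncWriter{base: &buf, size: 15, offset: 12}
	n, err := w.Write(make([]byte, 10)) // only 3 bytes actually fit
	fmt.Println(n, err, buf.Len())      // 10 <nil> 3
}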
}

// Write implemented Write method as defined by io.Writer
func (w *truncWriter) Write(p []byte) (int, error) {
Add tests for this function once the issues here are fixed, please.
Okay, done. Sorry about that. I wrote this code several months ago and thought it was ready for review; I guess I was wrong, as I did not double-check things carefully enough.
(force-pushed from 14856b3 to 77149fd)
unixfs/unixfs.go
Outdated
for _, blocksize := range pb.Blocksizes {
	total += blocksize
}
if total != pb.GetFilesize() {
So, one case here is pb.Filesize == nil. Is Filesize required?
I'll double-check, and if not I will add a check for that.
Note: the function pb.GetFilesize() returns a uint64; if pb.Filesize == nil it will return 0.
Sorry, my question was more: If Filesize is nil, that means it wasn't specified. Do we require Filesize to be specified? Should we calculate it from the blocksizes if left unspecified?
@Stebalien given there must be a source of size, and given that blocksizes are optional, my hunch would be to just make Filesize hard-required.
Hm. I'd like @whyrusleeping and @diasdavid to 👍 this.
Will all IPFS files include a Filesize?
I think it's probably fine to assume that any correctly formatted ipfs/unixfs file will contain a filesize. Seems pretty hard to do a lot of things without it.
Okay I added a test to make sure it is not nil.
unixfs/unixfs.go
Outdated
// ValidatePB validates a unixfs protonode.
func ValidatePB(n *dag.ProtoNode, pb *pb.Data) error {
	if len(pb.Blocksizes) == 0 { // special case
		return nil
There's also the case where there are no blocks but we still have inline data.
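Pulling the points above together (Filesize required, blocksizes must account for the file, blocksize-less nodes treated as a special case), a hedged sketch of the kind of checks being discussed; it is an editorial illustration, not the PR's actual ValidatePB:

package main

import (
	"errors"
	"fmt"
)

// data mirrors the relevant unixfs protobuf fields for this sketch.
type data struct {
	Filesize   *uint64  // optional in the protobuf, treated as required here
	Blocksizes []uint64 // one entry per child link, may be empty
}

// validate is a sketch: Filesize must be set, and when blocksizes are
// present they must cover every link and sum to Filesize.
func validate(numLinks int, d *data) error {
	if d.Filesize == nil {
		return errors.New("unixfs: file node missing Filesize")
	}
	if len(d.Blocksizes) == 0 {
		// Special case from the review: no blocksizes at all (e.g. a node
		// carrying only inline data); skip the per-link accounting.
		return nil
	}
	if len(d.Blocksizes) != numLinks {
		return fmt.Errorf("unixfs: %d links but %d blocksizes", numLinks, len(d.Blocksizes))
	}
	var total uint64
	for _, bs := range d.Blocksizes {
		total += bs
	}
	if total != *d.Filesize {
		return fmt.Errorf("unixfs: blocksizes sum to %d but Filesize is %d", total, *d.Filesize)
	}
	return nil
}

func main() {
	size := uint64(6)
	fmt.Println(validate(2, &data{Filesize: &size, Blocksizes: []uint64{3, 3}})) // <nil>
}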
(force-pushed from 2c59305 to bbc034e)
If a node is too large, it is truncated; if it is too small, it is zero extended. License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
One failed test due to the use of an ill-formed unixfs node. License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
(force-pushed from bbc034e to b563ea5)
License: MIT Signed-off-by: Kevin Atkinson <k@kevina.org>
}
if len(dr.pbdata.Blocksizes) > 0 {
	// if blocksizes are set, ensure that we read exactly that amount of data
	dr.buf, err = newSizeAdjReadSeekCloser(dr.buf, dr.pbdata.Blocksizes[linkPos])
You know, I think we can do something simpler here. Given that we know the Filesize of the buffer, we can require that buf export a Size() method. Then, we can just have a check here:
- buf.Size() > blocksize -> make a limit seeker
- buf.Size() == blocksize -> return as-is
- buf.Size() < blocksize -> zero extend

That way, we can do all of these checks up-front. Thoughts? Too late?
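A rough sketch of that up-front dispatch; the interfaces and the two wrapper constructors are placeholders standing in for whatever sizeadj.go ends up exposing (the diff above already has newSizeAdjReadSeekCloser playing a similar role):

package sizeadj

import "io"

// ReadSeekCloser and SizedReadSeekCloser are assumed interfaces for this sketch.
type ReadSeekCloser interface {
	io.Reader
	io.Seeker
	io.Closer
}

type SizedReadSeekCloser interface {
	ReadSeekCloser
	Size() uint64
}

// Placeholder constructors; real wrappers would implement the limiting and
// zero-extending behaviour.
func newLimitReadSeekCloser(r SizedReadSeekCloser, size int64) ReadSeekCloser { panic("sketch only") }

func newZeroExtendReadSeekCloser(r SizedReadSeekCloser, size int64) ReadSeekCloser {
	panic("sketch only")
}

// adjustToBlocksize does the up-front check: compare the buffer's known size
// against the expected blocksize and wrap it accordingly.
func adjustToBlocksize(buf SizedReadSeekCloser, blocksize uint64) ReadSeekCloser {
	switch {
	case buf.Size() > blocksize:
		// Too much data: expose only the first blocksize bytes.
		return newLimitReadSeekCloser(buf, int64(blocksize))
	case buf.Size() == blocksize:
		// Exactly right: hand the buffer back untouched.
		return buf
	default:
		// Too little data: zero-extend reads out to blocksize.
		return newZeroExtendReadSeekCloser(buf, int64(blocksize))
	}
}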
What currently happens if there are no blocksizes but the filesize doesn't match? Both with and without inline data? We may need to truncate/zero extend in this case as well. Current rules:
One thing is a bit inconsistent. If no blocksizes are specified, the filesize is used to truncate/zero extend. If they are specified, the filesize can't do this. We could also just truncate/extend always (i.e., don't check that …).

So, I know this is kind of cold feet at the 11th hour; however, after seeing all the weird corner-cases that come up in practice, I'm starting to wonder if we should fail more. That is:
The downsides are:
I'm not sure what the status on this currently is, but:
This needs to be worked out at the spec level first. We jumped into coding too quickly, IMO.
If a node is too large, it is truncated; if it is too small, it is zero extended.
Closes #4540. Closes #4667.