-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: pin API #1045
feat: pin API #1045
Conversation
src/core/components/pin-set.js
Outdated
|
||
const multihashes = require('multihashes') | ||
const CID = require('cids') | ||
const protobuf = require('protocol-buffers') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use the protons
package instead
@AdamStone stunning work with owning the Pin API endeavour on specs, js-ipfs-api and js-ipfs 👏🏽👏🏽👏🏽👏🏽👏🏽👏🏽 I went ahead and merged and released interface-ipfs-core + js-ipfs-api 🌟 As for merging this PR, since it is such a large code addition, I want to review it thoroughly one more time and I also want to invite other js-ipfs contributors to give their feedback. Let's shoot to merge this week :) @AdamStone will you be around to finish any last details? PS: I've also invited you to the JavaScript team on the IPFS org on Github, you now can commit directly into this branch or create other branches :) |
@diasdavid Thanks very much, I didn't expect it was that close to ready for merge though. Were you able to resolve the issue with the pin datastore incompatibility with the go repo? As I left it, the JS implementation would save its pins under a different key than the go implementation. Having finished some high priority things for work, I think I should have some free time again for another week or two, but after that I'll be full time taking care of a newborn. |
src/core/components/key-set.js
Outdated
@@ -0,0 +1,31 @@ | |||
'use strict' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any specific reason we're implementing our own set here and not using the built in one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did it this way to work around the problem mentioned in the comment
// Buffers with identical data are still different objects, so
// they need to be cast to strings to prevent duplicates in Sets
in order to have set-like storage for multihash buffers but indexed by hash string. But since it's straightforward to convert the hash string back to the buffer object, maybe it would be better to just use a Set to store the strings and use multihashes.fromB58String()
where the object is needed.
src/core/components/pin-set.js
Outdated
const maxItems = 8192 | ||
|
||
// Protobuf interface | ||
const pbSchema = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can probably use backticks here (`) - but that's a nitpick :)
test/cli/pin.js
Outdated
}) | ||
}) | ||
|
||
// it('ls (quiet)', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure about this one. It's part of go-ipfs but wasn't mentioned in the js-ipfs specs, and I had some trouble getting it to work.
src/core/components/pin.js
Outdated
// try to accept a variety of hash options including | ||
// multihash Buffers, base58 strings, and ipfs path | ||
// strings, either individually or as an array | ||
if (Buffer.isBuffer(hashes) || typeof hashes.forEach !== 'function') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use Array.isArray()
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard to remember now, but I think I had used that originally and changed it after someone asked to avoid using it... Or maybe it had been instanceof Array
. Another option is hashes.constructor === Array
. I'm fine with any of these. Currently in the project I see three examples of instanceof Array
and three of Array.isArray()
.
src/core/components/pin.js
Outdated
each(hashes, (hash, cb) => { | ||
if (typeof hash === 'string') { | ||
// example: '/ipfs/QmRootHash/links/by/name' | ||
const matched = hash.match(/^(?:\/ipfs\/)?([^/]+(?:\/[^/]+)*)\/?$/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to have utils around this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually wonder if I should be doing this at all here, it seems like there should already be some built-in way to follow paths like this using the existing interface? All this is just for the sake of getting the cid of the thing at the end of the path. I could use files.get
with the full path but I'm not sure how to get the cid from the result. @diasdavid any suggestion?
src/core/components/pin.js
Outdated
|
||
function getRecursive (multihash, callback) { | ||
// gets flat array of all DAGNodes in tree given by multihash | ||
// (should this be part of dag.js API? it was in ipfs-merkle-dag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen several of this questions, did we get this addressed - @diasdavid?
Thanks for the code review, I'm just checking in for now to mention that my son was born about a week early, so it will be some time before I can follow up with this, probably at least a few weeks. |
@AdamStone congrats on your newborn! Hope you're enjoying the moment - it flies by fast 😜 |
@AdamStone just read your last message. Congratulations!!! 🎉🌟👶🌟🎉 I hope everything is great with the 3 of you :) |
@diasdavid @dryajov Thanks, we are doing well and getting used to our new life. I found some time to look into this and figured out why my saving and loading of the pin set wasn't working with the go test repo. I have it working now locally, but only if I comment out this assert to enable DAG links with size 0 -- it seems the go repo allows size 0 links and the pin set implementation includes them. Not sure what to do about that. The hashes I see for the same sets of pins pinned through the js interface are also different, so there must still be some discrepancy in the structure of the pin sets, but it doesn't seem to affect the end result. |
It seems like I can't push to this PR, so I pushed these changes to my fork. |
@AdamStone just invited you to the JS team :) https://github.com/orgs/ipfs/teams/javascript-team/members |
@AdamStone note that there are merge conflicts due to recent changes in js-ipfs master, it might be wise to resolve them now so that you build on the current structure and avoid a big hairy rebase at the end. |
@diasdavid Thanks, I probably got dropped before from inactivity. Pushed an update. There were a couple of inline questions above about utility functions, any thoughts on those? Or the matter of zero-size links? For now I'll take a look at the merge conflicts. |
@AdamStone re: zero size links - The reality is that the size value on a link can't really be trusted since there is no way to create a proof that the size is the real one other than checking manually. In JS land at least we make the obvious claim that if there is a link, at least it has to be bigger than 0 but it seems that Go didn't and ignores the size completely for Pinsets. Given that and that we want to be feature by feature (and sometimes bug by bug), we can remove that condition from ipld/dag-pb. Wanna submit that PR? |
I ended up moving getRecursive to the dag module because it just seems more sensible to have it there than buried in the pin code (since all it does is recursively traverse a dag node's links). |
I merged the latest master, updated the ipfs-api dependency to the latest version, but tests are still failing on "a link requires a size." The latest ipfs-api version requires an ipld-dag-pb version that should include the change merged above to allow empty links, so I'm not sure why the tests are still seeing this assert. |
@AdamStone Just ran a fresh npm install (note, it is important to remote the package-lock.json) and got
Which looks good. Running the tests also gave me:
Perhaps you just didn't fresh install your node modules? I've this alias on my zshrc to avoid any confusion: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tons of good work here @AdamStone!! This PR is super close to being merged and tests are looking good as well.
I'm making some comments asking for some bits of the code to be a lil' more simplified, just to prevent to confuse our future selfs :)
src/cli/commands/pin/ls.js
Outdated
const type = argv.type | ||
const quiet = argv.quiet | ||
argv.ipfs.pin.ls(paths, { type }, (err, results) => { | ||
if (err) { throw err } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this print a useful error message to the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose probably not. I just followed the same pattern as the other CLI modules.
src/cli/commands/pin/ls.js
Outdated
const paths = argv.path && argv.path.split(' ') | ||
const type = argv.type | ||
const quiet = argv.quiet | ||
argv.ipfs.pin.ls(paths, { type }, (err, results) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind being explicit and using { type: type }
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, no problem
src/cli/commands/pin/rm.js
Outdated
const paths = argv['ipfs-path'].split(' ') | ||
const recursive = argv.recursive | ||
argv.ipfs.pin.rm(paths, { recursive }, (err, results) => { | ||
if (err) { throw err } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Let's make sure that these error messages are printed in a way that it provides useful information to the user and it doesn't look like the whole program crashed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it looks like all the CLI modules just throw the error, would that be better as a separate PR to replace all of them with some consistent response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good 👍
src/core/components/dag.js
Outdated
@@ -73,6 +75,30 @@ module.exports = function dag (self) { | |||
self._ipldResolver.treeStream(cid, path, options), | |||
pull.collect(callback) | |||
) | |||
}), | |||
|
|||
getRecursive: promisify((multihash, callback) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This getRecursive is only working for ipld-dag-pb
right now. Either make it an internal method by prefixing with a _
and add a comment saying just that or move it to https://github.com/ipld/js-ipld-resolver and leverage the .tree
and .isLink
API calls to make it agnostic to any kind of IPLD Format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see if I can move it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@diasdavid Should the existing treeStream method already do what I'm trying to do here, if the recursive: true
option is passed? I tried testing it with a CID for this directory but I'm getting an error in the recursive part. Specifically this new CID(...)
errors because paths
includes separate objects for link name, size and hash that all have p.link
(so it tries to recurse all of them, when only the hash ones should be recursed).
[
{
"path":"Links/0/Name",
"link":{
"/":"about"
}
},
{
"path":"Links/0/Tsize",
"link":{
"/":1688
}
},
{
"path":"Links/0/Hash",
"link":{
"/":"QmZTR5bcpQD7cFgTorqxZDYaew1Wqgfbd2ud9QqGPAkK2V"
}
},
...
]
Is it a bug that the hashes array is getting these as separate links? Or should the recursion check specifically for the "hash" links?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that definitely looks like a bug as it brings more info than what it promises. The method was created just to return all the paths. That link
property is just a left over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@diasdavid Based on these discussions it looks like the presence of separate "paths" for name, size and hash for each link is correct/necessary, and the bug is that the isLink method (formerly isCID) was meant to identify specifically the "hash" paths but is actually more permissive. The tests pass because the non-valid CID case is checking an empty value, which happens to return false
correctly, whereas other non-empty non-CID values would not. I made another PR to address that and update the test.
The same pattern for isLink appears in other types of resolvers as well, so I expect the same problem would occur for those as well. If the fix I made in the above PR looks good, I can do the same for the other resolvers.
src/core/components/pin-set.js
Outdated
hashed[h].push(items[i]) | ||
} | ||
const storeItemsCb = (err, child) => { | ||
if (callback.called) { return } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use http://npmjs.com/once for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, looks like I had onced the callbacks but missed removing some of these.
src/core/components/pin.js
Outdated
}) | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as the function above. This whole method needs to be simplified.
src/core/components/pin.js
Outdated
return callback(null, result.payload) | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
src/core/components/pin.js
Outdated
}), | ||
|
||
isPinned: (multihash, callback) => { | ||
// callback (err, pinned, reason) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove if not used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it is preferable for the callback to be use function (err, result)
to follow the same pattern as everything else
src/core/utils.js
Outdated
} | ||
} | ||
|
||
exports.normalizeHashes = function (ipfs, hashes, callback) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests specifically to this function?
src/core/utils.js
Outdated
exports.OFFLINE_ERROR = 'This command must be run in online mode. Try running \'ipfs daemon\' first.' | ||
|
||
const splitIpfsPath = exports.splitIpfsPath = function (pathString) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests specifically to this function?
* initial sweep through to understand how pin works. Did make some changes but mostly minor. * refactor pb schema to it's own file * fix: don't pin files during files.add if opts.pin === false * feat: add some http qs parsing, http route/resources cleanup, cleanup core/utils.parseIpfsPath * feat: expand pin tests. \nFirst draft. still needs some further work. * feat: Add logging for entry/exit of pins: add/rm/flush/load. Clean some documentation. * feat: add --pin to files.add, fix: improper pin option parsing in core. * feat: Use ipfs.files.add to add init-docs instead of directly using the unix-fs importer. * feat(tests): Add tests for cli --pin option. I know this should be more of an integration test. Should be written in /core. Maybe talk with Victor about testing different layers * feat: use isIPFS to valiate a multihash.
refactor: change pin.flush logging message
fix: don't need to cast the object.get result with toJSON revert: use interface-datastore.Key for datastore pin storage The proper change would be that datastore-level automatically casts operations into Keys fix: do not invoke callback within a try/catch feat(test): make cli pin tests more robust By using files that aren't added on IPFS initialization. Still needs work on files.rm (direct) and ipfs ls (indirect). fix: remove commented code, traced test failures to pin-set Got to go for the night, though, so will checkpoint here and address tomorrow. feat: parseIpfsPath now throws errors for consistency feat: resolveIpfsPaths error message lists the relative path that failed feat: use follow.bind instead of mutating the links Also decided not show relative paths. Less human friendly but probably cleaner otherwise. refactor: resolveIpfsPaths -> resolvePaths feat: promisify resolvePaths test: change parseIpfsPath failure tests to use try/catch docs: edit resolvePath doc revert: accidentally deleted commands/pin.js
I think my original rebase for this branch 2 weeks ago might have changed history for the intervening commits, indirectly causing some of these missed changes. or I just rebase onto the wrong oldparent. fix: some onlyHash and pin tests broke after merging onlyHash and pin interact: shouldn't pin when --only-hash. fix: trim output for 'pin ls when no hash is passed' test: indirect pins supersede direct pins: turns out we had a bug feat: add expectTimeout test utility feat: promisify some additional pin utils
I think I'll end up moving most tests here. test: add tests for pin.ls and pin.rm Based tests on other pin fixtures, need to migrate the isPinned* tests to them as well. fix: direct pins are now deleted by a default pin.rm(hash) test: prepare for pin.add tests 'indirect supersedes direct' test exposes a bug in pin.ls feat: switch away from multihashes for isPinned* tests test: impl pin.add tests fix: add fixture files only once test: add test for a potential bug, clean isPinned* tests refactor: remove a test that's no longer needed fix: pin.ls, indirect pins should supersede direct pins test: naive pin.load, pin.flush tests feat: remove most pin cli tests as functionality is tested in pin core tests refactor: rename solarSystem
fix: attempt to find a way to use http-api/inject test structure for pin tests test: fix pin.rm http-api tests test: fix pin.add http-api tests docs: docs and cleanup of http-api pin tests refactor: renaming fix: lint errors fix: resolvePaths tests are failing on CI, it might be long ops, testing a timeout bump fix: add files explicitly before testing resolvePaths fix: remove mocha.only from resolvePaths. let's hope tests pass, they are passing CI now fix: rename test/core/utils.spec.js -> utils.js so it's not run during browser tests
Need to leave computer, this is a checkpoint. test: add sanity test for walkItems and hasChild, clean others These tests are more descriptive than really pushing the impl. I'd love others' thoughts on what else should be hit and how. I also need to compare go's pinset impl against ours fix: stop daemons feat: documentation and multihash buffer handling for dag.get fix: lint
fix: pinset.hasChild buffer check feat: hardcode expected length for flush/load tests
I must have missed a commit during a rebase.
fix: yarg arugment naming for pin cli commands fix: convert file multihashes to a b58 string fix: another way of checking for CID-ness fix: lint fix: toB58String handles non-buffers fix: key-exchange core tests now shutdown daemon.
refactor: pinset.hasChild -> pinset.hasDescendent fix: invoke someCb if we've seen the hash before refactor: async patterns in dag._getRecursive refactor: pinset.hasDescendant refactor: pinset.storeItems async patterns refactor: pinset.loadSet and pin.walkItem async patterns docs: add link to go-ipfs' fanout bin implementation refactor: async patterns of pin.load/flush refactor: lint refactor: privatize internal pin key storage refactor: change encapsulation of ipfs.pin, fix resulting issues fix: lint fix: 'files add --pin=false' test was giving a false positive refactor: use is-ipfs to check CID-ability of a string refactor: remove last instance of 'once' in the pin code
They're simple enough, documented elsewhere, and not used by any exposed functionality.
@alanshaw are the pin tests being run https://github.com/ipfs/js-ipfs/tree/master/test/core ? doens't seem like it. |
🤦♂️ oh no it doesn't |
Let's see what #1405 says :) |
No description provided.