Add "ipfs diff" command #4656
Comments
FYI that you can currently compare two different hashes in IPFS using |
Yes. Local directory which was obtained by running |
Unfortunately,
> NEWHASH=$(ipfs add --pin=false /my/data)
> ipfs object diff $OLDHASH $NEWHASH
(If you start running low on space, just run ipfs gc to remove everything that isn't pinned.)
Now, in the ideal workflow, you'd never move anything out of ipfs. Eventually, we'd like to be able to mount MFS using, e.g., FUSE so you can just play with your files there. Unfortunately, we're not quite there yet.
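The two commands above can be wrapped in a small shell function. This is only a sketch: ipfs_diff_dir is a hypothetical helper name, not an ipfs subcommand, and it assumes the old hash was created from the same directory with the same ipfs add options (chunker, wrapping), since otherwise the hashes differ even for identical files. It adds -r so a directory is added recursively and -Q so the variable holds only the root hash.

```shell
#!/bin/sh
# Hypothetical helper (not part of ipfs): re-add a local directory without
# pinning it, then ask ipfs which links differ from a previously known hash.
# Usage: ipfs_diff_dir OLDHASH DIR
ipfs_diff_dir() {
    old_hash=$1
    dir=$2

    # -r: add the directory recursively
    # -Q: print only the final (root) hash
    # --pin=false: leave the new blocks unpinned so "ipfs gc" can reclaim them
    new_hash=$(ipfs add -r -Q --pin=false "$dir") || return 2

    if [ "$new_hash" = "$old_hash" ]; then
        echo "unchanged"
        return 0
    fi

    # The root hashes differ, so list the differences between the two objects.
    ipfs object diff "$old_hash" "$new_hash"
    return 1
}
```

For example, ipfs_diff_dir "$OLDHASH" /my/data prints "unchanged" and returns 0 when the root hashes match; otherwise it prints the output of ipfs object diff and returns 1.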
Since we deprecated |
Version information:
0.4.13-cc01b7f
Type:
Feature request.
Description:
I tried to use Git LFS to distribute a large dataset, but it didn't work well. GitHub has a limit of 2 GB per file, and cloning is really slow because files are not deduplicated. (My dataset has a lot of duplication because it contains multiple versions of the data: for machine learning training, for testing, and the full data. The data is the same; only its arrangement into directories changes.)
So I decided to try IPFS to distribute datasets. The issue is that after a user gets a dataset (ipfs get), there does not seem to be any good tooling for managing those just-fetched files. #4655 is asking for something like git checkout. Here I would like to ask for something like git status or git diff: a way to say "compare this directory against this IPFS hash and tell me whether anything changed, and what those changes are."
After researchers get a dataset, they like to play with it. Sometimes they destroy something and would like to restore it, but sometimes they want to persist their changes. So before calling ipfs add -w -r <dir> to publish a new version of a dataset, they might want to check that all the changes they want are really there. So ipfs diff or something similar would be great.
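To illustrate the git status half of the request, a quick "did anything change?" check can be approximated by hashing the directory without storing it and comparing root hashes. This is a rough sketch, not an existing command: ipfs_dir_status is a hypothetical name, and it assumes the expected hash was produced by adding the directory itself (without -w wrapping) with the same add options. It reports only whether the directory changed, not what changed.

```shell
#!/bin/sh
# Hypothetical helper (not part of ipfs): report whether a local directory
# still matches a known IPFS hash, without writing anything to the repo.
# Usage: ipfs_dir_status EXPECTED_HASH DIR
ipfs_dir_status() {
    expected=$1
    dir=$2

    # -r: recurse into the directory
    # -n (--only-hash): compute hashes only, store nothing
    # -Q: print only the final (root) hash
    actual=$(ipfs add -r -n -Q "$dir") || return 2

    if [ "$actual" = "$expected" ]; then
        echo "clean"
    else
        echo "modified: $expected -> $actual"
        return 1
    fi
}
```

Because --only-hash stores nothing, the new root cannot then be passed to ipfs object diff; listing the actual changes requires adding the directory for real.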