Add "ipfs diff" command #4656

Closed
mitar opened this issue Feb 4, 2018 · 4 comments

Comments

@mitar

mitar commented Feb 4, 2018

Version information:

0.4.13-cc01b7f

Type:

Feature request.

Description:

I tried to use Git LFS to distribute a large dataset, but it didn't work well. GitHub has a limit of 2 GB per file, and cloning is really slow, with no deduplication of files. (My dataset has a lot of duplication because it has multiple versions of the data: for machine learning training, testing, and then the full data. The data is the same; only its arrangement in directories changes.)

So I decided to try IPFS to distribute datasets. The issue is that after a user gets a dataset (ipfs get), there does not seem to be any good tooling for managing those just-fetched files. #4655 asks for something like git checkout. Here I would like to ask for something like git status or git diff, so that I could say: compare this directory against this IPFS hash and tell me whether anything changed, and what the changes are.

After researchers get a dataset, they like to play with it. Sometimes they destroy something and would like to restore it, but sometimes they want to persist their changes. So before calling ipfs add -w -r <dir> to publish a new version of a dataset, they might want to check that all the changes they want are really there. An ipfs diff command or something similar would be great.

@leerspace
Contributor

FYI, you can currently compare two different hashes in IPFS using ipfs object diff. However, it sounds like you're looking for a command that will diff a local directory that isn't in IPFS against one that is.

@mitar
Author

mitar commented Feb 6, 2018

Yes: compare a local directory that was obtained by running ipfs get against what is in IPFS.
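
A quick "has anything changed?" check for such a directory can be sketched by re-hashing it with ipfs add --only-hash (-n), which chunks and hashes without writing anything to the repo, and comparing the resulting root hash. This is only a sketch: the function name ipfs_dir_unchanged is hypothetical, and it assumes the directory is re-hashed with the same add options (chunker, layout, no -w wrapper) as the original import, since otherwise identical files can produce a different hash.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ipfs): succeed if DIR still hashes to EXPECTED.
# Assumes DIR is re-hashed with the same `ipfs add` options as the original import.
ipfs_dir_unchanged() {
  local expected="$1" dir="$2" actual
  # -r: recurse, -Q: print only the root hash, -n: only hash, write nothing to the repo
  actual=$(ipfs add -r -Q -n "$dir") || return 2
  [ "$actual" = "$expected" ]
}
```

For example: if ipfs_dir_unchanged "$OLDHASH" ./dataset; then echo unchanged; fi. This only reports whether something changed, not what changed.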

@Stebalien
Member

Unfortunately, ipfs diff /ipfs/something /local/files won't be that much faster than simply re-importing the dataset and then doing an object diff (unless there are a lot of changes, in which case there'll be the overhead of writing the new data to disk). So, right now the workflow would be:

> NEWHASH=$(ipfs add -r -Q --pin=false /my/data)
> ipfs object diff "$OLDHASH" "$NEWHASH"

(If you start running low on space, just run ipfs repo gc to remove everything that isn't pinned.)
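
The workflow above could be wrapped in a small shell function, roughly as follows. This is a sketch rather than an ipfs built-in: the name ipfs_dir_diff is hypothetical, and it assumes OLDHASH is the hash of the directory itself (not a -w wrapper) and that the object API is still available in the installed version, which a later comment in this thread notes was eventually deprecated.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the re-import-then-diff workflow (not an ipfs built-in).
ipfs_dir_diff() {
  local old_hash="$1" dir="$2" new_hash
  # Re-import without pinning so a later garbage collection can reclaim the blocks.
  new_hash=$(ipfs add -r -Q --pin=false "$dir") || return 2
  if [ "$new_hash" = "$old_hash" ]; then
    echo "no changes"
    return 0
  fi
  # Report the link-level differences between the two directory trees.
  ipfs object diff "$old_hash" "$new_hash"
}
```

Called as ipfs_dir_diff "$OLDHASH" /my/data, it prints "no changes" when the root hashes match and otherwise passes through whatever ipfs object diff reports.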


Now, in the ideal workflow, you'd never move anything out of IPFS. Eventually, we'd like to be able to mount MFS using, e.g., FUSE so you can just play with your files there. Unfortunately, we're not quite there yet.

@lidel
Member

lidel commented Jul 2, 2021

Since we deprecated the object API, I believe this is superseded by #4801.

@lidel lidel closed this as completed Jul 2, 2021