ipfs pin add hangs after ipfs add hashcode -t (possible workaround found) #5683

Closed
JazzTp opened this issue Oct 28, 2018 · 13 comments
Labels
topic/docs-ipfs Topic docs-ipfs topic/UnixFS Topic UnixFS

Comments

@JazzTp

JazzTp commented Oct 28, 2018

I mentioned the problem in #3505 (before finding this possible workaround) and was invited to open a new issue.

Version information:

go-ipfs_v0.4.17_linux-amd64

repo version 7

Type: Bug

Description:

ipfs pin add was hanging. So far it has happened with only three files out of a total of 95.

To add a DTube file to this node, I was advised to follow these steps after retrieving the binary with wget from the gateway https://video.dtube.top/ipfs/ (except for the hash codes of "Snap" files, which used to be retrievable via ipfs get).

ipfs add hashcode -t
ipfs pin add hashcode

I was told:

-t is trickledag, it optimises sequential read performance which is useful for videos, that's what DTube uses

With those three files (out of 95):

  • the first step, ipfs add hashcode -t, was fast and gave no complaints
  • the second step, ipfs pin add hashcode, was hanging, and those hash codes were not in the output of ipfs pin ls -t recursive

I just found a workaround, but I'm unable to evaluate its implications:

ipfs add hashcode
ipfs add hashcode -t
ipfs pin add hashcode

This does not hang and those hash codes are now appearing in the output of ipfs pin ls -t recursive


(Further info:

  • I was also reading IPFS add hangs #5321, which has a possible alternative method to pin a file (@Stebalien's post of July 31st), but at the moment I'm unable to understand how to apply that workaround.
    IPFS add hangs #5321 (comment): "Note: you can add a "best effort" pin by linking the file into mfs (ipfs files cp /ipfs/MyHash /some_file_name)".
  • ipfs pin ls -t recursive | wc -l gives 95, ipfs refs local | wc -l gives 16899
    This node holds 4.1 GB so far, with the maximum set to 500 GB for tests (it could be more). I basically stopped using ipfs because I was unable to proxify the daemon: my internet service provider stopped allowing customers to open ports a few months ago. I filed two complaints, with nothing new so far. The problem has been mentioned here https://discuss.ipfs.io/t/using-ipfs-over-a-socks-proxy/440/4 with no answers offered so far.

)
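
The "best effort" pin quoted from #5321 could be applied roughly like this. This is a minimal sketch, assuming `ipfs` is on the PATH. The helper names `mfs_keep` and `mfs_drop` are made up for illustration, and the file is linked at the top level of MFS under a name of your choosing.

```shell
# Hypothetical helpers around the `ipfs files cp` command quoted above.

# Link an existing hash into MFS under /<name> (defaults to the hash itself).
mfs_keep() {
    local cid name
    cid=$1
    name=${2:-$1}
    ipfs files cp "/ipfs/$cid" "/$name"
}

# Remove the MFS entry again; this only drops the link created by mfs_keep.
mfs_drop() {
    ipfs files rm "/$1"
}
```

For example, `mfs_keep <hash> video.mp4` followed by `ipfs files ls /` should list `video.mp4`.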

@Stebalien
Member

It looks like the trickledag importer may not be correctly flushing certain blocks to disk. If possible, could you try:

  1. Creating a new repo.
  2. Importing just the bad file.
  3. Extracting the file from ipfs (ipfs get).
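
The three steps can be scripted so that the existing repo is left untouched. This is a minimal sketch, assuming `ipfs` is on the PATH. The function name `repro_in_fresh_repo` and the temporary paths are illustrative, and the extraction uses whatever hash `ipfs add` prints.

```shell
# Hypothetical helper: import one file into a throwaway repo and read it back.
repro_in_fresh_repo() {
    local file scratch
    file=$1
    scratch=$(mktemp -d) || return 1
    (
        # IPFS_PATH selects the repo directory; the subshell keeps the override local.
        export IPFS_PATH="$scratch/repo"
        ipfs init >/dev/null || exit 1
        # -t selects the trickle layout, -Q prints only the final (root) hash.
        cid=$(ipfs add -Q -t "$file") || exit 1
        echo "imported as $cid"
        # Extract under the hash that `ipfs add` printed, then compare bytes.
        ipfs get -o "$scratch/copy" "$cid" >/dev/null || exit 1
        cmp "$file" "$scratch/copy"
    )
}
```

The function returns 0 only when the extracted copy is byte-identical to the input file.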

@Stebalien Stebalien added the kind/bug A bug in existing code (including security flaws) label Oct 28, 2018
@JazzTp
Author

JazzTp commented Oct 28, 2018

@Stebalien done, new repo, only this file:

ipfs add hashcode -t ends with no complaints
ipfs get hashcode hangs

Then on the same repo:
ipfs add (without -t) ends with no complaints
ipfs get hashcode does not hang and diff says the binary is OK (I got the originally uploaded one from the author, the source video)

Again made a new repo:
ipfs add (without -t) ends with no complaints
ipfs get hashcode does not hang and diff says the binary is OK

ALSO, on the "full" repo where I had also issued ipfs add hashcode (without -t):
ipfs get hashcode does not hang and diff says the binary is OK

The file size is 142 MB, I've added bigger files without any problems.

@schomatis schomatis self-assigned this Oct 28, 2018
@schomatis schomatis added the topic/UnixFS Topic UnixFS label Oct 28, 2018
@schomatis
Contributor

Hey @JazzTp, thanks for the detailed analysis, could you upload the file that causes the hang please? I'm trying to reproduce it with a 150 MB random file with no luck.

@schomatis schomatis added the status/in-progress In progress label Oct 29, 2018
@JazzTp
Author

JazzTp commented Oct 29, 2018

Hi @schomatis, I had sent a URL via e-mail to @Stebalien just after posting (checked your profile page, I'm also in Buenos Aires).

Here it goes, you can issue:

wget https://video.dtube.top/ipfs/QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak

It is the 1080p "source video" associated with this Steemit-DTube post:
https://steemit.com/movies/@peaceandfreedom/grxv6cd7

Here you have the two other files which gave the same problem, I have just repeated the same steps to be sure.

wget https://video.dtube.top/ipfs/QmXQkpUo9xmDygiKSW8TUajQpMaaFMykAUAHN3UbjWXMiJ
wget https://video.dtube.top/ipfs/Qmf6GSrbYzReff7Aj7oPGneT5UEceWCZfPMCFya8c95qMH


Alas, I cannot upload successfully from my node; I have spent a huge number of hours (days) trying. My ISP currently does not allow customers to open ports, and I have complained twice. I do have an OpenVPN service, but port forwarding works only temporarily and very seldom, which is the only time I can retrieve my files through external gateways. I tried proxifiers but couldn't proxify the ipfs daemon, so I didn't bother buying a proxy/VPN service with a dedicated IP that allows port forwarding. As I mentioned in my first post above, the problem has been raised here https://discuss.ipfs.io/t/using-ipfs-over-a-socks-proxy/440/4 with no answers offered so far.

@schomatis schomatis removed the kind/bug A bug in existing code (including security flaws) label Oct 29, 2018
@schomatis
Contributor

So, while trying to reproduce the issue I found the source of the confusion: the hash you get when adding a file with the -t argument will be different from the one obtained from DTube, because that one was (probably) added without the flag.

That is, ipfs add <dtube-hash> will return added <dtube-hash> while ipfs add -t <dtube-hash> will return added <another-hash> (if you add that again with -t you'll get the same <another-hash>). Something that we don't clarify enough here (and I'm the main person responsible for that :) is the meaning of content when talking about hashes. The hash you get (CID, actually) when you add something to IPFS is not the same as the one you'd get with, say, the sha1sum command; it also depends on how we organize the content (in which DAG topology, e.g., trickle) and the metadata we add to it.
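
The difference can be checked without writing anything to the repo. This is a minimal sketch, assuming `ipfs` is on the PATH. `layout_cids` is a made-up helper name, and the only flags it relies on are `-n`/`--only-hash`, `-Q`/`--quieter` and `-t`/`--trickle`.

```shell
# Hypothetical helper: print the root hash a file gets under each DAG layout.
# -n/--only-hash chunks and hashes without storing blocks; -Q prints only the root hash.
layout_cids() {
    local file default_cid trickle_cid
    file=$1
    default_cid=$(ipfs add -n -Q "$file") || return 1
    trickle_cid=$(ipfs add -n -Q -t "$file") || return 1
    echo "default: $default_cid"
    echo "trickle: $trickle_cid"
    # Exit status 0 when both layouts give the same hash, 1 otherwise.
    [ "$default_cid" = "$trickle_cid" ]
}
```

Whichever of the two lines matches the hash published by the uploader indicates the layout that was used for the original add.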

Also, the "ipfs hangs" problem is just go-ipfs indefinitely trying to look for something that doesn't exist while communicating nothing to the user who (understandably) assumes the command is dead and needs to be Ctrl+Ced.

I'm going to close this issue since this wasn't a bug after all, but feel free to reopen it if you think this wasn't the cause of the problem. Regarding connecting IPFS through a proxy, I think there are a couple of open issues about it, but if you don't find any please open a new one; I can help test the solution with my VPN provider since we are both in the same city.


OT: I think I've seen you play in a Miles Davis tribute, drop me a line anytime 😄

@ghost ghost removed the status/in-progress In progress label Oct 29, 2018
@schomatis schomatis added the topic/docs-ipfs Topic docs-ipfs label Oct 29, 2018
@JazzTp
Author

JazzTp commented Oct 29, 2018

Thank you very much @schomatis

You nailed it.


(new repo)

$ ipfs add QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak
added QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak

ipfs pin add QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak works


(new repo)

$ ipfs add QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak -t
added QmUN23DH5MW9zKVcGL7sVfR9jUVkzWPzv3t9LK5NsYJS5S QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak

ipfs pin add QmWTViQbbngUAPyD62PpmC1obPK7kxUPT6m2NnjTDHRSak doesn't exit
ipfs pin add QmUN23DH5MW9zKVcGL7sVfR9jUVkzWPzv3t9LK5NsYJS5S works

and QmUN23DH5MW9zKVcGL7sVfR9jUVkzWPzv3t9LK5NsYJS5S is the hash code which appears in the output of ipfs pin ls -t recursive

Previously, I had added the files of 40-50 DTube videos with no problems, but 2-3 months must have passed since the last time and DTube must have changed things a bit (*).

I should modify the script (bash, in turn called by a bash or a perl script) which calls ipfs add so that it checks if the call gives any different hash code, and in that case adds a pin on that code instead of the original one.

I should also store the correspondence in an associative list, to be able to check if that content is in the node, pinned, and to be able to remove the pin when needed.
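
The script change described above might look roughly like this. This is a minimal sketch, assuming `ipfs` is on the PATH and that each downloaded file is named after its gateway hash (as wget leaves it). The function names and the `PIN_MAP` file are illustrative choices, not ipfs conventions.

```shell
# Assumed location for the "original-hash<TAB>pinned-hash" record.
PIN_MAP=${PIN_MAP:-"$HOME/.dtube-pins.tsv"}

# Hypothetical helper: add a file, pin whatever hash `ipfs add` reports,
# and remember which original hash it belongs to.
add_and_pin() {
    local file original pinned
    file=$1
    original=$(basename "$file")
    pinned=$(ipfs add -Q -t "$file") || return 1
    if [ "$pinned" != "$original" ]; then
        echo "note: $original was re-hashed to $pinned" >&2
    fi
    # `ipfs add` pins by default; the explicit pin mirrors the steps in this thread.
    ipfs pin add "$pinned" >/dev/null || return 1
    printf '%s\t%s\n' "$original" "$pinned" >> "$PIN_MAP"
}

# Look up the pinned hash recorded for an original hash and remove that pin.
unpin_original() {
    local original pinned
    original=$1
    pinned=$(awk -F '\t' -v key="$original" '$1 == key { print $2; exit }' "$PIN_MAP")
    [ -n "$pinned" ] || return 1
    ipfs pin rm "$pinned" >/dev/null
}
```

A flat tab-separated file is used here instead of an in-memory associative list so that the bash and perl callers can both read it.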

On the other hand:
ipfs get theOriginalHashCode works and gives an identical file, named with the original hashcode.

(*) Also, snaps used to be available via ipfs get hashcode and not via the DTube gateway, whereas I think I got this snap file via the gateway, though I'm not sure (right now I see it's unavailable either way, except on this node in the "full" repo).


Also, the "ipfs hangs" problem is just go-ipfs indefinitely trying to look for something that doesn't exist while communicating nothing to the user who (understandably) assumes the command is dead and needs to be Ctrl+Ced.

Slightly OT: that also happens with ipfs get hashCodeOfUnavailableFile, which is a problem for a script running on a remote server that adds lists of hashcodes.

Is there a way to set a timeout, a maximum delay after which ipfs get hashcode exits if the download is stuck or didn't even begin? Or do I need to fork the bash script, so that a child calls ipfs get hashcode while its parent checks in the folder how the download is proceeding and possibly kills the child, or something like that...?

(It could be a global timeout value, also governing ipfs pin add hashcode, or there could be different timeouts, or a global one optionally overridden by different values for certain actions.)


OT: LOL we played that show on Thursday 16th, next date already fixed so far is 29/11 😄

@schomatis
Contributor

Is there a way to set a timeout, a maximum delay after which ipfs get hashcode exits if the download is stuck or didn't even begin?

There are timeouts in the code but I'm not sure if they are exposed in the commands or in the ~/.ipfs/config file. I also have the feeling there are issues open about this but if you don't find any please submit a new one :)

@Stebalien
Member

See: #5541. Basically, you can set a timeout with ipfs --timeout=... get however, that's probably not what you want.
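
In a batch script the global `--timeout` option could be used roughly as follows. This is a minimal sketch, assuming `ipfs` is on the PATH. `get_all_with_timeout` is a made-up helper name. The limit covers the whole command, so a slow but healthy download of a large file is cut off as well, which is presumably part of why it may not be what you want.

```shell
# Hypothetical helper: try each hash with a time limit and report the ones
# that did not complete, instead of blocking forever on the first bad one.
get_all_with_timeout() {
    local limit failed cid
    limit=$1
    failed=0
    shift
    for cid in "$@"; do
        # --timeout takes a duration such as 30s or 5m.
        if ! ipfs --timeout="$limit" get "$cid" >/dev/null 2>&1; then
            echo "gave up on $cid"
            failed=$((failed + 1))
        fi
    done
    # Exit status 0 only when every hash was fetched.
    [ "$failed" -eq 0 ]
}
```

For example, `get_all_with_timeout 5m hash1 hash2` prints one line per hash that failed or timed out.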

@JazzTp
Author

JazzTp commented Oct 29, 2018

Thanks again @schomatis and @Stebalien

Is there a way to set a timeout, a maximum delay after which ipfs get hashcode exits if the download is stuck or didn't even begin?

See: #5541. Basically, you can set a timeout with ipfs --timeout=... get however, that's probably not what you want.

I guess it's more appropriate that I ask there (although it's a closed issue as this one... opening many new issues on the same topic would be spreading info all around I'm afraid, although GitHub connects issues by simply mentioning one from inside the other).


As for this issue... one totally marginal observation:

On the other hand:
ipfs get theOriginalHashCode works and gives an identical file, named with the original hashcode.

I still find it a bit mind-boggling that ipfs add -t hashcode can give another hash code, which is the one we have to pin, although ipfs get theOriginalHashCode works and correctly produces the identical file named with the original hashcode (good!). The simplest to handle would be if the association with the new hashcode were managed internally and we could pin/unpin that file via either of the two hash codes. But of course I have no inside knowledge of what the trickledag upload does and no way to grasp the logic of this choice.

@Stebalien
Member

A bit late but... to answer your question, the "hash" of a file is actually the hash of the root node of a merkle-tree (if you're used to file systems, the leaves are "blocks" and the intermediate nodes are indirect blocks).

When importing a file through the trickle importer, we end up creating an intentionally lopsided tree that's optimized for streaming (start to end) but not for random access. The normal importer is optimized for random access.

Basically, you end up with two entirely different data structures depending on how you import the data.

Note: we need to use the hash of the root, not the entire file so we can verify pieces of the file as we receive them instead of having to download the entire file to check the hash.

(well, that's not the only reason but that's the most practical)
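
A toy model shows why the root hash depends on the tree shape and not only on the bytes. This is a sketch, not the real importer: actual IPFS nodes carry UnixFS metadata and links rather than concatenated digests, and the "lopsided" shape here is only a stand-in for the trickle layout. It assumes `sha256sum` from coreutils is available.

```shell
# Toy node hash: SHA-256 of a string, printed as hex.
node_hash() { printf '%s' "$1" | sha256sum | cut -d ' ' -f 1; }

# Balanced layout over four chunks: ((c1 c2) (c3 c4))
balanced_root() {
    local left right
    left=$(node_hash "$(node_hash "$1")$(node_hash "$2")")
    right=$(node_hash "$(node_hash "$3")$(node_hash "$4")")
    node_hash "$left$right"
}

# Lopsided layout over the same chunks: (c1 (c2 (c3 c4)))
lopsided_root() {
    local rest
    rest=$(node_hash "$(node_hash "$3")$(node_hash "$4")")
    rest=$(node_hash "$(node_hash "$2")$rest")
    node_hash "$(node_hash "$1")$rest"
}
```

Calling `balanced_root a b c d` and `lopsided_root a b c d` prints two different digests even though the four leaves are identical, which mirrors how the same video ends up with two hashes.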

@JazzTp
Author

JazzTp commented Nov 20, 2018

Thank you for explaining @Stebalien

The conclusion I draw is that I need to know for which files DTube uses the trickle importer. As I mentioned, that had apparently changed for some of the files associated with the posted video (I simply have to ask, via Discord, the same person who gave me the DTube-ipfs steps I was using).

@Stebalien
Member

Ah, you mean if you want to "recover" videos by re-adding them locally? Yes.

@JazzTp
Author

JazzTp commented Nov 21, 2018

Yes, to keep specific DTube files available. Thank you again.
