POST to /api/v0/dag/put?pin=true causes high CPU usage #4673

Closed
iamruinous opened this issue Feb 8, 2018 · 13 comments
Labels
kind/bug A bug in existing code (including security flaws) topic/perf Performance

Comments

@iamruinous

Version information:

go-ipfs version: 0.4.14-dev-9014c64f8
Repo version: 6
System version: amd64/darwin
Golang version: go1.9.3

Type:

Bug

Description:

When posting to /api/v0/dag/put?pin=true, my CPU is maxed out. pin=false does not have the same CPU utilization.

echo '""' > file.json
for run in {0..49}; do curl -X POST "http://localhost:5001/api/v0/dag/put?pin=true" -F "file=@file.json"; done
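
To make the pin=true versus pin=false difference easier to compare, the loop above could be wrapped in a small timing helper. This is only a sketch: `time_dag_put` and the `API` variable are hypothetical names, and it assumes a local daemon on the default API port as in the report.

```shell
# Base URL of the local daemon API (assumed default; override via the environment).
API="${API:-http://localhost:5001}"

# Hypothetical helper: send COUNT dag/put requests with the given pin setting
# and print the elapsed wall-clock time in whole seconds.
time_dag_put() {
    local pin="$1"      # "true" or "false"
    local count="$2"    # number of requests to send
    local file="$3"     # JSON file to upload
    local start end i=0
    start=$(date +%s)
    while [ "$i" -lt "$count" ]; do
        curl -s -X POST "$API/api/v0/dag/put?pin=$pin" -F "file=@$file" > /dev/null
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "pin=$pin: $((end - start))s for $count requests"
}
```

Running `time_dag_put true 50 file.json` followed by `time_dag_put false 50 file.json` should show whether the slowdown is tied to pinning.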
@Stebalien
Member

Mind running curl 'http://localhost:5001/debug/pprof/profile' > profile.out while you do that? I can't reproduce here and that'll let me see what's pegging your CPU.
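
Since the profile endpoint samples for a fixed window (30 seconds by default for Go's pprof handler), it helps to start the capture in the background and run the workload while it is sampling. A minimal sketch, where `capture_profile` and `API` are hypothetical names:

```shell
# Base URL of the local daemon API (assumed default; override via the environment).
API="${API:-http://localhost:5001}"

# Hypothetical helper: start a CPU profile capture in the background, run the
# given command while the profile is being sampled, then wait for the capture.
capture_profile() {
    local out="$1"
    shift
    curl -s "$API/debug/pprof/profile" > "$out" &
    local profiler=$!
    "$@"                # the workload to profile, e.g. the curl loop
    wait "$profiler"    # block until the profile has been written
}
```

For example, `capture_profile profile.out sh -c 'for i in $(seq 50); do curl -s -X POST "http://localhost:5001/api/v0/dag/put?pin=true" -F "file=@file.json"; done'` writes profile.out, which can then be inspected with `go tool pprof`.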

@iamruinous
Author

Here you go. I ran the curl loop a few times while it was profiling. htop showed 130-140% CPU usage.

profile.out.zip

@Stebalien
Member

Do you have a lot of pins? It looks like our handling of the pinset is really inefficient.

@iamruinous
Author

Well, considering ipfs pin ls has been running for 12 minutes and still hasn't returned... probably.

Is there a faster way to see how many pins I may have?

@Stebalien
Member

Run ipfs pin ls --type=recursive; ipfs pin ls --type=direct. Without --type, it lists every reachable pinned block, which is much slower.

However, I'm pretty sure you have a lot of directly pinned nodes.

@iamruinous
Author

⟩ ipfs pin ls --type=recursive | wc -l
   38956

⟩ ipfs pin ls --type=direct | wc -l
       0

--type=indirect and --type=all never returned.

@Stebalien
Member

Hm. Yeah, we need to optimize that.

@whyrusleeping
Member

Yep. @iamruinous is well into the land of "wow, we really didn't think through how the cdb pinset would handle that many things". Since this is now hurting real users, let's put a little more fuel on that fire. I like the proposal in #4675.

@whyrusleeping whyrusleeping added the kind/bug A bug in existing code (including security flaws) label Feb 10, 2018
@whyrusleeping whyrusleeping added this to the go-ipfs 0.4.15 milestone Feb 10, 2018
@XertroV

XertroV commented Feb 19, 2018

I'm having a similar issue, presumably for similar reasons.

$ ipfs pin ls --type=recursive | wc -l
   53805

Using it as the initial block propagation layer for a blockchain, hence the pinning.

AFAIK it'd be unsafe not to pin data.

Note: I'm not using dag/put, just add - lots of little files. Those 53k objects are about 30MB.

@whyrusleeping
Member

@XertroV You should probably use ipfs pin update. You don't want to pin each and every block recursively. Instead, pin the genesis block, and then for each new block you accept, run ipfs pin update to move the pin to the new block.
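
The workflow described above might look roughly like the following. This is a sketch under assumptions: `advance_head_pin` is a hypothetical helper, and it only works if each new block links back to its parent, so that one recursive pin on the head keeps the whole chain reachable.

```shell
# Hypothetical helper: keep a single recursive pin on the current chain head.
# With no previous head, pin the first block (ipfs pin add is recursive by
# default); otherwise move the existing pin to the new head.
advance_head_pin() {
    local old_head="$1" new_head="$2"
    if [ -z "$old_head" ]; then
        ipfs pin add "$new_head"
    else
        # By default pin update also removes the pin on the old head.
        ipfs pin update "$old_head" "$new_head"
    fi
}
```

Called once per accepted block, for example `advance_head_pin "$head" "$new_block"; head="$new_block"`, this keeps the pinset at one entry per chain head instead of one entry per block.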

@XertroV

XertroV commented Feb 20, 2018

@whyrusleeping - thanks for the advice. I've been peripherally aware there's a way for a blockchain structure to integrate with the merkledag but haven't found any good docs on how to link things together. Is there anything you could point me to?

Right now each block header, block, and tx have their own MH (there are some good reasons we do this at the tx level), and we need to be able to deal with multiple heads. Any pointers (ha!) would be very much appreciated.


Follow-up: after some searching, ipfs object put looks like the right idea, using {Data: , Links: [allMHLinks]}.

Initial thoughts:

  • Seems good - I presume the links are ordered and that order is preserved (makes sense from a hashing PoV)
  • I'm wondering what the forwards-compatibility of this model will be - one advantage we have of using 'simple' objects is that it's easy to upgrade. Alternatively I could re-write the serialization code to output a Tuple Buffer (List Links) and strip that from the objects themselves.
  • Are there libraries to recreate the hashes of these objects exactly? Would prefer not to commit 100% to IPFS in case we want to change to a different content layer later on (e.g. custom P2P or w/e really). I actually looked for such a library a few months ago but couldn't find one, and it seemed like the hashing code was deep within a unixfs lib (I never ended up finding the actual hashing/chunking code, just narrowed it down)

And answering my own questions:

  • Recreating hashes: It seems like js-ipld-dag-pb will give me back a multihash (e.g. via: dagPB.DAGNode.create(Buffer.from("genesis"), [], cb)).

@whyrusleeping
Member

Hey @XertroV I wrote a really simple/dumb blockchain on ipfs a little bit ago just to see how it would go. I pushed it up here just now: https://github.com/whyrusleeping/toychain
It might be worth a look, it just embeds the bits from ipfs it needs, and runs as its own daemon.

Regarding ipfs object put, you'll actually probably want to use ipfs dag put. It accepts any JSON structure as input, and I wrote a short "how-to" here: http://ipfs.git.sexy/sketches/ipld_intro.html

The code for the hashing of things (depending on how you're asking) is here: https://github.com/ipfs/go-ipld-cbor
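
As a rough illustration of the ipfs dag put suggestion, a block that links to its parent could be stored as below. `put_block` and the field names `parent` and `payload` are hypothetical, the payload is assumed to be a plain string with no characters that need JSON escaping, and the link is written in the `{"/": "<cid>"}` form that IPLD JSON input uses.

```shell
# Hypothetical helper: store a block as an IPLD object via ipfs dag put and
# print whatever ipfs prints (the new object's CID). An empty parent argument
# produces a genesis block without a parent link.
put_block() {
    local parent="$1" payload="$2"
    if [ -n "$parent" ]; then
        # {"/": "<cid>"} marks the value as a link to another object.
        printf '{"parent": {"/": "%s"}, "payload": "%s"}' "$parent" "$payload"
    else
        printf '{"payload": "%s"}' "$payload"
    fi | ipfs dag put
}
```

A chain can then be built with `genesis=$(put_block "" genesis)` followed by `next=$(put_block "$genesis" block-1)`, and the head kept alive with a single recursive pin.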

@Stebalien
Member

Closing in favor of #5221.

4 participants