Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make spawning super fast! #190

Closed
wants to merge 5 commits into from
Closed

Make spawning super fast! #190

wants to merge 5 commits into from

Conversation

richardschneider
Copy link
Contributor

@dryajov Some details for #188

Well it was having a bad night, the 10s/command is now down to ~1.5s/command. It's still slow.

I'll be very interested in seeing what unix reports. To get these results, you need to set the environment variable, in windows its set DEBUG=ipfsd-ctl*. Then npm run test:node.

  ipfsd-ctl:run ipfs.exe init -b 2048 +3s
  ipfsd-ctl:run ipfs.exe config API.HTTPHeaders.Access-Control-Allow-Origin ["*"] --json +2s
  ipfsd-ctl:run ipfs.exe config API.HTTPHeaders.Access-Control-Allow-Methods ["PUT","POST","GET"] --json +1s
  ipfsd-ctl:run ipfs.exe config Addresses.Swarm ["/ip4/127.0.0.1/tcp/0"] --json +1s
  ipfsd-ctl:run ipfs.exe config Addresses.API "/ip4/127.0.0.1/tcp/0" --json +1s
  ipfsd-ctl:run ipfs.exe config Addresses.Gateway "/ip4/127.0.0.1/tcp/0" --json +1s
          √ create node and init (7084ms)
  ipfsd-ctl:run ipfs.exe daemon +1s
          √ start node (3790ms)```

@ghost ghost added the status/in-progress In progress label Jan 27, 2018
@richardschneider
Copy link
Contributor Author

richardschneider commented Jan 28, 2018

@diasdavid @dryajov ipfsd-ctl uses subcommandte to run the command. It creates another NodeJS process that spawns the actual command. Seems very inefficient to me. subcommandte is fine when a long running child process needs to be killed when the parent the process is killed. However, this is not our case, our child process are short lived and we wait for them to close.

I'll try calling child_process directly and see what happens.

@codecov
Copy link

codecov bot commented Jan 28, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@1f3f172). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #190   +/-   ##
=========================================
  Coverage          ?   87.67%           
=========================================
  Files             ?       18           
  Lines             ?      698           
  Branches          ?        0           
=========================================
  Hits              ?      612           
  Misses            ?       86           
  Partials          ?        0
Impacted Files Coverage Δ
src/utils/exec.js 100% <100%> (ø)
src/utils/subcomandante-lite.js 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1f3f172...54cd1a8. Read the comment docs.

@richardschneider
Copy link
Contributor Author

It's now ~350ms/command.

        spawn an initialized node
  ipfsd-ctl:run ipfs.exe init -b 2048 +2s
  ipfsd-ctl:run ipfs.exe config API.HTTPHeaders.Access-Control-Allow-Origin ["*"] --json +1s
  ipfsd-ctl:run ipfs.exe config API.HTTPHeaders.Access-Control-Allow-Methods ["PUT","POST","GET"] --json +303ms
  ipfsd-ctl:run ipfs.exe config Addresses.Swarm ["/ip4/127.0.0.1/tcp/0"] --json +286ms
  ipfsd-ctl:run ipfs.exe config Addresses.API "/ip4/127.0.0.1/tcp/0" --json +289ms
  ipfsd-ctl:run ipfs.exe config Addresses.Gateway "/ip4/127.0.0.1/tcp/0" --json +291ms
          √ create node and init (2584ms)
  ipfsd-ctl:run ipfs.exe daemon +288ms

@daviddias
Copy link
Member

@richardschneider historically, we had many issues with ipfs process spawning and killing, to the point that using ipfs-desktop would spawn dozens of IPFS nodes without being able to effectively shut them down.

In order to move away from subcomandante, we need to be extremely careful and test well that we don't end up in that world again.

@richardschneider
Copy link
Contributor Author

@diasdavid "Test well", does that mean that all tests run on all platforms?

@daviddias
Copy link
Member

More than unit tests. I'm talking about linking it to apps like ipfs-desktop and using it in multiple platforms + run the js-ipfs-api tests with it to ensure that everything is running as expected.

@richardschneider
Copy link
Contributor Author

@diasdavid ran the js-ipfs-api tests in node and except for the 2 known pubsub test issues, everything passed. I also checked for zombie processes and none were found. During the test, at most only 4 js/go processes were active.

Then tried running the tests in the browser and as you can guess webpack failed with heap error.

I don't know about ipfs-desktop. How should I go about testing it?

package.json Outdated
"detect-node": "^2.0.3",
"hapi": "^16.6.2",
"hat": "0.0.3",
"ipfs-api": "^17.3.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we keep this as a devDependency? We don't want to bundle this, in case the consuming project already defines their own version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that. It seems that npm install just does that.

@dryajov
Copy link
Member

dryajov commented Jan 29, 2018

@richardschneider I wonder why such a performance bust? I'd imagine go being a bit lighter than node in general?

@richardschneider
Copy link
Contributor Author

richardschneider commented Jan 29, 2018

@dryajov I beginning to feel that killing a node subprocess is causing the issue, especially on Windows. See @diasdavid comment. This is the reason for #195

I'm fighting other fires right now, so will eventually get back to this.

@daviddias
Copy link
Member

I don't know about ipfs-desktop. How should I go about testing it?

@hacdias can you give a hand here?

@richardschneider
Copy link
Contributor Author

@diasdavid I've been waiting for ipfsd-ctl to settle down before continuing with this PR.

I'll rebase real soon and continue my investigations.

@daviddias
Copy link
Member

@richardschneider thank you :)

@daviddias
Copy link
Member

@richardschneider wanna rebase now?

@richardschneider
Copy link
Contributor Author

@diasdavid I need graceful stop before rebasing. What does another week matter.

@daviddias
Copy link
Member

@richardschneider time to rebase master onto this branch! :D

@daviddias daviddias changed the title [WIP] Performance Make spawning super fast! Mar 14, 2018
@dryajov
Copy link
Member

dryajov commented Mar 14, 2018

I've ran this branch locally and got this results:

real    4m20.749s
user    3m19.980s
sys     0m24.760s

In comparison, master yields this:

real    4m45.159s
user    3m39.185s
sys     0m27.601s

@daviddias
Copy link
Member

I've rebased master onto this branch, here are my results:

# master
npm test  159.45s user 27.56s system 65% cpu 4:45.73 total
# perf
npm test  134.43s user 21.23s system 63% cpu 4:06.89 total

@dryajov
Copy link
Member

dryajov commented Mar 16, 2018

This seems to work as expected. Should we go with it (I'm 👍)?

@richardschneider if we merge this, lets not forget to remove subcomandante from the package.json deps

@richardschneider
Copy link
Contributor Author

@dryajov done!

Copy link
Member

@dryajov dryajov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@daviddias
Copy link
Member

Please take into account my comment above -- #190 (comment) -- simply trusting CI won't do it for this PR.

@dryajov
Copy link
Member

dryajov commented Mar 20, 2018

@hacdias how can we test this with ipfs-desktop?

@ghost ghost assigned dryajov Mar 23, 2018
@travisperson
Copy link
Member

TL;DR; - I think we need to look into this further to make sure we are handling things correctly and allowing gracefully shutdown across all platforms, while also being a friendly module to users consuming ipfsd-ctl.


I believe it is generally frowned upon to listen to the main process signals in a library. The issue is that when listening on SIGINT and SIGTERM the default node handlers are removed that cause the process to exit. Any module / application must implement its own handlers if it wishes to exit from the signals when the event loop is not empty if it consumes this module.

The subcomandante module gets around this as it does not listen on these signals from the consumed module, but from the forked child subcom, and polls for the parent PID.

It's a little dense and hard to parse, but the section on Signal Events provides some insight to how nodejs signals are handled.

I think it's also important to note this line from the process docs with regard to windows

Sending SIGINT, SIGTERM, and SIGKILL cause the unconditional termination of the target process.

The standard ChildProcess.prototype.kill sends a SIGTERM, which cannot be handled on windows, subcomandante doesn't really handle this well either. It will turn a SIGKILL into a SIGHUP though.

Both nodejs and golang exit the process on a SIGHUP.

Side note of Windows

SIGHUP is generated on Windows when the console window is closed [...] however, Node.js will be unconditionally terminated by Windows about 10 seconds later

The go-ipfs project gracefully handles the SIGHUP. However, js-ipfs does not and instead will immediately terminate.

I don't have a good solution to this at the moment, but I do think we need to dig a bit further to see if we can come up with a better solution.


This thread (nodejs/node#12378) on the node project provides more insight as well, and is worth a read through.

Signals in golang can also be read up on (https://golang.org/pkg/os/signal) here.

@hacdias
Copy link
Member

hacdias commented Apr 5, 2018

@dryajov sorry. Completely missed this.

Time elapsed from 'DaemonFactory.create().spawn(...' to 'node.start':

v0.31.0: 3251.473ms
This branch: 3177.366ms

Don't forget that between the process of those two functions there are a lot more things going on on IPFS Desktop.

Everything else seems to be working as expected. I'll leave IPFS Desktop running with this branch to see if I fall into any issues.

@dryajov
Copy link
Member

dryajov commented Apr 5, 2018

Thanks @hacdias!

@dryajov
Copy link
Member

dryajov commented Apr 5, 2018

I agree, we need to look into this further. I think that by removing the intermediate subcomandante process we fall back into the same issues subcomandante is trying to mitigate - namely forcefully killing ipfsd-ctl will leave a zombi ipfs process.

We should definitely handle the SIGHUP signal in js-ipfs, and it even works on windows - amazing 👅 !

@daviddias
Copy link
Member

I'm closing this PR given that the risk / speed tradeoff doesn't justify spending much more time into it right now.

@daviddias daviddias closed this Jun 4, 2018
@ghost ghost removed the status/in-progress In progress label Jun 4, 2018
@daviddias daviddias deleted the perf branch June 4, 2018 08:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants