Graceful stop #205

Merged
merged 25 commits into from
Mar 14, 2018

Conversation

richardschneider
Contributor

  • Use api/v0/shutdown to kill a daemon process
  • Get tests working on Windows

See ipfs/js-ipfs#1192 for background details.

Note that this is blocked on ipfs/js-ipfs#1224. Once that is fixed, change package.json to use the latest ipfs release.
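For readers unfamiliar with the route mentioned above, here is a minimal sketch of asking a daemon to stop through its HTTP API instead of signalling the process. The helper names (`shutdownRequestOptions`, `requestShutdown`), the default host and port, and the use of POST are assumptions made for illustration; none of this is taken from the PR's code.

```javascript
const http = require('http')

// Build request options for the shutdown route.
// The default host and port are assumptions (a local API listener).
function shutdownRequestOptions (host = '127.0.0.1', port = 5001) {
  return { host, port, path: '/api/v0/shutdown', method: 'POST' }
}

// Ask the daemon to stop. The callback fires exactly once, either when
// the response ends or when the request errors (the daemon may drop the
// connection while it is going down).
function requestShutdown (options, callback) {
  let done = false
  const finish = (err, status) => {
    if (done) { return }
    done = true
    callback(err, status)
  }

  const req = http.request(options, (res) => {
    res.resume() // drain the body so that 'end' is emitted
    res.once('end', () => finish(null, res.statusCode))
  })
  req.once('error', finish)
  req.end()
}

const opts = shutdownRequestOptions()
// Not invoked here because it needs a running daemon:
// requestShutdown(opts, (err, status) => { /* daemon asked to stop */ })
```

A caller would typically still wait for the child process's exit event after the request returns, since the HTTP response only acknowledges the request.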

@richardschneider
Contributor Author

@diasdavid any comments?

package.json Outdated
@@ -100,7 +100,7 @@
     "detect-port": "^1.2.2",
     "dirty-chai": "^2.0.1",
     "go-ipfs-dep": "0.4.13",
-    "ipfs": "^0.27.5",
+    "ipfs": "github:ipfs/js-ipfs",
Member

@daviddias daviddias Feb 18, 2018

Let's hold this PR until we have a new release of js-ipfs (it can happen soon).

Member

It's here, can't wait to get this in :)

screen shot 2018-02-21 at 12 38 57 pm

@codecov

codecov bot commented Feb 21, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@4cc69a0).
The diff coverage is 90%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #205   +/-   ##
=========================================
  Coverage          ?   86.92%           
=========================================
  Files             ?       17           
  Lines             ?      673           
  Branches          ?        0           
=========================================
  Hits              ?      585           
  Misses            ?       88           
  Partials          ?        0
Impacted Files            Coverage Δ
src/factory-in-proc.js    86.95% <100%> (ø)
src/ipfsd-daemon.js       91.15% <77.77%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4cc69a0...36aa183.

@richardschneider
Contributor Author

@dryajov I've rebased on your changes. The config hack really speeds things up!

Still waiting on a js-ipfs release that has a shutdown route. Maybe @diasdavid will give us some love.

Could you look into these issues?

  • The spawn tests fail when run in the browser
  • The port 9999 issue still occurs on macOS

@dryajov
Member

dryajov commented Feb 22, 2018

@richardschneider I'm trying to figure out why the tests are failing now, but haven't been able to do so yet. As for the port issue, we determined that it's not related to ipfsd-ctl itself, but rather to something else running on those ports on macOS (#209).

@richardschneider
Contributor Author

@dryajov cheers!

@dryajov
Member

dryajov commented Feb 22, 2018

The issue seems to be related to the version entry missing from IndexedDB on the first run. I can't debug it, because running the tests in the browser a second time works fine; it only seems to fail on the first run. Here is the error:

screen shot 2018-02-22 at 4 47 55 pm

Skipping the version test allows the rest of the tests to pass.

@dryajov
Member

dryajov commented Feb 26, 2018

@diasdavid @victorbjelkholm any idea of what might be causing this - #205 (comment)?

@richardschneider
Contributor Author

@diasdavid Thanks for the ipfs v0.28 release. This PR is now ready for review and merge.

@daviddias
Member

@richardschneider all CI is failing. Errors:

image

image

Let's be mindful of everyone's time and request a review or a merge only when the code is fully ready.

@JonKrone
Contributor

JonKrone commented Mar 8, 2018

image

This happens on proc nodes during the ipfsd.version() test because the IPFS instance it creates to get the version hasn't finished booting. There's a fix for this in my PR that addresses some ipfsd initialization bugs. @dryajov This is the same error you posted earlier.

image

This issue is basically the same thing at a different point - proc nodes aren't waiting for their IPFS instance to finish booting before attempting to work with them. Here's the fix.

Those initialization changes should address the CI issues here and then I think we've got green-green-green IPFSD CI! 😁

@ghost ghost assigned dryajov Mar 8, 2018
@JonKrone
Contributor

JonKrone commented Mar 8, 2018

@richardschneider @dryajov

Jenkins and Travis are green ✔️

@richardschneider
Contributor Author

@diasdavid Is this now ready for a merge?

const ipfs = new IPFS(options)
ipfs.once('ready', () => {
  ipfs.version((err, _version) => {
    if (err) { callback(err) }
Member

needs a return. Otherwise chaos will happen.

Contributor Author

Agree, extremely bad code!!!
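To make the review point concrete, here is a small sketch of what goes wrong without the `return`: the error branch runs, execution falls through, and the callback is invoked a second time. `fakeVersion`, `withoutReturn`, `withReturn` and `countCalls` are hypothetical names used only for this illustration.

```javascript
// Hypothetical stand-in for an asynchronous version lookup that can fail.
function fakeVersion (shouldFail, cb) {
  if (shouldFail) { return cb(new Error('version lookup failed')) }
  cb(null, { version: '0.0.0-example' })
}

// Mirrors the reviewed code: no return after reporting the error.
function withoutReturn (shouldFail, callback) {
  fakeVersion(shouldFail, (err, version) => {
    if (err) { callback(err) } // falls through to the next line
    callback(null, version)
  })
}

// Mirrors the suggested fix: return stops the handler after the error.
function withReturn (shouldFail, callback) {
  fakeVersion(shouldFail, (err, version) => {
    if (err) { return callback(err) }
    callback(null, version)
  })
}

// Count how many times a function invokes its callback on failure.
function countCalls (fn) {
  let calls = 0
  fn(true, () => { calls++ })
  return calls
}

const buggyCalls = countCalls(withoutReturn)
const fixedCalls = countCalls(withReturn)
console.log({ buggyCalls, fixedCalls })
```

The caller of the buggy variant first sees the error and then a spurious success with an undefined version, which is the "chaos" the reviewer refers to.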

   *
   * @param {function()} callback - Called when the process was killed.
   * @returns {undefined}
   */
  killProcess (callback) {
    // need a local var for the closure, as we clear the var.
    const self = this
Member

Self is not needed.

Contributor Author

Disagree, it is needed in subprocess.once('exit', ...

this should not be used in a lambda expression! See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions
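As background for this exchange, the sketch below shows how `this` resolves in the two callback styles when a listener is attached to a Node.js EventEmitter. `FakeDaemon` and its methods are hypothetical; the emitter merely stands in for a child process, and the field is cleared before the event fires, as the code comment in the reviewed snippet describes.

```javascript
const EventEmitter = require('events')

class FakeDaemon {
  constructor () {
    // Hypothetical stand-in for a spawned child process.
    this.subprocess = new EventEmitter()
  }

  // Arrow function: `this` is inherited from the enclosing method,
  // so it still refers to the daemon when the event fires.
  killWithArrow () {
    const subprocess = this.subprocess
    let sawDaemon = false
    subprocess.once('exit', () => {
      sawDaemon = this instanceof FakeDaemon
    })
    this.subprocess = null // cleared before the event fires
    subprocess.emit('exit') // emit is synchronous
    return sawDaemon
  }

  // Function expression: the emitter calls the listener with `this`
  // set to the emitter, so a captured variable is needed for the daemon.
  killWithFunction () {
    const self = this
    const subprocess = this.subprocess
    let sawEmitter = false
    subprocess.once('exit', function () {
      sawEmitter = this === subprocess
      self.exited = true
    })
    this.subprocess = null
    subprocess.emit('exit')
    return sawEmitter
  }
}

const arrowSawDaemon = new FakeDaemon().killWithArrow()
const functionSawEmitter = new FakeDaemon().killWithFunction()
console.log({ arrowSawDaemon, functionSawEmitter })
```

Whichever style is used, a value that is reassigned on the object before the event fires (here `subprocess`) has to be captured in a local variable if the listener needs it.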

Member

@daviddias daviddias left a comment

Still not finished. CI is also not always green.

ipfs.version((err, _version) => {
  if (err) { return callback(err) }
  callback(null, _version)
})
Member

Is this starting the node? I'm sure this is causing repo lock just to check the version.

Contributor

Hey @dryajov, curious about this as there is a TODO to fix this on the go/js factory. Did the mentioned solution hit roadblocks originally? Can we make an issue and work on it separately from this PR?

      ? node.start(options.args, cb)
      : cb()
  ], (err) => callback(err, node))
  node.exec.once('ready', () => {
Member

Something isn't quite right here. The node is created, but then we listen for the ready event on exec??

@dryajov
Member

dryajov commented Mar 12, 2018

I'm reworking proc nodes to propagate the ready event. That is the only reliable way of ensuring that the node is fully constructed when interacting with it. Ideally, the user doesn't have to care about it as it would be taken care of by the factory.
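A rough sketch of what propagating the ready event could look like: the wrapper listens once on the inner instance and re-emits the event under its own name, so callers only ever deal with the wrapper. `FakeExec` and `ProcNode` are hypothetical names, and the inner instance is a plain EventEmitter rather than a real IPFS node.

```javascript
const EventEmitter = require('events')

// Hypothetical stand-in for the in-process IPFS instance.
class FakeExec extends EventEmitter {}

// Hypothetical wrapper that forwards readiness to its own listeners.
class ProcNode extends EventEmitter {
  constructor (exec) {
    super()
    this.exec = exec
    this.ready = false
    this.exec.once('ready', () => {
      this.ready = true
      this.emit('ready')
    })
  }
}

const exec = new FakeExec()
const node = new ProcNode(exec)

let readyEvents = 0
node.once('ready', () => { readyEvents++ })

// Simulate the inner instance finishing its asynchronous boot.
exec.emit('ready')
console.log({ ready: node.ready, readyEvents })
```

A factory could wait for the wrapper's event before handing the node to the caller, which matches the stated goal that the user should not have to care about it.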

@richardschneider
Contributor Author

@dryajov Seems that this PR is being re-purposed. Can we get some closure?

@dryajov
Member

dryajov commented Mar 12, 2018

This change is required to get all CIs green. I'm fine with making the change in a different PR but this would depend on it. I'll create a separate PR.

@richardschneider
Contributor Author

@dryajov If it's needed, then please add it here. I'm just concerned that changes are creeping in here that have nothing to do with stopping.

@dryajov
Member

dryajov commented Mar 13, 2018

jenkins is failing because of:

screen shot 2018-03-13 at 12 07 50 am

@daviddias
Member

@dryajov Travis still fails with a bunch of timeouts

node.init((err) => {
  if (err) { return callback(err) }
  node.version(callback)
})
Member

It is very odd to me that we start a new instance of IPFS to get the version. This will create a memory leak, and it will lock the repo for a subsequent node spawn. @dryajov can you expand on why it is done this way?

Member

@diasdavid I'm not aware of any other way to get the version - is there a better one you can suggest?

Member

btw, the instance is not started, it is simply initialized with a repo, to start it we would have to call node.start.

Member

@dryajov are you saying that a repo is created just to get the version of ipfs? Sounds like node.version should work offline/uninitialized for the info it can provide, namely the impl version.

If it was absolutely impossible (which it isn't), at least there should be a cleanup step here.

Member

Also, I thought about just using ipfs-repo directly to read the version, but it turns out we still need to initialize a repo to get the version, since there is no guarantee that the default repo in ~/.jsipfs corresponds to the version of the IPFS in our exec param.

Member

Well, it turns out that the version is read by the repo here - https://github.com/ipfs/js-ipfs/blob/494da7f8c5b35f491f22a986ff5e8c456cc0e602/src/core/components/version.js#L14...L24. I think that's overkill and we should separate the repo version from the ipfs version, so that we don't need a repo. If no one objects I can go ahead and make that change in IPFS.

Member

There is a need for a Repo version and IPFS version, that is correct. It should be ok to get the IPFS version without the Repo existing though.

Member

Agreed - I'm making the change in IPFS to allow for that.
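The separation agreed on in this thread could look roughly like the sketch below: the implementation version is answered from static data, and the repo version is only added when a repo is actually available. `IMPL_VERSION`, `getVersion` and the fake repo are hypothetical and do not reflect the actual change made in js-ipfs.

```javascript
// Hypothetical: static implementation version, e.g. read from package.json.
const IMPL_VERSION = '0.0.0-example'

// Report the implementation version always, and the repo version
// only when a repo object is supplied.
function getVersion (repo, callback) {
  const result = { version: IMPL_VERSION }
  if (!repo) { return callback(null, result) } // no repo needed

  repo.version((err, repoVersion) => {
    if (err) { return callback(err) }
    result.repo = repoVersion
    callback(null, result)
  })
}

// Without a repo: nothing is created or locked.
let withoutRepo
getVersion(null, (err, v) => { withoutRepo = v })

// With a (fake) repo that reports its own version.
const fakeRepo = { version: (cb) => cb(null, 6) }
let withRepo
getVersion(fakeRepo, (err, v) => { withRepo = v })

console.log(withoutRepo, withRepo)
```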

@@ -68,6 +72,8 @@ class Node {
       EXPERIMENTAL: this.opts.EXPERIMENTAL,
       libp2p: this.opts.libp2p
     })
+
+    this.exec.once('ready', () => this.emit('ready'))
Member

Why use a different pattern than the one used for ipfs-daemon (new Node + node.start)?

Member

@dryajov dryajov Mar 13, 2018

I believe not doing that can end up in race conditions. Take a look at maybeOpenRepo: it's called by the IPFS constructor through boot(self), but it is async, and the only way of knowing when it has finished is by hooking into the ready event.

Member

What weirds me out is that in ipfsd-in-proc, the IPFS instance starts (and therefore does network activity, locks the repo, etc.) in the constructor, while in ipfsd-daemon, the daemon only starts when .start is called. This asymmetry will lead to confusion and debugging issues in the future.

Member

Ah, I see. We're not actually starting the node, since IPFS is constructed with both init and start set to false; take a look here - https://github.com/ipfs/js-ipfsd-ctl/blob/a61fa50d6834e61465398cd1d54e60ec913358eb/src/ipfsd-in-proc.js#L67...L74. This was done specifically to keep that symmetry with the other types of daemons; the ready event is there to make sure we don't trip over anything.

Member

I see. Makes sense then. Thanks for clarifying :)

@dryajov
Member

dryajov commented Mar 13, 2018

So the issue ended up being browser only. By default, if the repo doesn't exist, the static repo version is used, but in the browser the error returned was not being matched correctly, hence no version was being returned at all. The fix in ipfs/js-ipfs#1262 should address that.
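The failure mode described here can be sketched as follows: a fallback to a static version only triggers when the "repo is missing" error is recognised, so an unrecognised error shape from another storage backend skips the fallback. Everything in this sketch is hypothetical, including the error codes, the messages, the `STATIC_REPO_VERSION` value and the helper names; it is not the code from ipfs/js-ipfs#1262.

```javascript
// Hypothetical static default used when no repo exists yet.
const STATIC_REPO_VERSION = 6

// Hypothetical predicate: different storage backends may report a
// missing repo with different error shapes, so match more than one.
function isRepoMissing (err) {
  return err.code === 'ENOENT' || /not found|not yet opened/i.test(err.message)
}

function repoVersionOrDefault (repo, callback) {
  repo.version((err, version) => {
    if (err) {
      if (isRepoMissing(err)) { return callback(null, STATIC_REPO_VERSION) }
      return callback(err) // unrelated failure: surface it
    }
    callback(null, version)
  })
}

// Fake repos that fail in a filesystem-like and a browser-like way.
const fsStyle = { version: (cb) => cb(Object.assign(new Error('no such file'), { code: 'ENOENT' })) }
const browserStyle = { version: (cb) => cb(new Error('Key not found')) }
const broken = { version: (cb) => cb(new Error('disk on fire')) }

const results = {}
repoVersionOrDefault(fsStyle, (err, v) => { results.fs = v })
repoVersionOrDefault(browserStyle, (err, v) => { results.browser = v })
repoVersionOrDefault(broken, (err) => { results.brokenError = err.message })
console.log(results)
```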

@daviddias
Member

Thank you @dryajov :) I'll release a patch version of js-ipfs as soon as I have decent Internet again (npm is not friendly to in-flight wifi).

@ghost ghost assigned daviddias Mar 14, 2018
@daviddias
Member

Seems that it is all good \o/

image

Mac OS fails because Jenkins doesn't remember how to install npm anymore //cc @victorbjelkholm

@daviddias daviddias merged commit 359dd62 into master Mar 14, 2018
@ghost ghost removed the status/in-progress In progress label Mar 14, 2018
@daviddias daviddias deleted the graceful-stop branch March 14, 2018 14:20
4 participants