Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Timers broken in 0.10.34 #8897

Closed
apendua opened this issue Dec 18, 2014 · 22 comments
Closed

Timers broken in 0.10.34 #8897

apendua opened this issue Dec 18, 2014 · 22 comments

Comments

@apendua
Copy link

apendua commented Dec 18, 2014

After updating to 0.10.34 my app keeps throwing this error:

timers.js:432
    timeSinceLastActive = now - cur._monotonicStartTime;
                                   ^
TypeError: Cannot read property '_monotonicStartTime' of null
    at Timer.unrefTimeout [as ontimeout] (timers.js:432:36)

well, I looked into your source code here:

https://github.com/joyent/node/blob/v0.10.34-release/lib/timers.js#L430

and it seems that something is going wrong there, right?

  var cur = unrefList._idlePrev;
  while (cur != unrefList) {
    timeSinceLastActive = now - cur._monotonicStartTime;

    if (timeSinceLastActive < cur._idleTimeout) {

Well I think you forgot to make unrefList a Timer object here:

https://github.com/joyent/node/blob/v0.10.34-release/lib/timers.js#L508

it's just a guess based on the fact that _idlePrev is only initialized in the Timer constructor.

@apendua
Copy link
Author

apendua commented Dec 18, 2014

BTW, don't guys test your software before releasing it to public?

@micnic
Copy link

micnic commented Dec 18, 2014

do you have a minimal code to produce this exception?

@apendua
Copy link
Author

apendua commented Dec 18, 2014

@micnic Good question. I am currently at work, but will try to help you with that later this day.

@misterdjules
Copy link

Thanks @apendua for reporting the issue, I'm investigating. Please let us know when you find the code that reproduces it.

@apendua
Copy link
Author

apendua commented Dec 18, 2014

@misterdjules Currently working on this one.

@ploer
Copy link

ploer commented Dec 18, 2014

Just wanted to add that I started seeing this one in production as well on upgrade to 0.10.34; haven't yet been able to find a simple reproduction but will keep looking.

@ploer
Copy link

ploer commented Dec 18, 2014

The following code demonstrates the same error for me:

var timers = require('timers');

var foo = new function() {
  this._onTimeout = function() {
    console.log("in foo._onTimeout");
  }
}();

var bar = new function() {
  this._onTimeout = function() {
    console.log("in bar._onTimeout");
    timers.unenroll(foo);
  }
}();

timers.enroll(bar, 1000);
timers._unrefActive(bar);

timers.enroll(foo, 3000);
timers._unrefActive(foo);

setTimeout(function() {}, 5000);

Output:

ploer-macbook-4:timer_test ploer$ node t2.js
unrefList initialized
unrefTimer initialized
unrefTimer scheduled
unrefList append to end
unrefList append to end
unrefTimer fired
unreftimer firing timeout
in bar._onTimeout
unenroll
unreftimer firing timeout
in foo._onTimeout

timers.js:432
    timeSinceLastActive = now - cur._monotonicStartTime;
                                   ^
TypeError: Cannot read property '_monotonicStartTime' of null
    at Timer.unrefTimeout [as ontimeout] (timers.js:432:36)
ploer-macbook-4:timer_test ploer$ 

@ploer
Copy link

ploer commented Dec 18, 2014

And under 0.10.33, this same code appears to work fine:

ploer-macbook-4:timer_test ploer$ node --version
v0.10.33
ploer-macbook-4:timer_test ploer$ node t2.js
unrefList initialized
unrefTimer initialized
unrefList empty
unrefTimer scheduled
unrefList find where we can insert
unrefList append to end
unrefTimer fired
unreftimer firing timeout
in bar._onTimeout
unenroll
unrefList is empty
timeout callback 5000
now: 104385711
5000 list empty
ploer-macbook-4:timer_test ploer$ 

@ploer
Copy link

ploer commented Dec 18, 2014

As a side note, with the new list traversal, the following lines at timers.js:457 look like they might be a problem if they ever triggered:

    if (!cur._onTimeout) continue;

    if (domain && domain._disposed)
      continue;

It doesn't look to me like cur has been advanced down the list at this point, so it seems like it might just loop indefinitely on either of these conditions.

@ploer
Copy link

ploer commented Dec 18, 2014

The source of the bug appears to be that when you call itemToDelete._onTimeout(), any node in the list can be mutated, so holding open a list iteration and continuing it after this call is unsafe.

With the current list implementation, it seems like to be safe, you would need to do something like scan the full list, and build a local list of all of the items that look like they're ready to fire, and only then iterate over that list and call each item's _onTimeout() (and I guess check as you iterated to see if any of them had become unenrolled in the meantime).

This issue appears to have been introduced in 934bfe2.

@misterdjules
Copy link

@ploer Thanks for the minimal reproduction. Yes, the block of code that early-outs in the loop seems problematic, we may need to open a separate issue for that.

@misterdjules
Copy link

I'm working on a fix, I'll keep you posted. Thank you again!

@trevnorris
Copy link

@apendua

BTW, don't guys test your software before releasing it to public?

I appreciate you taking the time to report this issue, but blatantly snide remarks like this aren't welcome. If you'd taken the time to look at the commit where this code was changed (934bfe2) you'd see there was one additional test added. Not including all the other additional timers tests that are already there.

@apendua
Copy link
Author

apendua commented Dec 18, 2014

@trevnorris I totally agree with you. I am sorry guys, didn't mean to insult you! I should have waited until emotions go away before submitting this issue report. I really appreciate how fast you responded.

misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 18, 2014
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.
misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 19, 2014
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.
misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 19, 2014
@misterdjules
Copy link

@apendua @ploer We now have a candidate for a fix with #8905, thank you very much for your help.

@ploer
Copy link

ploer commented Dec 19, 2014

@misterdjules, thanks for the quick fix!

Reading through the change, I am wondering if there is still a slight change from the behavior in 0.10.33.

In 0.10.33, I believe that if timer event bar unenrolls timer event foo which would otherwise be ready to fire inside the same unrefTimeout(), then timer event foo never fires.

With this change, it appears to me that if bar and foo are both added to timersToTimeout in the same pass, both will now fire, even if bar fires first and unenrolls foo.

I have no idea if this difference matters, it seems likely to be a rare edge case where the only impact is that a timeout callback fires unnecessarily, and I would expect any reasonable caller to tolerate that. But I thought I'd point it out for your consideration.

@misterdjules
Copy link

@ploer Indeed, that seems like a very good observation, thank you again! I'm working on fixing that too.

@misterdjules
Copy link

@ploer Fixed in f78bd0e, please let me know what you think.

@ploer
Copy link

ploer commented Dec 19, 2014

@misterdjules, looks good to me.

@trevnorris
Copy link

@apendua No worries. And again thanks for taking the time to report the issue. :)

@misterdjules misterdjules added this to the 0.10.35 milestone Dec 22, 2014
misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 22, 2014
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.
misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 22, 2014
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.
misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 22, 2014
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.
misterdjules pushed a commit to misterdjules/node that referenced this issue Dec 22, 2014
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.

Reviewed-By: Timothy J Fontaine <tjfontaine@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@misterdjules
Copy link

Thank you all, fixed in fd2cb7c.

@trevnorris
Copy link

@misterdjules Thanks. I'll need to make sure this gets into #8886 .

jasnell pushed a commit to jasnell/node-joyent that referenced this issue Jan 5, 2015
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.

Reviewed-By: Timothy J Fontaine <tjfontaine@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
jasnell pushed a commit to jasnell/node-joyent that referenced this issue Jan 5, 2015
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes nodejs#8897.

Reviewed-By: Timothy J Fontaine <tjfontaine@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Fishrock123 pushed a commit to Fishrock123/node that referenced this issue Aug 31, 2015
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes: nodejs/node-v0.x-archive#8897
PR-URL: nodejs/node-v0.x-archive#8905

Reviewed-By: Timothy J Fontaine <tjfontaine@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>

Conflicts:
	lib/timers.js
misterdjules pushed a commit to nodejs/node that referenced this issue Sep 2, 2015
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes: nodejs/node-v0.x-archive#8897

Conflicts:
	lib/timers.js

Fixes: nodejs/node-convergence-archive#23
Ref: #268
PR-URL: #2540
Reviewed-By: bnoordhuis - Ben Noordhuis <info@bnoordhuis.nl>
misterdjules pushed a commit to nodejs/node that referenced this issue Sep 3, 2015
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes: nodejs/node-v0.x-archive#8897

Conflicts:
	lib/timers.js

Fixes: nodejs/node-convergence-archive#23
Ref: #268
PR-URL: #2540
Reviewed-By: bnoordhuis - Ben Noordhuis <info@bnoordhuis.nl>
Fishrock123 pushed a commit to Fishrock123/node that referenced this issue Sep 3, 2015
Commit 934bfe2 had introduced a
regression where node would crash trying to access a null unref timer if
a given unref timer's callback would remove other unref timers set to
fire in the future.

More generally, it makes the unrefTimeout function more solid by not
mutating the unrefList while traversing it.

Fixes: nodejs/node-v0.x-archive#8897

Conflicts:
	lib/timers.js

Fixes: nodejs/node-convergence-archive#23
Ref: nodejs#268
PR-URL: nodejs#2540
Reviewed-By: bnoordhuis - Ben Noordhuis <info@bnoordhuis.nl>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants