-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failed test: test-child-process-fork-regr-gh-2847 #3635
Comments
Also failed in #3631 (comment), I'm not sure if that is one of those two runs. The test was added in 36b969f by @indutny and then modified in e9e837c by @mhdawson. |
@gireeshpunathil can you take a look at the windows failures since you worked on the the most recent change. |
@mhdawson , sure, I will debug to see what the issue is. |
When run in many iterations, I am able to see it failing with the original test case as well (36b969f). So this is not a regression. There are three types of errors observed in total. With e9e837c (a) and (b) seem to have eliminated, but (c) persists. (a)
(b)
(c)
The summary of the test case is to validate that the worker process being closed have its _channel field nullified. To re-inforce this assertion, a number of requests are sent from the master. The problem with the test case is that it does not handle the failure scenarios which can occur when the server is shutdown at arbitrary time intervals: Tweaking test case further, I see that even sending just two messages to the worker is sufficient to cause (c). More interestingly, as the error type (c) is coming from the TCP stack, detached from the net connect, send APIs, their error callbacks are incapable of catching this error. This essentially means that we cannot reliably pass messages, or suppress the errors. Any suggestions to improve the test case is welcome. |
See nodejs#3635 for details and failure examples. Ref: nodejs#3635
See nodejs#3635 for details and failure examples. Ref: nodejs#3635 PR-URL: nodejs#4005 Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Hm, this also just failed on smartos https://ci.nodejs.org/job/node-test-commit-smartos/1245/nodes=smartos14-32/console |
Could you paste the errors? Thanks |
Oh, right, the CI is locked for now. It's just this:
|
A few tests have started failing on Raspberry Pi devices in CI. https://ci.nodejs.org/job/node-test-binary-arm/943/ Ref: nodejs#4830 Ref: nodejs#3635 Ref: nodejs#4526
It reproduces more or less consistently in my |
I've tried fixing this. See #5121 |
Windows is still sometimes failing with ECONNRESET. Bring back the handling of this error as was initially introduced in PR nodejs#4442. PR-URL: nodejs#5179 Reviewed-By: Rich Trott <rtrott@gmail.com> Fixes: nodejs#3635
A few tests have started failing on Raspberry Pi devices in CI. https://ci.nodejs.org/job/node-test-binary-arm/943/ PR-URL: #5082 Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Roman Klauke <romaaan.git@gmail.com> Ref: #4830 Ref: #3635 Ref: #4526
The test would sometimes time out on some platforms. Take the initial version of the test and instead of sending 100 handles, just send 2. It still fails when run with the code before the change was introduced and passes afterwards. Remove test from parallel.status. PR-URL: #5121 Reviewed-By: Brian White <mscdex@mscdex.net> Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com> Fixes: #3635
See nodejs#3635 for details and failure examples. Ref: nodejs#3635 PR-URL: nodejs#4005 Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
A few tests have started failing on Raspberry Pi devices in CI. https://ci.nodejs.org/job/node-test-binary-arm/943/ PR-URL: nodejs#5082 Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Roman Klauke <romaaan.git@gmail.com> Ref: nodejs#4830 Ref: nodejs#3635 Ref: nodejs#4526
According to the explanation in nodejs#3635#issuecomment-157714683 And as a continuation to nodejs#5422 we also ignore EMFILE "No more file descriptors are available,so no more files can be opened" PR-URL: nodejs#12698 Fixes: nodejs#10286 Refs: nodejs#3635 (comment) Refs: nodejs#5178 Refs: nodejs#5179 Refs: nodejs#4005 Refs: nodejs#5121 Refs: nodejs#5422 Refs: nodejs#12621 (comment) Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
According to the explanation in #3635#issuecomment-157714683 And as a continuation to #5422 we also ignore EMFILE "No more file descriptors are available,so no more files can be opened" PR-URL: #12698 Fixes: #10286 Refs: #3635 (comment) Refs: #5178 Refs: #5179 Refs: #4005 Refs: #5121 Refs: #5422 Refs: #12621 (comment) Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
In test-child-process-fork-closed-channel-segfault.js, race condition is observed between the server getting closed and the worker sending a message. Accommodate the potential errors. Earlier, the same race was observed between the client and server and was addressed through ignoring the relevant errors through error handler. The same mechanism is re-used for worker too. The only difference is that the filter is applied at the callback instead of at the worker's error listener. Refs: #3635 (comment) Fixes: #20836 PR-URL: #20973 Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Jon Moss <me@jonathanmoss.me> Reviewed-By: Matheus Marchini <matheus@sthima.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com> Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
In test-child-process-fork-closed-channel-segfault.js, race condition is observed between the server getting closed and the worker sending a message. Accommodate the potential errors. Earlier, the same race was observed between the client and server and was addressed through ignoring the relevant errors through error handler. The same mechanism is re-used for worker too. The only difference is that the filter is applied at the callback instead of at the worker's error listener. Refs: #3635 (comment) Fixes: #20836 PR-URL: #20973 Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Jon Moss <me@jonathanmoss.me> Reviewed-By: Matheus Marchini <matheus@sthima.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com> Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
Looks like this test is starting to fail in our windows environment: fail 1, fail 2
The text was updated successfully, but these errors were encountered: