Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky http(s)-set-timeout-server tests #14380

Closed
wants to merge 8 commits into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Jul 20, 2017

The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Fixes: #11768

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

test http https

@Trott Trott added flaky-test Issues and PRs related to the tests with unstable failures on the CI. http Issues or PRs related to the http subsystem. https Issues or PRs related to the https subsystem. test Issues and PRs related to the tests. labels Jul 20, 2017
@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

/cc @nodejs/testing

@refack
Copy link
Contributor

refack commented Jul 20, 2017

Aren't we excluding an important step in the test?
IMHO a possible alternative solution will be to call server.close(); cb(); only after both server.setTimeout and the connection handlers have both run. And using common.platformTimeout(100)

Copy link
Contributor

@refack refack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss

@Trott Trott changed the title test: fix flaky http(s)-set-server-timeout tests test: fix flaky http(s)-set-timetout-server tests Jul 20, 2017
@Trott Trott changed the title test: fix flaky http(s)-set-timetout-server tests test: fix flaky http(s)-set-timeout-server tests Jul 20, 2017
@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

@refack You have a point. Less elegant but more correct fix coming in a bit...

@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

IMHO a possible alternative solution will be to call server.close(); cb(); only after both server.setTimeout and the connection handlers have both run.

This solution still fails under load. The connection handler still might never get called, and now without calling server.close(), the test times out.

@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

Moving the socket.destroy() as well to a place where it can wait for both callbacks to fire works, though. Trying to understand if that means we're not testing the "right" thing anymore.

Ultimately, may just add a test case: one that solution and one the solution that's here. Then we're getting at everything one way or another (hopefully).

@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

OK, new test case added. PTAL @refack.

@cjihrig may want to re-review as well.

@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

@Trott
Copy link
Member Author

Trott commented Jul 20, 2017

Stress test showing failures on master (expected to fail):
https://ci.nodejs.org/job/node-stress-single-test/1332/

Stress test on this PR (expected to pass):
https://ci.nodejs.org/job/node-stress-single-test/1333/

@@ -152,11 +152,12 @@ test(function serverResponseTimeoutWithPipeline(cb) {
});

test(function idleTimeout(cb) {
const server = http.createServer(common.mustCall((req, res) => {
// Do not wrap the callback in common.mustCall(). It might not be invoked.
const server = http.createServer((req, res) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: If it "might" not be invoked maybe remove?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is invoked, we want to make sure timeout event does not fire on req and res (I guess?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that's what the new test case does.
Superficially this is not wrong or harmful, but might be distracting for future maintainers.
IMHO either remove, or add to the comment that the next test case makes sure this is always true.

@refack
Copy link
Contributor

refack commented Jul 20, 2017

Fast timeout makes sense.

@cjihrig
Copy link
Contributor

cjihrig commented Jul 20, 2017

LGTM

@Trott
Copy link
Member Author

Trott commented Jul 21, 2017

This PR fails in stress test, but much less frequently and in a different way.

@Trott
Copy link
Member Author

Trott commented Jul 21, 2017

Made some adjustments (PTAL, I did not squash the commits for hopefully easier review).

Stress test on this PR: https://ci.nodejs.org/job/node-stress-single-test/1338/

@refack
Copy link
Contributor

refack commented Jul 21, 2017

Made some adjustments (PTAL, I did not squash the commits for hopefully easier review).

  1. Stress still fails 😞
  2. Still, could you explain the reasoning for moving the timeout handler registration?
  3. I think you can relax the second test, and just assert the connection handler. That way first case tests the timeout handler, and the second that req, and res don't fire timeouts.

@refack
Copy link
Contributor

refack commented Jul 21, 2017

P.S.

const server = http.createServer().on('connection', (req, res) => {
   req.on('timeout', common.mustNotCall());
   res.on('timeout', common.mustNotCall());
})

Can a timeout event happen in the "dead space" between the time req and res are created and connection is fired?
/cc @Fishrock123 @trevnorris (Trev, cause you dug into the timer/socket quagmire)

@trevnorris
Copy link
Contributor

@refack first need to correct your code. it should be:

const print = process._rawDebug;
const server = require('http').createServer();

server.on('request', (req, res) => {
  // this will never fire b/c the request is complete before this callback fires.
  req.on('timeout', () => print('req timeout'));
  // will fire a default 2 mins in the future if the connection isn't closed.
  res.on('timeout', () => print('res timeout'));
});
sever.listen(8080);

The 'connection' event fires the moment the socket is created, while 'request' doesn't fire until the headers have been parsed. IOW, in this case req and res are never created until after the net.Socket is created so there never is a "dead space".

This is different from http.request(). Take the following:

const print = process._rawDebug;
const req = require('http').request({ hostname: 'www.reddit.com' }, res => {
  print('req callback fired');
});

const t = process.hrtime();
req.setTimeout(10, () => {
  const u = process.hrtime(t);
  print('timeout:', (u[0] * 1e3 + u[1] / 1e6) >>> 0, 'ms');
});
req.end();

My machine prints timeout: 22 ms. Meaning the timeout wasn't created until 12 ms in the future. i.e. how long it took to create the connection with the server. This is verified by the documentation

Once a socket is assigned to this request and is connected socket.setTimeout() will be called.

@Trott
Copy link
Member Author

Trott commented Jul 22, 2017

@trevnorris wrote:

  // will fire a default 2 mins in the future if the connection isn't closed.
  res.on('timeout', () => print('res timeout'));

Ah, of course, but if we're calling server.setTimeout(50, ...) as this test does, then it's 50ms and not 2 minutes.

Just confirmed running the test locally under heavy load that reducing the timeout to 1ms makes the test far more reliable and increasing it to 5000 makes it reliable under the amount of load I was using.

I think the thing to do is remove the test for that timeout to not fire, as clearly it can.

Probably won't solve the ECONNRESET issue that's also showing up, but I think that's a different problem altogether.

Trott added 8 commits July 24, 2017 12:50
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

Fixes: nodejs#11768
@Trott
Copy link
Member Author

Trott commented Jul 24, 2017

Comments added per nits from @refack.

Rebased against master, so running CI again for good measure.

CI: https://ci.nodejs.org/job/node-test-pull-request/9332/ EDIT: Windows failure unrelated.

Copy link
Contributor

@refack refack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Trott
Copy link
Member Author

Trott commented Jul 25, 2017

Jenkins was restarted so previous CI either didn't run or we lost the results.

CI: https://ci.nodejs.org/job/node-test-pull-request/9337/

Trott added a commit to Trott/io.js that referenced this pull request Jul 25, 2017
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

PR-URL: nodejs#14380
Fixes: nodejs#11768
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@Trott
Copy link
Member Author

Trott commented Jul 25, 2017

Landed in c5bed4c

@Trott Trott closed this Jul 25, 2017
addaleax pushed a commit that referenced this pull request Jul 27, 2017
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

PR-URL: #14380
Fixes: #11768
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@addaleax addaleax mentioned this pull request Aug 2, 2017
MylesBorins pushed a commit that referenced this pull request Aug 15, 2017
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

PR-URL: #14380
Fixes: #11768
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Aug 15, 2017
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

PR-URL: #14380
Fixes: #11768
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Aug 16, 2017
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

PR-URL: #14380
Fixes: #11768
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Aug 16, 2017
MylesBorins pushed a commit that referenced this pull request Sep 5, 2017
The tests include a callback that might not be invoked but is wrapped in
common.mustCall(). Remove the common.mustCall() wrapper and add a
comment explaining that it should not be added.

Add a new test case that sets the timeout to 1ms and waits for both the
connection handler and the timeout handler to be invoked. This version
keeps the common.mustCall() wrapper intact around the connection handler
(although it's mostly semantic and not necessary for the test as the
test will certainly fail or time out if that handler isn't invoked).

PR-URL: #14380
Fixes: #11768
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@Trott Trott deleted the fix-flakies branch January 13, 2022 22:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. http Issues or PRs related to the http subsystem. https Issues or PRs related to the https subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate flaky parallel/test-http-set-timeout-server
6 participants