test: move --cpu-prof tests to sequential #28210

Closed
wants to merge 1 commit

Conversation

@joyeecheung (Member) commented Jun 13, 2019

The tests still fail after being split into multiple files
(2 out of 30 runs in roughly 48 hours), and the cause is missing
target frames in the samples. This patch moves them to sequential
to observe whether the flakiness goes away when the tests are
run on a system with less load.

If the flake ever shows up again even after the tests are moved
to sequential, we should consider making the tests more
lenient - that is, only asserting that there are *some* frames
in the generated CPU profile rather than looking for the target
function there.

Refs: #27611

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

@nodejs-github-bot (Collaborator) commented:

Sadly, an error occurred when I tried to trigger a build. :(

@nodejs-github-bot added the test (Issues and PRs related to the tests) label Jun 13, 2019
@joyeecheung (Member, Author) commented Jun 13, 2019

Hmm, the stats above may be wrong, because out of those 2 failed runs, one of them is https://ci.nodejs.org/job/node-test-pull-request/23810/ which failed with

ReferenceError: common is not defined

But it is still flaky, as it affects https://ci.nodejs.org/job/node-test-pull-request/23832/
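
For context on that ReferenceError: Node.js core tests conventionally load the shared test helpers before anything else, and a test file that references `common` without this preamble fails with exactly that error:

```js
'use strict';
// Required first in every core test; defines `common` and runs shared
// sanity checks (e.g. detecting leaked globals). Omitting it while still
// using `common` yields "ReferenceError: common is not defined".
const common = require('../common');
```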

@Trott (Member) left a comment

LGTM but I'll kick off a pair of node-stress-single-test runs to compare parallel vs. sequential to gather some data.

@Trott (Member) commented Jun 13, 2019

> Hmm, the stats above may be wrong, because out of those 2 failed runs, one of them is https://ci.nodejs.org/job/node-test-pull-request/23810/ which failed with
>
> ReferenceError: common is not defined
>
> But it is still flaky, as it affects https://ci.nodejs.org/job/node-test-pull-request/23832/

FWIW, here are some failures from node-test-pull-request. That's 5 in roughly the last 24 hours:

https://ci.nodejs.org/job/node-test-commit-freebsd/26873/ (test-cpu-prof-dir-worker)
https://ci.nodejs.org/job/node-test-commit-freebsd/26850/ (test-cpu-prof-dir-and-name)
https://ci.nodejs.org/job/node-test-commit-freebsd/26845/ (test-cpu-prof-drained)
https://ci.nodejs.org/job/node-test-commit-freebsd/26842/ (test-cpu-prof-dir-absolute)
https://ci.nodejs.org/job/node-test-commit-freebsd/26840/ (test-cpu-prof-dir-and-name)

It's all on FreeBSD, but that's not FreeBSD per se but rather our specific configuration in CI, which has a lot of cores and so runs a lot of things in parallel.
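
All of the failing tests above exercise the --cpu-prof family of flags by spawning a child node process and then inspecting the profile it writes. A rough standalone sketch of that shape (for illustration only; the real tests use the common fixtures and assert on the profile contents):

```js
'use strict';
const { spawnSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write the profile to a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cpu-prof-'));
const { status } = spawnSync(process.execPath, [
  '--cpu-prof',                          // enable the V8 CPU profiler at startup
  '--cpu-prof-dir', dir,                 // where to write the profile
  '--cpu-prof-name', 'test.cpuprofile',  // fixed name instead of a generated one
  '-e', 'for (let i = 0; i < 1e7; i++) Math.sqrt(i);',
]);
console.log(status, fs.readdirSync(dir)); // expect 0, [ 'test.cpuprofile' ]
```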

@Trott (Member) commented Jun 13, 2019

Hmmm...FreeBSD 11 build failed for both stress tests. Will run again on FreeBSD 10 and FreeBSD Latest if the other platforms don't show anything interesting, since this definitely seems to affect FreeBSD a lot.

@mhdawson (Member) left a comment

LGTM

@joyeecheung (Member, Author) commented:

@Trott Thanks for starting the stress tests. It looks like on macOS, which managed to build and run the tests, there was a difference (17 vs. 0 out of 1000 runs).

I am wondering whether running that on FreeBSD 10 alone is enough - the flake has only shown up on FreeBSD 11 in the CI ever since the tests were split.

@Trott (Member) commented Jun 14, 2019

Unfortunately, the FreeBSD hosts are not successfully building in the stress tests for either master or this branch. I'll take a look to see if I can figure anything out, or maybe someone else on @nodejs/build will have an idea.

But I think the OS X results are enough to show that it's unreliable in parallel (19 failures in 1000 runs) and seems reliable in sequential (0 failures in 1000 runs). And I strongly suspect the FreeBSD hosts would show a significantly higher rate of failure (because they usually do in these types of cases).

pull bot pushed a commit to zys-contrib/node that referenced this pull request Jun 16, 2019

PR-URL: nodejs#28210
Refs: nodejs#27611
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@Trott (Member) commented Jun 16, 2019

Looks like this landed in 7e5e1c2 but wasn't closed? Closing, but of course, re-open if I'm wrong somehow. /ping @joyeecheung

@Trott closed this Jun 16, 2019
BridgeAR pushed a commit that referenced this pull request Jun 17, 2019
@BridgeAR mentioned this pull request Jun 17, 2019
targos pushed a commit that referenced this pull request Jun 18, 2019