test-cluster-net-listen-ipv6only-none is unreliable/flaky #29679

Closed
Trott opened this issue Sep 23, 2019 · 2 comments
Labels
cluster Issues and PRs related to the cluster subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI.

Comments

@Trott
Member

Trott commented Sep 23, 2019

  • Version: 13.0.0-pre (master branch)
  • Platform: all, although observed on various Linux in CI
  • Subsystem: cluster test

test-cluster-net-listen-ipv6only-none has faulty logic in its Countdown callback. The check there assumes that if a port is unavailable for IPv4 but available for IPv6, the operating system won't supply that port for IPv6. This is apparently incorrect, as seen in https://ci.nodejs.org/job/node-test-commit-linux/nodes=centos7-64-gcc6/29869/testReport/junit/(root)/test/parallel_test_cluster_net_listen_ipv6only_none/:

Error: listen EADDRINUSE: address already in use 0.0.0.0:44400
    at Server.setupListenHandle [as _listen2] (net.js:1298:14)
    at listenInCluster (net.js:1346:12)
    at doListen (net.js:1485:7)
    at processTicksAndRejections (internal/process/task_queues.js:81:21)
Emitted 'error' event on Server instance at:
    at emitErrorNT (net.js:1325:8)
    at processTicksAndRejections (internal/process/task_queues.js:80:21) {
  code: 'EADDRINUSE',
  errno: -98,
  syscall: 'listen',
  address: '0.0.0.0',
  port: 44400
}

One option to fix it would be to remove the check entirely. But then we would perhaps no longer be checking that the port is not opened on IPv4, which is part of the test.

In my opinion, a better option would be to switch to using common.PORT rather than an operating-system-allocated port. That would mean moving the test to sequential.

We'd also want to check the other tests in parallel that use ipv6Only to make sure they don't suffer from the same issue.
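
The check under discussion can be sketched outside the test harness roughly as follows. This is a minimal sketch, not the test's actual code: `listen`, `close`, and `ipv6OnlyLeavesIpv4Free` are hypothetical helpers, it uses plain `net` rather than `cluster`, and the caller passes either `0` (OS-allocated, the flaky pattern) or a fixed port number of its own choosing as a stand-in for `common.PORT`.

```javascript
'use strict';
const net = require('net');

// Resolve with a listening server, or reject with the error the server emits.
function listen(options) {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once('error', reject);
    server.listen(options, () => resolve(server));
  });
}

// Resolve once the server has stopped listening.
function close(server) {
  return new Promise((resolve) => server.close(resolve));
}

// Hypothetical helper: listen on `::` with ipv6Only set, then try to listen
// on the IPv4 wildcard address using the same port number.
// requestedPort === 0 lets the OS choose the port; any other value is a
// fixed port supplied by the caller (an assumption standing in for
// common.PORT).
async function ipv6OnlyLeavesIpv4Free(requestedPort) {
  const v6 = await listen({ host: '::', port: requestedPort, ipv6Only: true });
  const { port } = v6.address();
  try {
    const v4 = await listen({ host: '0.0.0.0', port });
    await close(v4);
    return { port, ipv4Free: true };
  } catch (err) {
    if (err.code !== 'EADDRINUSE') throw err;
    // Something already holds the IPv4 side of this port. When the OS chose
    // the port, that can be an unrelated process rather than our own
    // IPv6-only listener.
    return { port, ipv4Free: false };
  } finally {
    await close(v6);
  }
}
```

Calling `ipv6OnlyLeavesIpv4Free(0)` follows the pattern the test used, so an `ipv4Free: false` result does not necessarily mean the IPv6-only listener opened the port on IPv4. With a fixed port the result no longer depends on which port the OS hands out, but two tests sharing that port cannot safely run at the same time, which is why the change implies moving the test to sequential.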

@Trott
Member Author

Trott commented Sep 23, 2019

Another example of this failing in CI:

https://ci.nodejs.org/job/node-test-commit-linux/nodes=debian8-64/29824/console

Error: listen EADDRINUSE: address already in use 0.0.0.0:46730
    at Server.setupListenHandle [as _listen2] (net.js:1298:14)
    at listenInCluster (net.js:1346:12)
    at doListen (net.js:1485:7)
    at processTicksAndRejections (internal/process/task_queues.js:81:21)
Emitted 'error' event on Server instance at:
    at emitErrorNT (net.js:1325:8)
    at processTicksAndRejections (internal/process/task_queues.js:80:21) {
  code: 'EADDRINUSE',
  errno: -98,
  syscall: 'listen',
  address: '0.0.0.0',
  port: 46730
}

@Trott
Member Author

Trott commented Sep 23, 2019

And...one more:

https://ci.nodejs.org/job/node-test-commit-arm/nodes=ubuntu1604-arm64/26344/consoleText

not ok 240 parallel/test-cluster-net-listen-ipv6only-none
  ---
  duration_ms: 1.547
  severity: fail
  exitcode: 1
  stack: |-
    events.js:186
          throw er; // Unhandled 'error' event
          ^
    
    Error: listen EADDRINUSE: address already in use 0.0.0.0:42781
        at Server.setupListenHandle [as _listen2] (net.js:1298:14)
        at listenInCluster (net.js:1346:12)
        at doListen (net.js:1485:7)
        at processTicksAndRejections (internal/process/task_queues.js:81:21)
    Emitted 'error' event on Server instance at:
        at emitErrorNT (net.js:1325:8)
        at processTicksAndRejections (internal/process/task_queues.js:80:21) {
      code: 'EADDRINUSE',
      errno: -98,
      syscall: 'listen',
      address: '0.0.0.0',
      port: 42781
    }
  ...

@Trott added the flaky-test and cluster labels Sep 23, 2019
Trott added a commit to Trott/io.js that referenced this issue Sep 25, 2019
test-cluster-net-listen-ipv6only-none was using port `0` for an
IPv6-only operation and assuming that the operating system would supply
a port that was also available in IPv4. However, CI results seem to
indicate that a port can be supplied that is in use by IPv4 but
available to IPv6, resulting in the test failing. Use `common.PORT` to
avoid this issue.

Fixes: nodejs#29679
@Trott Trott closed this as completed in 1c5a3f0 Oct 1, 2019
targos pushed a commit that referenced this issue Oct 1, 2019
test-cluster-net-listen-ipv6only-none was using port `0` for an
IPv6-only operation and assuming that the operating system would supply
a port that was also available in IPv4. However, CI results seem to
indicate that a port can be supplied that is in use by IPv4 but
available to IPv6, resulting in the test failing. Use `common.PORT` to
avoid this issue.

Fixes: #29679

PR-URL: #29681
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>