codecov OOM crash #11781

Closed
connorjclark opened this issue Dec 7, 2020 · 2 comments · Fixed by #11770
Comments

connorjclark (Collaborator) commented Dec 7, 2020

While moving code coverage to GitHub Actions so we can sunset Travis (#11770, #11194), we found an OOM crash when running our unit tests.

./node_modules/.bin/nyc --all --hook-run-in-context ./node_modules/.bin/jest lighthouse-core/test --runInBand --ci --coverage

Node: v12.13.0 (happens on Node 10 too)

crash:

<--- Last few GCs --->

[1934:0x108008000]   191476 ms: Mark-sweep 2034.7 (2071.2) -> 2014.1 (2061.2) MB, 717.1 / 0.1 ms  (average mu = 0.269, current mu = 0.282) allocation failure scavenge might not succeed
[1934:0x108008000]   192387 ms: Mark-sweep 2031.7 (2073.7) -> 2014.8 (2061.8) MB, 806.7 / 0.1 ms  (average mu = 0.197, current mu = 0.114) allocation failure scavenge might not succeed


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x10092fbd9]
Security context: 0x181b829808a1 <JSObject>
    1: deepCyclicCopyObject(aka deepCyclicCopyObject) [0x181b2805bde9] [/Users/cjamcl/src/lighthouse/node_modules/jest-util/build/deepCyclicCopy.js:~50] [pc=0x3a2d67113223](this=0x181b288004a9 <undefined>,0x181b33a459a1 <Object map = 0x181b182862e1>,0x181b47860b09 <Object map = 0x181b02332111>,0x181b2fc33f39 <JSWeakMap>)
    2: deepCyclicCopy(aka deepCyclicCopy...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

Writing Node.js report to file: report.20201207.155933.1934.0.001.json
Node.js report completed
 1: 0x10007e743 node::Abort() [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 2: 0x10007e8c7 node::OnFatalError(char const*, char const*) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 3: 0x100176267 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 4: 0x100176203 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 5: 0x1002fa2b5 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 6: 0x1002fb984 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 7: 0x1002f8857 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 8: 0x1002f683d v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
 9: 0x100301f54 v8::internal::Heap::AllocateRawWithLightRetry(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
10: 0x100301fcf v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
11: 0x1002cebc7 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
12: 0x1005f7725 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]
13: 0x10092fbd9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/Users/cjamcl/.nvm/versions/node/v12.13.0/bin/node]

(Note: narrowing to fewer tests doesn't remove the crash, it just makes it flakier. For example, lighthouse-core/test/audits crashes...some of the time. Running the full suite crashes more consistently.)
(Note: I initially thought removing --runInBand prevented the crash, but that was an artifact of running only a subset of the tests and not yet realizing the crash was flaky.)

Quick probes for which versions of Lighthouse crash:

v6.1.0: crash
v6.0.0: crash
v5.0.0: crash

At this point I'm scratching my head. If this crashes all the way back to v5, how did this ever work on Travis?


Here's a bisect script I wrote back when I thought bisecting was a good idea. It (tries to) handle flakiness, and only fails if Node crashes with an OOM exit code:

#!/usr/bin/env bash
set -ux

yarn

# Run the suite several times per commit because the crash is flaky.
for ((n=0;n<5;n++))
do
  status=0
  ./node_modules/.bin/nyc --all --hook-run-in-context ./node_modules/.bin/jest lighthouse-core/test --runInBand --ci --coverage
  status=$?

  # Record the commit and exit status for later inspection.
  git rev-parse HEAD >> bisect-log.txt
  echo "$status" >> bisect-log.txt

  # Fail if OOM. Exit codes > 127 mean the process was killed by a signal;
  # a Node OOM abort typically shows up as 134 (128 + SIGABRT).
  if [ "$status" -eq 125 ] || [ "$status" -gt 127 ]; then
    exit 1
  fi
done
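
The thread doesn't show how the script was driven, but a script like this is typically handed to git bisect run. A minimal sketch, assuming a hypothetical script name and placeholder endpoints (the actual good/bad commits aren't given in the issue):

# <bad-commit> and <good-commit> are placeholders; bisect-oom.sh is the script above.
git bisect start <bad-commit> <good-commit>
git bisect run ./bisect-oom.sh
git bisect reset

git bisect run treats an exit status of 0 as "good", 1–127 (except 125) as "bad", and 125 as "skip this commit", so the script's plain exit 1 marks a commit as bad only when the test command died with an OOM-like status.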

benschwarz (Contributor) commented:

@connorjclark, have you tried tweaking --max-old-space-size? We've found this to be a requirement for LH in certain scenarios. Holler if you've got questions.

connorjclark (Collaborator, Author) commented Dec 7, 2020

> @connorjclark, have you tried tweaking --max-old-space-size? We've found this to be a requirement for LH in certain scenarios. Holler if you've got questions.

I tried setting it to 4 GB but it still crashed.
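
For reference, a typical way to apply that tweak is via NODE_OPTIONS, so the flag reaches every Node process the tools spawn. A sketch assuming the coverage command from the report above (the exact invocation isn't shown in this thread):

# Raise V8's old-space limit to 4 GB for every Node process started below.
export NODE_OPTIONS=--max-old-space-size=4096
./node_modules/.bin/nyc --all --hook-run-in-context ./node_modules/.bin/jest lighthouse-core/test --runInBand --ci --coverage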

Removing --coverage from unit:ci resolves it. Turns out using Babel to instrument our code with line counters finally caught up to us; it crashes Node nearly every time now. It also forces us to use Babel plugins to adopt new language features, which we don't like. So we're going to switch to https://github.com/bcoe/c8, which collects coverage using V8's built-in coverage support instead of Babel instrumentation. I verified it doesn't crash.
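
The exact setup Lighthouse lands on is in #11770; as a rough sketch of the swap (the c8 flags here are illustrative, not taken from that PR):

# Before: nyc instruments the code via Babel, which triggers the OOM.
./node_modules/.bin/nyc --all --hook-run-in-context ./node_modules/.bin/jest lighthouse-core/test --runInBand --ci --coverage

# After: c8 collects V8's native coverage from the spawned Node process instead.
./node_modules/.bin/c8 --reporter=lcov ./node_modules/.bin/jest lighthouse-core/test --runInBand --ci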
