This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

NNI breaks down when there are too many intermediate results #907

Closed
Crysple opened this issue Mar 22, 2019 · 2 comments
Labels: bug (Something isn't working), nnidev

Comments

@Crysple
Contributor

Crysple commented Mar 22, 2019

Short summary about the issue/question:
NNI breaks down when a trial has too many intermediate results.
Brief what process you are following:
Run a trial that reports hundreds of thousands of intermediate results, then refresh the Web UI.

How to reproduce it:
Run a trial that reports hundreds of thousands of intermediate results, then refresh the Web UI.
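
For anyone trying to reproduce this, here is a minimal sketch of a trial loop that emits a very large number of intermediate results. The function name `run_noisy_trial`, the `report` parameter, and the placeholder metric are assumptions made for illustration; in a real NNI trial you would pass `nni.report_intermediate_result` as the reporter and finish with `nni.report_final_result`.

```python
from typing import Callable


def run_noisy_trial(report: Callable[[float], None], steps: int = 200000) -> float:
    """Report one intermediate metric per step and return the last value.

    `report` is any callable that accepts a float. Inside an NNI trial this
    would typically be nni.report_intermediate_result (assumption: NNI is
    installed and the script is launched by an NNI experiment).
    """
    value = 0.0
    for step in range(steps):
        # Placeholder metric that approaches 1.0; a real trial would report
        # something like validation accuracy here.
        value = 1.0 - 1.0 / (step + 1)
        report(value)
    return value


# Inside an NNI trial script, usage would look roughly like:
#   import nni
#   final = run_noisy_trial(nni.report_intermediate_result)
#   nni.report_final_result(final)
```

After the trial has reported its results, refreshing the Web UI should trigger the behaviour described above.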

nni Environment:

  • nni version: v0.5.2
  • nni mode(local|pai|remote): local
  • OS: Ubuntu
  • python version: 3.6.1
  • is conda or virtualenv used?: conda venv
  • is running in docker?: no

Anything else we need to know:

Error message from stdout:

[9:55 AM] Zejun Lin (FA Talent)
    (node:19228) ExperimentalWarning: The fs.promises API is experimental

URIError: Failed to decode param '/wp%-login'

    at decodeURIComponent (<anonymous>)

    at decode_param (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/layer.js:172:12)

    at Layer.match (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/layer.js:123:27)

    at matchLayer (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/index.js:574:18)

    at next (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/index.js:220:15)

    at jsonParser (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/body-parser/lib/types/json.js:110:7)

    at Layer.handle [as handle_request] (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/layer.js:95:5)

    at trim_prefix (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/index.js:317:13)

    at /home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/index.js:284:7

    at Function.process_params (/home/core/v-zejlin/miniconda3/envs/pynni/nni/node_modules/express/lib/router/index.js:335:12)

FATAL ERROR: MarkCompactCollector: semi-space copy, fallback in old gen Allocation failed - JavaScript heap out of memory

 1: 0x8daaa0 node::Abort() [node]

 2: 0x8daaec  [node]

 3: 0xad73ce v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]

 4: 0xad7604 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]

 5: 0xec4c32  [node]

 6: 0xeebdf1 v8::internal::EvacuateNewSpaceVisitor::Visit(v8::internal::HeapObject*, int) [node]

 7: 0xeefe37 void v8::internal::LiveObjectVisitor::VisitBlackObjectsNoFail<v8::internal::EvacuateNewSpaceVisitor, v8::internal::MajorNonAtomicMarkingState>(v8::internal::MemoryChunk*, v8::internal::MajorNonAtomicMarkingState*, v8::internal::EvacuateNewSpaceVisitor*, v8::internal::LiveObjectVisitor::IterationMode) [node]

 8: 0xef8af2 v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::Page*, long*) [node]

 9: 0xee7cb1 v8::internal::Evacuator::EvacuatePage(v8::internal::Page*) [node]

10: 0xee7fb2 v8::internal::PageEvacuationTask::RunInParallel() [node]

11: 0xee045f v8::internal::ItemParallelJob::Task::RunInternal() [node]

12: 0xee0f97 v8::internal::ItemParallelJob::Run(std::shared_ptr<v8::internal::Counters>) [node]

13: 0xeec76e void v8::internal::MarkCompactCollectorBase::CreateAndExecuteEvacuationTasks<v8::internal::FullEvacuator, v8::internal::MarkCompactCollector>(v8::internal::MarkCompactCollector*, v8::internal::ItemParallelJob*, v8::internal::RecordMigratedSlotVisitor*, v8::internal::MigrationObserver*, long) [node]

14: 0xef79fd v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [node]

15: 0xef7b66 v8::internal::MarkCompactCollector::Evacuate() [node]

16: 0xef8602 v8::internal::MarkCompactCollector::CollectGarbage() [node]

17: 0xed0451 v8::internal::Heap::MarkCompact() [node]

18: 0xed0b41 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]

19: 0xed1744 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]

20: 0xed43b1 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [node]

21: 0xe9c695  [node]

22: 0xea3f0a v8::internal::Factory::NewRawOneByteString(int, v8::internal::PretenureFlag) [node]

23: 0x11d3935 v8::internal::IncrementalStringBuilder::Extend() [node]

24: 0xf98348 v8::internal::JsonStringifier::SerializeString(v8::internal::Handle<v8::internal::String>) [node]

25: 0xf988ab v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize_<true>(v8::internal::Handle<v8::internal::Object>, bool, v8::internal::Handle<v8::internal::Object>) [node]

26: 0xf9d9bd v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize_<false>(v8::internal::Handle<v8::internal::Object>, bool, v8::internal::Handle<v8::internal::Object>) [node]

27: 0xf9d626 v8::internal::JsonStringifier::Result v8::internal::JsonStringifier::Serialize_<false>(v8::internal::Handle<v8::internal::Object>, bool, v8::internal::Handle<v8::internal::Object>) [node]

28: 0xf9e719 v8::internal::JsonStringifier::Stringify(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>) [node]

29: 0xba61d1 v8::internal::Builtin_JsonStringify(int, v8::internal::Object**, v8::internal::Isolate*) [node]

30: 0x179937f5bf7d


<--- Last few GCs --->

[19228:0x386ffa0] 65181338 ms: Mark-sweep 1383.1 (1434.8) -> 1383.0 (1435.8) MB, 176.0 / 0.0 ms  (average mu = 0.692, current mu = 0.214) allocation failure scavenge might not succeed

[19228:0x386ffa0] 65181789 ms: Mark-sweep 1384.1 (1435.8) -> 1384.1 (1457.8) MB, 448.0 / 0.0 ms  (average mu = 0.354, current mu = 0.006) allocation failure scavenge might not succeed

@Crysple Crysple changed the title NNI breaks down when there are too many intermediate result NNI breaks down when there are too many intermediate results Mar 22, 2019
@Crysple
Contributor Author

Crysple commented Mar 22, 2019

The main cause appears to be this error:

FATAL ERROR: MarkCompactCollector: semi-space copy, fallback in old gen Allocation failed - JavaScript heap out of memory

So I tried adding the following Node option in launcher.py, and it works:

NODE_OPTIONS="--max-old-space-size=8192"
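
A rough sketch of how such an option could be passed from a Python launcher to a Node child process is shown here. This is not the actual launcher.py code: the helper names `node_env` and `start_node` are hypothetical, and the sketch assumes the Node process is started with `subprocess.Popen` and that the installed Node version accepts `--max-old-space-size` through `NODE_OPTIONS`.

```python
import os
import subprocess
from typing import Dict, List, Optional


def node_env(max_old_space_mb: int = 8192,
             base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with a larger V8 old-space limit.

    Hypothetical helper: it appends --max-old-space-size to NODE_OPTIONS
    unless the variable already sets that flag.
    """
    env = dict(os.environ if base is None else base)
    flag = "--max-old-space-size={}".format(max_old_space_mb)
    existing = env.get("NODE_OPTIONS", "")
    if "--max-old-space-size" not in existing:
        env["NODE_OPTIONS"] = (existing + " " + flag).strip()
    return env


def start_node(cmd: List[str]) -> subprocess.Popen:
    """Start a Node command (for example ["node", "main.js"]) with the raised limit."""
    return subprocess.Popen(cmd, env=node_env())
```

Raising the heap limit only postpones the failure: the trace above runs out of memory inside JSON.stringify, so a trial with enough intermediate results could still exhaust a larger heap.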

@scarlett2018 scarlett2018 added the bug Something isn't working label Apr 10, 2019
@rabbit008 rabbit008 added this to the Sept 2019 Release milestone Sep 17, 2019
@ultmaster
Contributor

Closing as a duplicate.
