seg fault / Invalid cid: 0 corrupt heap #44219

Closed
danrubel opened this issue Nov 16, 2020 · 7 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. crash Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash.

Comments

@danrubel

Dart SDK version: 2.10.4 (stable) (Wed Nov 11 13:35:58 2020 +0100) on "linux_x64" running on Ubuntu

I am running a large program that uses more than 32 GiB of memory. It reads from one file, keeps many maps, lists, and simple data types in memory (hence the large memory footprint), and writes to another file.

dart --old-gen-heap-size=61440 run bin/driver.dart followed by program arguments

With smaller data sets and a smaller memory footprint, all works well.
If I run with a larger data set but omit the --old-gen-heap-size argument,
then I get the expected "Exhausted heap space, trying to allocate ..." error message.
But with a larger data set and the --old-gen-heap-size argument,
the program terminates with one of the following; see (1) and (2) below.

  1. seg fault
===== CRASH =====
si_signo=Segmentation fault(11), si_code=128, si_addr=(nil)
version=2.10.4 (stable) (Wed Nov 11 13:35:58 2020 +0100) on "linux_x64"
pid=7408, thread=7416, isolate_group=main(0x56168f9b3b80), isolate=(nil)((nil))
isolate_instructions=56168d9fbb80, vm_instructions=56168d9fbb80
  pc 0x000056168dd4610b fp 0x00007fa17b6ac990 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbd10b
  pc 0x000056168dd463d5 fp 0x00007fa17b6ac9c0 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbd3d5
  pc 0x000056168dd45c18 fp 0x00007fa17b6acae0 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbcc18
  pc 0x000056168dd447c9 fp 0x00007fa17b6acb50 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbb7c9
  pc 0x000056168dd43f5b fp 0x00007fa17b6acc40 dart::Scavenger::ParallelScavenge(dart::SemiSpace*)+0x23b
  pc 0x000056168dd438d7 fp 0x00007fa17b6acd30 dart::Scavenger::Scavenge()+0x187
  pc 0x000056168dd360ba fp 0x00007fa17b6ace10 dart::Heap::CollectNewSpaceGarbage(dart::Thread*, dart::Heap::GCReason)+0x1da
  pc 0x000056168dd34961 fp 0x00007fa17b6ace40 dart::Heap::AllocateNew(long)+0x151
  pc 0x000056168dc0b59c fp 0x00007fa17b6ace90 dart::Object::Allocate(long, long, dart::Heap::Space)+0x6c
  pc 0x000056168dc0c9a2 fp 0x00007fa17b6acec0 dart::Array::New(long, dart::Heap::Space)+0x32
  pc 0x000056168dcd8767 fp 0x00007fa17b6acf70 dart::DRT_AllocateArray(dart::NativeArguments)+0x107
  pc 0x00007fa17b801413 fp 0x00007fa17b6acfb0 Unknown symbol
  pc 0x00007fa17b800c1c fp 0x00007fa17b6acfe8 Unknown symbol
  pc 0x00007fa16c87edba fp 0x00007fa17b6ad008 Unknown symbol
  pc 0x00007fa06f93228f fp 0x00007fa17b6ad118 Unknown symbol
  pc 0x00007fa06f92dfce fp 0x00007fa17b6ad1a0 Unknown symbol
  pc 0x00007fa16c83b700 fp 0x00007fa17b6ad1e8 Unknown symbol
  pc 0x00007fa16c87e74d fp 0x00007fa17b6ad220 Unknown symbol
  pc 0x00007fa06f928d6a fp 0x00007fa17b6ad288 Unknown symbol
  pc 0x00007fa16c85ceaa fp 0x00007fa17b6ad2d8 Unknown symbol
  pc 0x00007fa16c85c599 fp 0x00007fa17b6ad308 Unknown symbol
  pc 0x00007fa17a323e20 fp 0x00007fa17b6ad398 Unknown symbol
  pc 0x00007fa16c836c15 fp 0x00007fa17b6ad3f8 Unknown symbol
  pc 0x00007fa16c837ed6 fp 0x00007fa17b6ad478 Unknown symbol
  pc 0x00007fa16c83375b fp 0x00007fa17b6ad4e8 Unknown symbol
  pc 0x00007fa16c83b192 fp 0x00007fa17b6ad538 Unknown symbol
  pc 0x00007fa16c832505 fp 0x00007fa17b6ad568 Unknown symbol
  pc 0x00007fa16c806e3b fp 0x00007fa17b6ad600 Unknown symbol
  pc 0x00007fa16c836c15 fp 0x00007fa17b6ad660 Unknown symbol
  pc 0x00007fa16c837ed6 fp 0x00007fa17b6ad6e0 Unknown symbol
  pc 0x00007fa16c83375b fp 0x00007fa17b6ad750 Unknown symbol
  pc 0x00007fa16c83b8c1 fp 0x00007fa17b6ad7a0 Unknown symbol
  pc 0x00007fa16c8568d5 fp 0x00007fa17b6ad800 Unknown symbol
  pc 0x00007fa16c8542b7 fp 0x00007fa17b6ad820 Unknown symbol
  pc 0x00007fa16c8526c8 fp 0x00007fa17b6ad850 Unknown symbol
  pc 0x00007fa17b8018ff fp 0x00007fa17b6ad8c8 Unknown symbol
  pc 0x000056168db9a2e2 fp 0x00007fa17b6ad960 dart::DartEntry::InvokeCode(dart::Code const&, dart::Array const&, dart::Array const&, dart::Thread*)+0x112
  pc 0x000056168db9a044 fp 0x00007fa17b6ad9f0 dart::DartEntry::InvokeFunction(dart::Function const&, dart::Array const&, dart::Array const&, unsigned long)+0x2d4
  pc 0x000056168db9c7c6 fp 0x00007fa17b6ada40 dart::DartLibraryCalls::HandleMessage(dart::Object const&, dart::Instance const&)+0x1f6
  pc 0x000056168dbd4b9c fp 0x00007fa17b6adc30 dart::IsolateMessageHandler::HandleMessage(std::__2::unique_ptr<dart::Message, std::__2::default_delete<dart::Message> >)+0x4cc
  pc 0x000056168dc02326 fp 0x00007fa17b6adca0 dart::MessageHandler::HandleMessages(dart::MonitorLocker*, bool, bool)+0x146
  pc 0x000056168dc029da fp 0x00007fa17b6add00 dart::MessageHandler::TaskCallback()+0x1da
  pc 0x000056168dd11d98 fp 0x00007fa17b6add80 dart::ThreadPool::WorkerLoop(dart::ThreadPool::Worker*)+0x148
  pc 0x000056168dd1226c fp 0x00007fa17b6addb0 dart::ThreadPool::Worker::Main(unsigned long)+0x5c
  pc 0x000056168dc8791d fp 0x00007fa17b6ade70 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1bfe91d
-- End of DumpStackTrace
[exit     : sp(0) fp(0x7fa17b6acfb0) pc(0)]
[stub     : sp(0x7fa17b6acfc0) fp(0x7fa17b6acfe8) pc(0x7fa17b800c1c)]
[dart     : sp(0x7fa17b6acff8) fp(0x7fa17b6ad008) pc(0x7fa16c87edba) *dart:collection__HashFieldBase@3220832__HashFieldBase@3220832. ]
[dart     : sp(0x7fa17b6ad018) fp(0x7fa17b6ad118) pc(0x7fa06f93228f) *package:kk_project_2020_covid19_som_imaqal/incremental_logic.dart_ConversationPerspective_notify ]
[dart     : sp(0x7fa17b6ad128) fp(0x7fa17b6ad1a0) pc(0x7fa06f92dfce) *package:engine/v5.dart_IncrementalEngine__micro_step_opinion_update@23359308 ]
[dart     : sp(0x7fa17b6ad1b0) fp(0x7fa17b6ad1e8) pc(0x7fa16c83b700) *package:engine/v5.dart_IncrementalEngine__micro_step@23359308 ]
[dart     : sp(0x7fa17b6ad1f8) fp(0x7fa17b6ad220) pc(0x7fa16c87e74d) *package:engine/v5.dart_IncrementalEngine_step ]
[dart     : sp(0x7fa17b6ad230) fp(0x7fa17b6ad288) pc(0x7fa06f928d6a) *package:engine/v5/dependency_tracer.dart_DependencyManager_addOpinionReactor ]
[dart     : sp(0x7fa17b6ad298) fp(0x7fa17b6ad2d8) pc(0x7fa16c85ceaa) package:kk_project_2020_covid19_som_imaqal/incremental_logic.dart_::_postCatchup_init ]
[dart     : sp(0x7fa17b6ad2e8) fp(0x7fa17b6ad308) pc(0x7fa16c85c599) file:///home/danrubel/work/GitRepos/Lark/KK-Project-2020-COVID19-SOM-IMAQAL.dart/bin/driver.dart_::_idle_finish_catchup ]
[dart     : sp(0x7fa17b6ad318) fp(0x7fa17b6ad398) pc(0x7fa17a323e20) file:///home/danrubel/work/GitRepos/Lark/KK-Project-2020-COVID19-SOM-IMAQAL.dart/bin/driver.dart_::_main__async_op ]
[dart     : sp(0x7fa17b6ad3a8) fp(0x7fa17b6ad3f8) pc(0x7fa16c836c15) *dart:async__FutureListener@4048458_handleValue ]
[dart     : sp(0x7fa17b6ad408) fp(0x7fa17b6ad478) pc(0x7fa16c837ed6) *dart:async__Future@4048458__propagateToListeners@4048458_handleValueCallback ]
[dart     : sp(0x7fa17b6ad488) fp(0x7fa17b6ad4e8) pc(0x7fa16c83375b) *dart:async__Future@4048458__propagateToListeners@4048458 ]
[dart     : sp(0x7fa17b6ad4f8) fp(0x7fa17b6ad538) pc(0x7fa16c83b192) *dart:async__AsyncAwaitCompleter@4048458_complete ]
[dart     : sp(0x7fa17b6ad548) fp(0x7fa17b6ad568) pc(0x7fa16c832505) *dart:async_::__completeOnAsyncReturn@4048458 ]
[dart     : sp(0x7fa17b6ad578) fp(0x7fa17b6ad600) pc(0x7fa16c806e3b) file:///home/danrubel/work/GitRepos/Lark/KK-Project-2020-COVID19-SOM-IMAQAL.dart/bin/driver.dart_::_run_from_journal__async_op ]
[dart     : sp(0x7fa17b6ad610) fp(0x7fa17b6ad660) pc(0x7fa16c836c15) *dart:async__FutureListener@4048458_handleValue ]
[dart     : sp(0x7fa17b6ad670) fp(0x7fa17b6ad6e0) pc(0x7fa16c837ed6) *dart:async__Future@4048458__propagateToListeners@4048458_handleValueCallback ]
[dart     : sp(0x7fa17b6ad6f0) fp(0x7fa17b6ad750) pc(0x7fa16c83375b) *dart:async__Future@4048458__propagateToListeners@4048458 ]
[dart     : sp(0x7fa17b6ad760) fp(0x7fa17b6ad7a0) pc(0x7fa16c83b8c1) *dart:async__Future@4048458__asyncCompleteWithValue@4048458_<anonymous closure> ]
[dart     : sp(0x7fa17b6ad7b0) fp(0x7fa17b6ad800) pc(0x7fa16c8568d5) *dart:async_::__startMicrotaskLoop@4048458 ]
[dart     : sp(0x7fa17b6ad810) fp(0x7fa17b6ad820) pc(0x7fa16c8542b7) *dart:async_::__startMicrotaskLoop@4048458__startMicrotaskLoop@4048458 ]
[dart     : sp(0x7fa17b6ad830) fp(0x7fa17b6ad850) pc(0x7fa16c8526c8) *dart:isolate__RawReceivePortImpl@1026248__handleMessage@1026248 ]
[entry    : sp(0x7fa17b6ad860) fp(0x7fa17b6ad8c8) pc(0x7fa17b8018ff)]
Aborted (core dumped)
  2. Invalid cid: 0 ... Corrupt heap
../../runtime/vm/raw_object.cc: 365: error: Invalid cid: 0, obj: 0x7fa15a53f9b8, tags: 0. Corrupt heap?
version=2.10.4 (stable) (Wed Nov 11 13:35:58 2020 +0100) on "linux_x64"
pid=9141, thread=9151, isolate_group=main(0x55eda7251b80), isolate=(nil)((nil))
isolate_instructions=55eda59d7b80, vm_instructions=55eda59d7b80
  pc 0x000055eda5c67bcc fp 0x00007faa351fd830 dart::Profiler::DumpStackTrace(void*)+0x7c
  pc 0x000055eda59d7cb2 fp 0x00007faa351fd910 dart::Assert::Fail(char const*, ...)+0x82
  pc 0x000055eda5c7ac39 fp 0x00007faa351fd940 dart::ObjectLayout::VisitPointersPredefined(dart::ObjectPointerVisitor*, long)+0x4f9
  pc 0x000055eda5d2240d fp 0x00007faa351fd970 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbd40d
  pc 0x000055eda5d21c18 fp 0x00007faa351fda90 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbcc18
  pc 0x000055eda5d207c9 fp 0x00007faa351fdb00 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1cbb7c9
  pc 0x000055eda5d1ff5b fp 0x00007faa351fdbf0 dart::Scavenger::ParallelScavenge(dart::SemiSpace*)+0x23b
  pc 0x000055eda5d1f8d7 fp 0x00007faa351fdce0 dart::Scavenger::Scavenge()+0x187
  pc 0x000055eda5d120ba fp 0x00007faa351fddc0 dart::Heap::CollectNewSpaceGarbage(dart::Thread*, dart::Heap::GCReason)+0x1da
  pc 0x000055eda5d10961 fp 0x00007faa351fddf0 dart::Heap::AllocateNew(long)+0x151
  pc 0x000055eda5be759c fp 0x00007faa351fde40 dart::Object::Allocate(long, long, dart::Heap::Space)+0x6c
  pc 0x000055eda5c34a26 fp 0x00007faa351fde90 dart::TypedData::New(long, long, dart::Heap::Space)+0x136
  pc 0x000055eda5bdfc10 fp 0x00007faa351fdf00 dart::NativeEntry::BootstrapNativeCallWrapper(_Dart_NativeArguments*, void (*)(_Dart_NativeArguments*))+0xa0
  pc 0x00007faa35401594 fp 0x00007faa351fdf40 Unknown symbol
  pc 0x00007faa33ba5f26 fp 0x00007faa351fdf80 Unknown symbol
  pc 0x00007faa2643078c fp 0x00007faa351fdfb0 Unknown symbol
  pc 0x00007faa2647fd89 fp 0x00007faa351fe008 Unknown symbol
  pc 0x00007fa929db3155 fp 0x00007faa351fe118 Unknown symbol
  pc 0x00007fa929dae22e fp 0x00007faa351fe1a0 Unknown symbol
  pc 0x00007faa2643b6f0 fp 0x00007faa351fe1e8 Unknown symbol
  pc 0x00007faa2647e78d fp 0x00007faa351fe220 Unknown symbol
  pc 0x00007fa929da8fca fp 0x00007faa351fe288 Unknown symbol
  pc 0x00007faa2645cf2a fp 0x00007faa351fe2d8 Unknown symbol
  pc 0x00007faa2645c639 fp 0x00007faa351fe308 Unknown symbol
  pc 0x00007faa33ba3e20 fp 0x00007faa351fe398 Unknown symbol
  pc 0x00007faa26436bf5 fp 0x00007faa351fe3f8 Unknown symbol
  pc 0x00007faa26437eb6 fp 0x00007faa351fe478 Unknown symbol
  pc 0x00007faa2643373b fp 0x00007faa351fe4e8 Unknown symbol
  pc 0x00007faa2643b182 fp 0x00007faa351fe538 Unknown symbol
  pc 0x00007faa264324e5 fp 0x00007faa351fe568 Unknown symbol
  pc 0x00007faa26406e3b fp 0x00007faa351fe600 Unknown symbol
  pc 0x00007faa26436bf5 fp 0x00007faa351fe660 Unknown symbol
  pc 0x00007faa26437eb6 fp 0x00007faa351fe6e0 Unknown symbol
  pc 0x00007faa2643373b fp 0x00007faa351fe750 Unknown symbol
  pc 0x00007faa2643b8b1 fp 0x00007faa351fe7a0 Unknown symbol
  pc 0x00007faa26456975 fp 0x00007faa351fe800 Unknown symbol
  pc 0x00007faa26454357 fp 0x00007faa351fe820 Unknown symbol
  pc 0x00007faa26452768 fp 0x00007faa351fe850 Unknown symbol
  pc 0x00007faa354018ff fp 0x00007faa351fe8c8 Unknown symbol
  pc 0x000055eda5b762e2 fp 0x00007faa351fe960 dart::DartEntry::InvokeCode(dart::Code const&, dart::Array const&, dart::Array const&, dart::Thread*)+0x112
  pc 0x000055eda5b76044 fp 0x00007faa351fe9f0 dart::DartEntry::InvokeFunction(dart::Function const&, dart::Array const&, dart::Array const&, unsigned long)+0x2d4
  pc 0x000055eda5b787c6 fp 0x00007faa351fea40 dart::DartLibraryCalls::HandleMessage(dart::Object const&, dart::Instance const&)+0x1f6
  pc 0x000055eda5bb0b9c fp 0x00007faa351fec30 dart::IsolateMessageHandler::HandleMessage(std::__2::unique_ptr<dart::Message, std::__2::default_delete<dart::Message> >)+0x4cc
  pc 0x000055eda5bde326 fp 0x00007faa351feca0 dart::MessageHandler::HandleMessages(dart::MonitorLocker*, bool, bool)+0x146
  pc 0x000055eda5bde9da fp 0x00007faa351fed00 dart::MessageHandler::TaskCallback()+0x1da
  pc 0x000055eda5cedd98 fp 0x00007faa351fed80 dart::ThreadPool::WorkerLoop(dart::ThreadPool::Worker*)+0x148
  pc 0x000055eda5cee26c fp 0x00007faa351fedb0 dart::ThreadPool::Worker::Main(unsigned long)+0x5c
  pc 0x000055eda5c6391d fp 0x00007faa351fee70 /home/danrubel/work/download/latest/dart-sdk/bin/dart+0x1bfe91d
-- End of DumpStackTrace
[exit     : sp(0) fp(0x7faa351fdf40) pc(0)]
[dart     : sp(0x7faa351fdf50) fp(0x7faa351fdf80) pc(0x7faa33ba5f26) dart:typed_data_Uint32List_Uint32List. ]
[dart     : sp(0x7faa351fdf90) fp(0x7faa351fdfb0) pc(0x7faa2643078c) *dart:collection__InternalLinkedHashMap@3220832__InternalLinkedHashMap@3220832. ]
[dart     : sp(0x7faa351fdfc0) fp(0x7faa351fe008) pc(0x7faa2647fd89) *dart:core_Map_Map._fromLiteral@0150898 ]
[dart     : sp(0x7faa351fe018) fp(0x7faa351fe118) pc(0x7fa929db3155) *package:kk_project_2020_covid19_som_imaqal/incremental_logic.dart_ConversationPerspective_notify ]
[dart     : sp(0x7faa351fe128) fp(0x7faa351fe1a0) pc(0x7fa929dae22e) *package:engine/v5.dart_IncrementalEngine__micro_step_opinion_update@23359308 ]
[dart     : sp(0x7faa351fe1b0) fp(0x7faa351fe1e8) pc(0x7faa2643b6f0) *package:engine/v5.dart_IncrementalEngine__micro_step@23359308 ]
[dart     : sp(0x7faa351fe1f8) fp(0x7faa351fe220) pc(0x7faa2647e78d) *package:engine/v5.dart_IncrementalEngine_step ]
[dart     : sp(0x7faa351fe230) fp(0x7faa351fe288) pc(0x7fa929da8fca) *package:engine/v5/dependency_tracer.dart_DependencyManager_addOpinionReactor ]
[dart     : sp(0x7faa351fe298) fp(0x7faa351fe2d8) pc(0x7faa2645cf2a) package:kk_project_2020_covid19_som_imaqal/incremental_logic.dart_::_postCatchup_init ]
[dart     : sp(0x7faa351fe2e8) fp(0x7faa351fe308) pc(0x7faa2645c639) file:///home/danrubel/work/GitRepos/Lark/KK-Project-2020-COVID19-SOM-IMAQAL.dart/bin/driver.dart_::_idle_finish_catchup ]
[dart     : sp(0x7faa351fe318) fp(0x7faa351fe398) pc(0x7faa33ba3e20) file:///home/danrubel/work/GitRepos/Lark/KK-Project-2020-COVID19-SOM-IMAQAL.dart/bin/driver.dart_::_main__async_op ]
[dart     : sp(0x7faa351fe3a8) fp(0x7faa351fe3f8) pc(0x7faa26436bf5) *dart:async__FutureListener@4048458_handleValue ]
[dart     : sp(0x7faa351fe408) fp(0x7faa351fe478) pc(0x7faa26437eb6) *dart:async__Future@4048458__propagateToListeners@4048458_handleValueCallback ]
[dart     : sp(0x7faa351fe488) fp(0x7faa351fe4e8) pc(0x7faa2643373b) *dart:async__Future@4048458__propagateToListeners@4048458 ]
[dart     : sp(0x7faa351fe4f8) fp(0x7faa351fe538) pc(0x7faa2643b182) *dart:async__AsyncAwaitCompleter@4048458_complete ]
[dart     : sp(0x7faa351fe548) fp(0x7faa351fe568) pc(0x7faa264324e5) *dart:async_::__completeOnAsyncReturn@4048458 ]
[dart     : sp(0x7faa351fe578) fp(0x7faa351fe600) pc(0x7faa26406e3b) file:///home/danrubel/work/GitRepos/Lark/KK-Project-2020-COVID19-SOM-IMAQAL.dart/bin/driver.dart_::_run_from_journal__async_op ]
[dart     : sp(0x7faa351fe610) fp(0x7faa351fe660) pc(0x7faa26436bf5) *dart:async__FutureListener@4048458_handleValue ]
[dart     : sp(0x7faa351fe670) fp(0x7faa351fe6e0) pc(0x7faa26437eb6) *dart:async__Future@4048458__propagateToListeners@4048458_handleValueCallback ]
[dart     : sp(0x7faa351fe6f0) fp(0x7faa351fe750) pc(0x7faa2643373b) *dart:async__Future@4048458__propagateToListeners@4048458 ]
[dart     : sp(0x7faa351fe760) fp(0x7faa351fe7a0) pc(0x7faa2643b8b1) *dart:async__Future@4048458__asyncCompleteWithValue@4048458_<anonymous closure> ]
[dart     : sp(0x7faa351fe7b0) fp(0x7faa351fe800) pc(0x7faa26456975) *dart:async_::__startMicrotaskLoop@4048458 ]
[dart     : sp(0x7faa351fe810) fp(0x7faa351fe820) pc(0x7faa26454357) *dart:async_::__startMicrotaskLoop@4048458__startMicrotaskLoop@4048458 ]
[dart     : sp(0x7faa351fe830) fp(0x7faa351fe850) pc(0x7faa26452768) *dart:isolate__RawReceivePortImpl@1026248__handleMessage@1026248 ]
[entry    : sp(0x7faa351fe860) fp(0x7faa351fe8c8) pc(0x7faa354018ff)]
Aborted (core dumped)
@keertip keertip added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Nov 16, 2020
@a-siva
Contributor

a-siva commented Nov 16, 2020

//cc @rmacnak-google

@a-siva
Contributor

a-siva commented Nov 16, 2020

We would need some reproduction instructions for this. Is it possible to provide some test code?

@a-siva a-siva added the crash Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash. label Nov 16, 2020
@lukechurch
Contributor

lukechurch commented Nov 16, 2020

Thanks for the quick response @a-siva ( @rmacnak-google ) - and nice to be in touch again

We've spent a couple of days trying to build a small reproduction example rather than a full production project, unfortunately without much success. So the reproduction currently includes a large datafile and bundle of confidential source code.

I think the easiest thing to do might be for us to give you SSH access to a development machine that's set up with the reproduction case. Would that work? Alternatively, if there's any extra information we could capture, we're happy to do so.

We've seen the crash on both our production and dev infra, both on Ubuntu 20.04.

@rmacnak-google
Contributor

Hi @danrubel and @lukechurch,

Lately there have been several fixes related to making the VM more reliable when it runs out of memory. AFAICT, these are newer than the 2.10.4 version reported in your stack traces, so you might first try to run your program on the most recent dev build and see if the crashes reproduce there.

The default value of --old-gen-heap-size is set at ~32GB to reflect a constraint coming from the default value of Linux's vm.max_map_count and the VM's heap page alignment (64k mappings * 512kB heap pages = 32GB). Since you are increasing --old-gen-heap-size, you might try increasing vm.max_map_count as well.

If neither of those fixes the issue, I will probably need access to a reproduction that I can interact with while making debugging changes to the VM. Unfortunately, for GC crashes like this, stack traces and log output are usually not sufficient to track down the bug.
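
The "64k mappings * 512kB heap pages = 32GB" arithmetic above can be turned into a quick check. The following is only a rough sketch, not anything from the VM itself: it assumes the --old-gen-heap-size value is in MB, that each 512 KiB heap page costs about one memory mapping, and it ignores every other mapping the process needs. The helper names are made up for illustration.

```python
from pathlib import Path

# Heap page size quoted in the comment above (assumption: one mapping per page).
HEAP_PAGE_KIB = 512


def heap_pages_needed(old_gen_heap_mb: int) -> int:
    """Number of 512 KiB heap pages needed to fully allocate the given heap (MB)."""
    heap_kib = old_gen_heap_mb * 1024
    return (heap_kib + HEAP_PAGE_KIB - 1) // HEAP_PAGE_KIB  # round up


def current_max_map_count(path: str = "/proc/sys/vm/max_map_count"):
    """Return the kernel's vm.max_map_count, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    heap_mb = 61440  # the --old-gen-heap-size value from the original report
    needed = heap_pages_needed(heap_mb)
    limit = current_max_map_count()
    print(f"heap pages needed for {heap_mb} MB: {needed}")
    if limit is None:
        print("vm.max_map_count not readable on this system")
    elif limit < needed:
        print(f"vm.max_map_count={limit} is below the page count; consider raising it")
    else:
        print(f"vm.max_map_count={limit} covers the heap pages alone")
```

Under these assumptions a 61440 MB heap needs 122880 pages, roughly twice the ~64k mappings mentioned above, which is consistent with the suggestion to raise vm.max_map_count along with --old-gen-heap-size.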

@danrubel
Author

Thank you @rmacnak-google for the suggestions and background information. I'll try the dev VM and various settings with our large data set later this week. In addition I'm trying to build a test case that exercises the VM in a way that causes this error condition to occur. I'll keep you posted as to my progress.

dart-bot pushed a commit that referenced this issue Nov 17, 2020
…nough to support fully allocating a max heap size given by --old_gen_heap_size.

TEST=manually adjusting --old_gen_heap_size
Bug: #44219
Change-Id: I7ef6d46aca85024028dbd64bddd84438d028f0cf
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/172361
Reviewed-by: Alexander Aprelev <aam@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
Commit-Queue: Ryan Macnak <rmacnak@google.com>
@danrubel
Author

@rmacnak-google From our testing, it appears that setting vm.max_map_count = 2 x old-gen-heap-size helps our program make further progress, but the VM still thrashes/hangs for a long time (hours in our case) before crashing again. The results of our restricted-memory tests (--old-gen-heap-size = 4000) show that vm.max_map_count = 8000 causes this same thrash/hang behavior, but that with vm.max_map_count = 10000 the Dart VM terminates "cleanly" with OOM: no hang, thrash, or non-OOM crash.
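
One way to read the numbers above, offered only as a sketch: if the heap uses 512 KiB pages (the figure quoted earlier in the thread) and the flag is in MB, then 2 x old-gen-heap-size is exactly the number of heap pages, so a limit of 8000 for a 4000 MB heap leaves no mappings for anything else. Whether that is the actual cause of the thrashing is a guess; the thread does not confirm it. The function name is hypothetical.

```python
# Assumption: 512 KiB heap pages and roughly one mapping per page.
HEAP_PAGE_KIB = 512


def mapping_headroom(old_gen_heap_mb: int, max_map_count: int) -> int:
    """Mappings left over after every heap page of a full heap has one."""
    heap_pages = old_gen_heap_mb * 1024 // HEAP_PAGE_KIB
    return max_map_count - heap_pages


# The two restricted-memory configurations reported above.
for limit in (8000, 10000):
    print(f"max_map_count={limit}: headroom={mapping_headroom(4000, limit)}")
# prints:
# max_map_count=8000: headroom=0
# max_map_count=10000: headroom=2000
```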

Results from a large data run are still pending, but I wanted to get back to you with what I've learned before the weekend.

@danrubel
Author

@rmacnak-google We are no longer seeing the crashes above given your suggestion to set vm.max_map_count.
Thank you!

@a-siva a-siva closed this as completed Dec 2, 2020