Node.js v14.15.5 segfault in v8::internal::ConcurrentMarking::Run #37553
Any update on this?
It can be reproduced on macOS, with an exception.
I also observe that the memory usage is huge (~2.0 GB real memory usage, ~4.0 GB memory usage, according to Activity Monitor). Could it be related to memory exhaustion or a leak?
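For anyone trying to correlate the crash with heap growth, the same numbers can be logged from inside the process with the standard process.memoryUsage() API; a minimal sketch:

```js
// Log memory statistics every 5 s so growth can be correlated with the crash.
const toMB = (n) => (n / 1024 / 1024).toFixed(1) + ' MB';
setInterval(() => {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  console.log(`rss=${toMB(rss)} heapTotal=${toMB(heapTotal)} heapUsed=${toMB(heapUsed)} external=${toMB(external)}`);
}, 5000).unref();
```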
Because I'm not very familiar with the C++ backend, I'm going to label this with c++ so the problem can be identified further.
I'm sorry it's not more useful, but this bug is a candidate for the continuous segfaults in our own system since upgrading from Node 12 to 14:
The segfault always happens on JSON.parse. Unfortunately I'm running within an AWS Lambda environment, so there are no dumps. I'm working on recreating it locally, but it seems like this is the same issue (which is what gave me the hint to isolate the JSON parsing in the first place). At the moment I know the following:
I'm currently working on a local / smaller reproduction, although I regret my C++ is unlikely to be good enough to provide any real insights here; I just wanted to register my 'it is not just you'. (Edit: now with a somewhat identical backtrace.)
General information about node instance
List of all threads
Threads' backtrace
This happens if you use the
I wonder if #37106 (SIGSEGV for address: 0x0 in ConcurrentMarking::Run) is related; similar story: upgrading from 12 to 14 results in sporadic segfaults, seemingly when garbage collection kicks in during concurrent socket requests.
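One way to probe that GC connection (a guess, not something the thread confirms) is to force frequent collections while the socket load runs; global.gc() is available when Node is started with --expose-gc:

```js
// Run with: node --expose-gc app.js
// Forcing frequent GC should make a GC-related crash surface sooner.
if (typeof global.gc === 'function') {
  setInterval(() => global.gc(), 100).unref();
} else {
  console.warn('start node with --expose-gc to enable forced GC');
}
```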
@thomasmichaelwallace thank you for your confirmation. From my point of view #37106 may be related, or even the same issue. However, from the issue description I was not able to derive a direct connection between the two problems. As far as I can tell, the segfault in our use-case is caused by an error in the Node.js/V8-internal data-structure/memory management that gets triggered under high load in the main thread while reading big data chunks from sockets. From my research I found out that in the past @gireeshpunathil solved some issues in a similar context (#25814). Maybe we should ask him for advice?
let us start by understanding the failing context a little deeper:
just want to state that it is going to be an iterative process!
Hi @gireeshpunathil, thank you for your fast reply. Besides, in the issue description I referenced a repository that contains a sample application that should eventually recreate the issue.
never mind, I am able to recreate with your sample program! thanks for the nice setup for the recreate!
I will debug and let you know!
Thank you for taking a look at this @gireeshpunathil! It would seem too much of a coincidence for mine and hellivan's socket+gc segfaults to have different underlying causes, so I'm going to trust that his reproduction repo will be enough. For what it's worth, here are the results of my following those commands; just in case it becomes immediately obvious to you that mine is a different issue, which I should separately raise:
Hello! I am the creator of #37106 and it looks like my issue is exactly the same.
Thank you very much for your efforts @gireeshpunathil. If it still helps, this would be the output of my debug session:
@bolt-juri-gavshin - thanks. @hellivan - thanks for the detailed data. As I said, I have the repro now! An interim update:
(gdb) set disassembly-flavor intel
(gdb) x/i $rip
=> 0xcff9c4 <_ZN2v88internal17ConcurrentMarking3RunEiPNS1_9TaskStateE+1364>:
movzx eax,BYTE PTR [r14]
(gdb) i r r14
r14 0x3938363632303039 4123105065255841849
(gdb) x/b $r14
0x3938363632303039: Cannot access memory at address 0x3938363632303039
(gdb)
Thanks for the update @gireeshpunathil. I might be misunderstanding your comment, but both hellivan and I can reproduce in
@thomasmichaelwallace - sorry for the confusion - ok, I will then reword as
Is it a coincidence that the address is a numeric ASCII string?
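Decoding the register value byte by byte (assuming the usual little-endian layout) does give a run of ASCII digits, possibly number text from the JSON being parsed; a quick check:

```js
// Decode 0x3938363632303039 least-significant byte first, as ASCII.
let v = 0x3938363632303039n;
let s = '';
while (v > 0n) {
  s += String.fromCharCode(Number(v & 0xffn));
  v >>= 8n;
}
console.log(s); // "90026689" - eight ASCII digits, not a plausible pointer
```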
Any update on this?
Just for information:
I was unable to spend time on this. /cc @nodejs/v8 in case they find anything obvious from the failing context.
Any update on this? Maybe we can help with something?
I'm very much out of my depth with C++, but I wonder if it's worth taking it to the V8 team (I can't imagine Node uses its own garbage collector?). [Definitely still a major issue for us, causing about 100 crashes an hour in production.]
I am sorry that it is causing visible impact to your production. Unfortunately I am out of ideas - a consistent recreate, but it does not yield to a debug build or valgrind! Pinging debug specialists @addaleax, @mhdawson and @mmarchini to see if there are other techniques to debug memory-corruption issues without valgrind.
@gireeshpunathil, one thought I have is trying to see if it is related to V8 JIT activity. A few things to try would include:
@gireeshpunathil one more question: when you ran valgrind, did you use a debug build or the regular build? I've recently seen a case where compiling with debug eliminated the problem (likely due to it initializing variables to 0) but valgrind on the non-debug build did report issues.
@mhdawson - thanks! Plain release: crash. By the way, are there less pervasive options in valgrind? I can see the available options, but I'm not sure about the degree of influence those will have on the runtime behavior.
@gireeshpunathil in terms of less pervasive options with valgrind, I'm not aware of any. I know you can enable additional checking/generation of info but I usually start by just running with the defaults.
Another suggestion is to narrow down the "passes with version X, fails with version Y" range to two sequential releases. At that point you can look at the changes between those two releases and see if there is a V8 update or any other change that looks like it could be related. @hellivan, that might be something you could do. It might still be a large number of changes if it's something like "passes with the latest 12.x, fails with 14.0.0", but still useful I think.
So, taking on board your suggestion, @mhdawson, I've discovered the following:
Unfortunately, as far as I can tell, v13 -> v14 is 1890 commits (Change Log). Critically, though, it does include upgrading V8 to 8.1.307.20, which might provide some insight? If there are any more versions worth testing to gain insights (I'm assuming v13.14 -> v14.0.0 are consecutive), I'm happy to do the legwork. I have noticed that the segmentation fault seems to happen nearly immediately in v14.0.0, but takes longer in v14.17.0.
sure, but luckily we have git bisect!
my observation was different from that of @thomasmichaelwallace, in terms of passing and failing versions; so don't take this as final and conclusive.
Thanks for introducing me to git bisect! My test to reproduce stops at: 2883c85 "deps: update V8 to 8.1.307.20". I would guess updating V8 makes sense as a candidate for the change in behaviour that leads to this bug. Annoyingly I can't actually get this specific commit to build because
At the risk of oversharing:
I do not know if it is important, but the final two
Just to follow up (again; sorry) - I did a valgrind run. I guess the question is (and remember, my C++ is limited to the ability to build other people's stuff): what can I do, now that I have this, to debug the problem? I'm happy to share anything.
Output from valgrind:
@thomasmichaelwallace I think --track-origins=yes might be more informative than --leak-check=yes here, but that’s just a guess
Thanks for the prompt:
(I'll update with the suggested output.)
As I previously pointed out, I always had a null MemoryChunk access during the scavenge:
(gdb) where
#0 0x00005564483eda05 in v8::internal::MemoryChunk::InYoungGeneration (
this=0x0) at ../deps/v8/src/heap/spaces.h:837
837 return (GetFlags() & kIsInYoungGenerationMask) != 0;
#1 v8::internal::Heap::InYoungGeneration (heap_object=...)
at ../deps/v8/src/heap/heap-inl.h:389
#2 0x000055644849ac41 in v8::internal::Scavenger::ScavengeObject<v8::internal::FullHeapObjectSlot> (this=this@entry=0x55644cbcb890, p=p@entry=...,
object=object@entry=...) at ../deps/v8/src/objects/heap-object.h:219
#3 0x000055644849e0ea in v8::internal::ScavengeVisitor::VisitHeapObjectImpl<v8::internal::FullObjectSlot> (this=0x7ffe4d4a8d00, heap_object=..., slot=...)
at ../deps/v8/src/base/macros.h:365
#4 v8::internal::ScavengeVisitor::VisitPointersImpl<v8::internal::FullObjectSlot> (end=..., start=..., this=<optimized out>, host=...)
at ../deps/v8/src/heap/scavenger-inl.h:474
#5 v8::internal::ScavengeVisitor::VisitPointers (end=..., start=...,
host=..., this=<optimized out>) at ../deps/v8/src/heap/scavenger-inl.h:427
#6 v8::internal::BodyDescriptorBase::IteratePointers<v8::internal::ScavengeVisitor> (obj=..., obj@entry=..., end_offset=end_offset@entry=112,
v=v@entry=0x7ffe4d4a8d00, start_offset=8)
at ../deps/v8/src/objects/objects-body-descriptors-inl.h:127
#7 0x000055644849fcda in v8::internal::FlexibleBodyDescriptor<8>::IterateBody<v8::internal::ScavengeVisitor> (v=0x7ffe4d4a8d00, object_size=112, obj=...,
map=...) at ../deps/v8/src/objects/objects-body-descriptors.h:118
#8 v8::internal::HeapVisitor<int, v8::internal::ScavengeVisitor>::VisitStruct
(object=..., map=..., this=0x7ffe4d4a8d00)
at ../deps/v8/src/heap/objects-visiting-inl.h:154
#9 v8::internal::HeapVisitor<int, v8::internal::ScavengeVisitor>::Visit (
this=this@entry=0x7ffe4d4a8d00, map=..., object=object@entry=...)
at ../deps/v8/src/heap/objects-visiting-inl.h:59
#10 0x00005564484a3fd9 in v8::internal::HeapVisitor<int, v8::internal::ScavengeVisitor>::Visit (object=..., this=0x7ffe4d4a8d00)
at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#11 v8::internal::Scavenger::Process (this=0x55644cbcb890,
barrier=<optimized out>) at ../deps/v8/src/heap/scavenger.cc:547
#12 0x00005564484a49ca in v8::internal::ScavengingTask::ProcessItems (
this=0x55644cbf3b10) at ../deps/v8/src/heap/scavenger.cc:70
#13 v8::internal::ScavengingTask::RunInParallel (this=0x55644cbf3b10,
runner=<optimized out>) at ../deps/v8/src/heap/scavenger.cc:49
#14 0x000055644842eebf in v8::internal::ItemParallelJob::Task::RunInternal (
this=<optimized out>) at ../deps/v8/src/heap/item-parallel-job.cc:34
#15 v8::internal::CancelableTask::Run (this=<optimized out>)
at ../deps/v8/src/tasks/cancelable-task.h:155
#16 v8::internal::ItemParallelJob::Run (this=this@entry=0x7ffe4d4a9010)
at ../deps/v8/src/heap/item-parallel-job.cc:103
#17 0x00005564484a20ed in v8::internal::ScavengerCollector::CollectGarbage (
this=0x55644cb46880) at ../deps/v8/src/heap/scavenger.cc:303
#18 0x00005564483f1083 in v8::internal::Heap::Scavenge (
this=this@entry=0x55644caddb90) at /usr/include/c++/9/bits/unique_ptr.h:360
#19 0x000055644841e454 in v8::internal::Heap::PerformGarbageCollection (
this=this@entry=0x55644caddb90,
collector=collector@entry=v8::internal::SCAVENGER,
gc_callback_flags=gc_callback_flags@entry=v8::kNoGCCallbackFlags)
at ../deps/v8/src/heap/heap.cc:2028
#20 0x000055644841ed62 in v8::internal::Heap::CollectGarbage (
this=this@entry=0x55644caddb90, space=space@entry=v8::internal::NEW_SPACE,
gc_reason=gc_reason@entry=v8::internal::GarbageCollectionReason::kAllocationFailure, gc_callback_flags=gc_callback_flags@entry=v8::kNoGCCallbackFlags)
at ../deps/v8/src/heap/heap.cc:1587
#21 0x0000556448421daf in v8::internal::Heap::AllocateRawWithLightRetrySlowPath
(this=this@entry=0x55644caddb90, size=size@entry=16,
allocation=v8::internal::AllocationType::kYoung,
origin=origin@entry=v8::internal::AllocationOrigin::kRuntime,
alignment=alignment@entry=v8::internal::kDoubleUnaligned)
at ../deps/v8/include/v8-internal.h:223
#22 0x0000556448421f45 in v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath (this=this@entry=0x55644caddb90, size=size@entry=16,
allocation=<optimized out>,
origin=origin@entry=v8::internal::AllocationOrigin::kRuntime,
alignment=alignment@entry=v8::internal::kDoubleUnaligned)
at ../deps/v8/src/heap/heap.cc:5000
#23 0x00005564483c55a1 in v8::internal::Heap::AllocateRawWith<(v8::internal::Heap::AllocationRetryMode)1> (this=this@entry=0x55644caddb90, size=size@entry=16,
allocation=allocation@entry=v8::internal::AllocationType::kYoung,
origin=origin@entry=v8::internal::AllocationOrigin::kRuntime,
alignment=alignment@entry=v8::internal::kDoubleUnaligned)
at ../deps/v8/src/objects/heap-object.h:108
#24 0x00005564483c56e8 in v8::internal::Factory::AllocateRaw (
this=this@entry=0x55644cad4880, size=size@entry=16,
allocation=allocation@entry=v8::internal::AllocationType::kYoung,
alignment=alignment@entry=v8::internal::kDoubleUnaligned)
at ../deps/v8/src/execution/isolate.h:913
#25 0x00005564483ab919 in v8::internal::FactoryBase<v8::internal::Factory>::AllocateRaw (this=this@entry=0x55644cad4880, size=size@entry=16,
allocation=allocation@entry=v8::internal::AllocationType::kYoung,
alignment=alignment@entry=v8::internal::kDoubleUnaligned)
at ../deps/v8/src/heap/factory-base.cc:236
#26 0x00005564483ab930 in v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap (this=this@entry=0x55644cad4880, size=size@entry=16,
allocation=allocation@entry=v8::internal::AllocationType::kYoung, map=...,
alignment=alignment@entry=v8::internal::kDoubleUnaligned)
at ../deps/v8/src/heap/factory-base.cc:227
#27 0x00005564483b24d1 in v8::internal::Factory::NewHeapNumber<(v8::internal::AllocationType)0> (this=this@entry=0x55644cad4880)
at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#28 0x00005564483b42a2 in v8::internal::Factory::NewHeapNumber<(v8::internal::AllocationType)0> (value=6.9529954760082175e-310, this=0x55644cad4880)
at ../deps/v8/src/heap/factory-inl.h:66
#29 v8::internal::Factory::NewNumber<(v8::internal::AllocationType)0> (
this=0x55644cad4880, value=value@entry=0.33400000000000002)
at ../deps/v8/src/heap/factory.cc:2029
#30 0x000055644857fe19 in v8::internal::JsonParser<unsigned short>::ParseJsonNumber (this=this@entry=0x7ffe4d4aa580) at ../deps/v8/src/execution/isolate.h:1059
#31 0x0000556448581278 in v8::internal::JsonParser<unsigned short>::ParseJsonValue (this=this@entry=0x7ffe4d4aa580)
at /usr/include/c++/9/ext/new_allocator.h:89
#32 0x0000556448581be2 in v8::internal::JsonParser<unsigned short>::ParseJson (
this=this@entry=0x7ffe4d4aa580) at ../deps/v8/src/json/json-parser.cc:309
#33 0x00005564481e2558 in v8::internal::JsonParser<unsigned short>::Parse (
reviver=..., source=..., isolate=0x55644cad4880)
at ../deps/v8/src/handles/handles.h:108
#34 v8::internal::Builtin_Impl_JsonParse (args=...,
isolate=isolate@entry=0x55644cad4880)
at ../deps/v8/src/builtins/builtins-json.cc:24
#35 0x00005564481e3bd0 in v8::internal::Builtin_JsonParse (args_length=6,
args_object=0x7ffe4d4aa690, isolate=0x55644cad4880)
at ../deps/v8/src/builtins/builtins-json.cc:16
#36 0x0000556448fa1ba0 in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit () at ../../deps/v8/../../deps/v8/src/builtins/promise-misc.tq:91
#37 0x0000556448d9f458 in Builtins_InterpreterEntryTrampoline ()
at ../../deps/v8/../../deps/v8/src/objects/string.tq:72
(gdb) p this
$1 = (const v8::internal::MemoryChunk * const) 0x0
ok - another thread too seems to have entered the scavenge cycle, which is in the process of re-arranging the chunks / objects. Can scavenge run in parallel? If so, how do the threads co-ordinate?
(gdb) t 3
(gdb) where
#0 0x0000557bca39e3f0 in v8::base::List<v8::internal::MemoryChunk>::Contains (
this=0x557bce712f70, element=0x49b03cc0000) at ../deps/v8/src/base/list.h:60
#1 v8::base::List<v8::internal::MemoryChunk>::Remove (element=0x49b03cc0000,
this=0x557bce712f70) at ../deps/v8/src/base/list.h:43
#2 v8::internal::PagedSpace::RemovePage (this=this@entry=0x557bce712f50,
page=page@entry=0x49b03cc0000) at ../deps/v8/src/heap/spaces.cc:1841
#3 0x0000557bca3acba0 in v8::internal::PagedSpace::RefillFreeList (
this=0x557bce7d0590) at ../deps/v8/src/heap/spaces.cc:1700
#4 0x0000557bca3a9f5e in v8::internal::PagedSpace::RawSlowRefillLinearAllocationArea (this=0x557bce7d0590, size_in_bytes=32,
origin=v8::internal::AllocationOrigin::kGC)
at ../deps/v8/src/heap/spaces.cc:3809
#5 0x0000557bca29857a in v8::internal::PagedSpace::EnsureLinearAllocationArea (
origin=v8::internal::AllocationOrigin::kGC, size_in_bytes=32,
this=0x557bce7d0590) at ../deps/v8/src/heap/spaces-inl.h:387
#6 v8::internal::PagedSpace::EnsureLinearAllocationArea (
origin=v8::internal::AllocationOrigin::kGC, size_in_bytes=32,
this=0x557bce7d0590) at ../deps/v8/src/heap/spaces-inl.h:382
#7 v8::internal::PagedSpace::AllocateRawUnaligned (
this=this@entry=0x557bce7d0590, size_in_bytes=size_in_bytes@entry=32,
origin=origin@entry=v8::internal::AllocationOrigin::kGC)
at ../deps/v8/src/heap/spaces-inl.h:419
#8 0x0000557bca2994b4 in v8::internal::PagedSpace::AllocateRaw (
this=0x557bce7d0590, size_in_bytes=32, alignment=<optimized out>,
origin=v8::internal::AllocationOrigin::kGC)
at ../deps/v8/src/heap/spaces-inl.h:483
#9 0x0000557bca37f06b in v8::internal::LocalAllocator::Allocate (
alignment=v8::internal::kWordAligned,
origin=v8::internal::AllocationOrigin::kGC, object_size=32,
space=v8::internal::OLD_SPACE, this=0x557bce7d0578)
at ../deps/v8/src/heap/spaces.h:3107
#10 v8::internal::Scavenger::PromoteObject<v8::internal::FullHeapObjectSlot> (
object_fields=v8::internal::ObjectFields::kDataOnly, object_size=32,
object=..., slot=..., map=..., this=0x557bce7d04e0)
at ../deps/v8/src/heap/scavenger-inl.h:174
#11 v8::internal::Scavenger::EvacuateObjectDefault<v8::internal::FullHeapObjectSlot> (this=this@entry=0x557bce7d04e0, map=map@entry=..., slot=slot@entry=...,
object=..., object_size=object_size@entry=32,
object_fields=v8::internal::ObjectFields::kDataOnly)
at ../deps/v8/src/heap/scavenger-inl.h:259
#12 0x0000557bca37fae8 in v8::internal::Scavenger::EvacuateObject<v8::internal::FullHeapObjectSlot> (source=..., map=..., slot=..., this=0x557bce7d04e0)
at ../deps/v8/src/objects/map.h:814
#13 v8::internal::Scavenger::ScavengeObject<v8::internal::FullHeapObjectSlot> (
this=0x557bce7d04e0, p=p@entry=..., object=...)
at ../deps/v8/src/heap/scavenger-inl.h:396
#14 0x0000557bca38019f in v8::internal::IterateAndScavengePromotedObjectsVisitor::HandleSlot<v8::internal::FullHeapObjectSlot> (this=this@entry=0x7f385a7fbb20,
host=host@entry=..., slot=slot@entry=..., target=..., target@entry=...)
at ../deps/v8/src/base/atomic-utils.h:149
#15 0x0000557bca380635 in v8::internal::IterateAndScavengePromotedObjectsVisitor::VisitPointersImpl<v8::internal::FullObjectSlot> (end=..., start=..., host=...,
this=0x7f385a7fbb20) at ../deps/v8/src/base/macros.h:365
#16 v8::internal::IterateAndScavengePromotedObjectsVisitor::VisitPointers (
end=..., start=..., host=..., this=0x7f385a7fbb20)
at ../deps/v8/src/heap/scavenger.cc:94
#17 v8::internal::BodyDescriptorBase::IteratePointers<v8::internal::IterateAndScavengePromotedObjectsVisitor> (obj=obj@entry=...,
end_offset=end_offset@entry=112, v=v@entry=0x7f385a7fbb20, start_offset=8)
at ../deps/v8/src/objects/objects-body-descriptors-inl.h:127
#18 0x0000557bca380eef in v8::internal::FlexibleBodyDescriptor<8>::IterateBody<v8::internal::IterateAndScavengePromotedObjectsVisitor> (v=0x7f385a7fbb20,
object_size=112, obj=..., map=...)
at ../deps/v8/src/objects/objects-body-descriptors.h:118
#19 v8::internal::CallIterateBody::apply<v8::internal::FlexibleBodyDescriptor<8>, v8::internal::IterateAndScavengePromotedObjectsVisitor> (v=0x7f385a7fbb20,
object_size=112, obj=..., map=...)
at ../deps/v8/src/objects/objects-body-descriptors-inl.h:1088
#20 v8::internal::BodyDescriptorApply<v8::internal::CallIterateBody, void, v8::internal::Map, v8::internal::HeapObject, int, v8::internal::IterateAndScavengePromotedObjectsVisitor*> (p4=0x7f385a7fbb20, p3=112, p2=..., p1=...,
type=<optimized out>)
at ../deps/v8/src/objects/objects-body-descriptors-inl.h:1056
#21 v8::internal::HeapObject::IterateBodyFast<v8::internal::IterateAndScavengePromotedObjectsVisitor> (this=<synthetic pointer>, v=0x7f385a7fbb20,
object_size=112, map=...)
at ../deps/v8/src/objects/objects-body-descriptors-inl.h:1094
#22 v8::internal::Scavenger::IterateAndScavengePromotedObject (
this=this@entry=0x557bce7d04e0, target=target@entry=..., map=map@entry=...,
size=size@entry=112) at ../deps/v8/src/heap/scavenger.cc:471
#23 0x0000557bca389193 in v8::internal::Scavenger::Process (
this=0x557bce7d04e0, barrier=<optimized out>)
at ../deps/v8/src/heap/scavenger.cc:559
#24 0x0000557bca389b8a in v8::internal::ScavengingTask::ProcessItems (
this=0x557bd0014c70) at ../deps/v8/src/heap/scavenger.cc:70
#25 v8::internal::ScavengingTask::RunInParallel (this=0x557bd0014c70,
runner=<optimized out>) at ../deps/v8/src/heap/scavenger.cc:54
#26 0x0000557bca313a91 in v8::internal::ItemParallelJob::Task::RunInternal (
this=0x557bd0014c70) at ../deps/v8/src/heap/item-parallel-job.cc:34
#27 0x0000557bca16d121 in non-virtual thunk to v8::internal::CancelableTask::Run() () at ../deps/v8/src/heap/heap-write-barrier-inl.h:213
#28 0x0000557bc9dc28d7 in node::(anonymous namespace)::PlatformWorkerThread (
data=0x557bce693670) at ../src/node_platform.cc:43
#29 0x00007f3861162609 in start_thread (arg=<optimized out>)
at pthread_create.c:477
#30 0x00007f3861089293 in clone ()

/cc @addaleax @nodejs/v8
Is this issue what https://chromium-review.googlesource.com/c/v8/v8/+/2988414 fixes? |
highly probable - as the context seems similar. |
I was so excited by this that I immediately did the following:
As @gireeshpunathil says, it fits the narrative. The bug was [probably] introduced when V8 was updated; it seems to be caused by some combination of parsing JSON, from a buffer, in a threaded way, while the garbage collector kicks in, which is exactly what the patch addresses. So we have the fix!
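(For anyone tracking the fix: a quick way to see which V8 a given Node.js binary bundles, useful when checking whether a release line has picked up the patch; the exact fixed versions are in the refs below.)

```js
// Print the Node.js and bundled V8 versions of the running binary.
console.log(`node ${process.versions.node} / v8 ${process.versions.v8}`);
```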
thanks for confirming this @thomasmichaelwallace!! the patch will be consumed here naturally, but that takes its own sweet time. Pinging @targos to learn the standard procedure for getting V8 changes into Node: do we proactively PR master, cherry-pick a bunch of V8 patches occasionally, or consume V8 only on version boundaries?
Refs: v8/v8@9.1.269.36...9.1.269.38
Fixes: #37553
PR-URL: #39196
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Original commit message:

    [JSON] Fix GC issue in BuildJsonObject

    We must ensure that the sweeper is not running or has already swept
    mutable_double_buffer. Otherwise the GC can add it to the free list.

    Bug: v8:11837
    Change-Id: Ifd9cf15f1c94f664fd6489c70bb38b59730cdd78
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2928181
    Commit-Queue: Victor Gomes <victorgomes@chromium.org>
    Reviewed-by: Toon Verwaest <verwaest@chromium.org>
    Reviewed-by: Dominik Inführ <dinfuehr@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#74859}

Refs: v8/v8@81181a8
PR-URL: nodejs#39187
Fixes: nodejs#37553
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Version: 14.15.5
Platform: Linux WorkMachine 5.11.1-arch1-1 #1 SMP PREEMPT Tue, 23 Feb 2021 14:05:30 +0000 x86_64 GNU/Linux
What steps will reproduce the bug?
As far as we found out, the segfault happens if Node.js sends/receives lots of data via sockets and processes it in an expensive synchronous method (e.g. `JSON.parse`). The original problem involved some basic JSON data processing where the data was received from a RabbitMQ using the amqplib npm package. Meanwhile we were able to recreate the problem by only using Node.js internal mechanisms (the `net` package) in this sample repository: https://github.com/hellivan/nodejs-14.15.5-ConcurrentMarking-segfault
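For readers who cannot open the repository, the pattern it exercises is roughly the following (a condensed sketch, not the actual repro code; the port, payload shape, and sizes are invented). Fractional numbers are used because the crashing stacks allocate heap numbers inside `JSON.parse`:

```js
const net = require('net');

// Server: stream newline-delimited JSON documents as fast as backpressure allows.
const payload = JSON.stringify({ values: Array.from({ length: 1e5 }, (_, i) => i + 0.5) }) + '\n';
const server = net.createServer((socket) => {
  const pump = () => {
    while (socket.write(payload)) {} // write until the kernel buffer fills up
    socket.once('drain', pump);
  };
  pump();
});
server.listen(9000);

// Client: reassemble messages and parse them synchronously on the main thread.
const client = net.connect(9000, () => {
  let buffered = '';
  client.on('data', (chunk) => {
    buffered += chunk;
    let nl;
    while ((nl = buffered.indexOf('\n')) !== -1) {
      JSON.parse(buffered.slice(0, nl)); // expensive synchronous work per message
      buffered = buffered.slice(nl + 1);
    }
  });
});
```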
How often does it reproduce? Is there a required condition?
The error only reproduces under uncertain conditions that are difficult to replicate. Under normal circumstances, it is possible that the application runs for hours and then crashes for no apparent reason. However, it may also crash right after startup.
What is the expected behavior?
The Node.js runtime should execute the JS application without interruptions.
What do you see instead?
Node.js crashes with a SIGSEGV.
Additional information
During the analysis of the original application crashes, we were able to extract some coredumps, which are listed below. For privacy reasons we replaced some paths in the results. Due to the complexity of the original application, we created a reduced sample application, which we hope reproduces the same segmentation fault as the original one. During our tests we found out that other Node.js versions may be affected by this bug, too: we were able to sporadically reproduce the issue for Node.js versions 14.16.0 and 15.10.0.
If you need any help or information regarding the coredumps, please let me know.
1. Coredump
General information about node instance
List of all threads
Threads' backtrace
2. Coredump
List of all threads
Threads' backtrace
3. Coredump
List of all threads
Threads' backtrace