Bug: The server crashes when balancing with duplication #1589

ninsmiracle · 2023-08-28T14:34:28Z

Bug Report

I am encountering the same situation as described in issue #693.
However, I can reproduce this bug 100% of the time.

This bug appears on online servers when the cluster is running duplication and tries to do load balancing at the same time. It causes many machines to core dump.

What did you do?

Created a large-scale application with a substantial replica size (greater than 10GB).
Shut down two replica servers.
Waited for the cluster to become fully healthy.
Initiated data streaming (and waited for 30 minutes).
Enabled the duplication function (and waited for 30 minutes).
Restarted the two replica servers.
Set the meta level to lively, which started the balancing process.

What did you see instead?

The server crashed with the following coredump info:

#0  dsn::zlock::lock (this=this@entry=0x98)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/zlocks.cpp:89
89      /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/zlocks.cpp: No such file or directory.
(gdb) #0  dsn::zlock::lock (this=this@entry=0x98)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/zlocks.cpp:89
#1  0x00007f7d063eb9af in zauto_lock (lock=..., this=<synthetic pointer>)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/include/dsn/tool-api/zlocks.h:121
#2  dsn::replication::mutation_log::max_commit_on_disk (this=this@entry=0x0)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/replica/mutation_log.cpp:877
#3  0x00007f7d064bcf1b in dsn::replication::load_mutation::run (1678968919
    this=0x2fe2ce380)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/replica/duplication/duplication_pipeline.cpp:45
#4  0x00007f7d06625631 in dsn::task::exec_internal (
    this=this@entry=0xcc81a0870)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/task/task.cpp:176
#5  0x00007f7d0663ace2 in dsn::task_worker::loop (this=0x2e2e580)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/task/task_worker.cpp:224
#6  0x00007f7d0663ae60 in dsn::task_worker::run_internal (this=0x2e2e580)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/task/task_worker.cpp:204
#7  0x00007f7d052b9a2f in execute_native_thread_routine ()
   from /home/work/app/pegasus/alsgsrv-monetization-master/replica/package/bin/libdsn_utils.so
#8  0x00007f7d030c4dc5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f7d015c373d in clone () from /lib64/libc.so.6
(gdb) quit
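
Reading the backtrace: frame #2 shows `mutation_log::max_commit_on_disk` being called with `this=0x0`, so `load_mutation::run` appears to be using a null log pointer. The `this=0x98` in frame #0 is consistent with a lock member at a small offset inside that null object. The sketch below is only an illustration of the kind of guard that avoids this class of crash. `FakeMutationLog`, `FakeReplica`, and `try_max_commit` are hypothetical names, not Pegasus code, and this is not necessarily how #1590 fixes the problem.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a log object whose accessor takes a lock,
// similar in shape to frames #0-#2 of the backtrace.
class FakeMutationLog {
public:
    explicit FakeMutationLog(int64_t committed) : committed_(committed) {}

    int64_t max_commit_on_disk() const {
        std::lock_guard<std::mutex> guard(lock_);
        return committed_;
    }

private:
    mutable std::mutex lock_;
    int64_t committed_;
};

// Hypothetical replica whose log pointer may be empty, for example while
// the replica is being moved or closed by another operation.
struct FakeReplica {
    std::shared_ptr<FakeMutationLog> private_log;
};

// Copies the pointer first so the log stays alive for the duration of the
// call, then checks it. Returns false instead of dereferencing null.
bool try_max_commit(const FakeReplica &replica, int64_t *decree) {
    std::shared_ptr<FakeMutationLog> log = replica.private_log;
    if (!log) {
        return false; // caller can skip or retry this round
    }
    *decree = log->max_commit_on_disk();
    return true;
}
```

With a guard like this, a caller that finds no log can back off and retry later rather than crash. This sketch is single-threaded: if another thread can reset `private_log` concurrently, the copy itself would also need synchronization.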

What version of Pegasus are you using?

Pegasus 2.4

ninsmiracle added the type/bug label on Aug 28, 2023

ninsmiracle mentioned this issue on Aug 29, 2023 in: fix:dup conflict with balance #1590 (Open)
