Bug: The server crashes when balancing with duplication enabled #1589

Open
ninsmiracle opened this issue Aug 28, 2023 · 0 comments
Labels
type/bug This issue reports a bug.

ninsmiracle commented Aug 28, 2023

Bug Report

I am encountering the same situation as described in issue #693.
However, I can reproduce this bug 100% of the time.


This bug appears on online servers when the cluster is running duplication and load balancing at the same time. It causes many machines to dump core.

What did you do?

  1. Created a large-scale application with a substantial replica size (greater than 10GB).
  2. Shut down two replica servers.
  3. Waited for the cluster to become fully healthy.
  4. Initiated data streaming (and waited for 30 minutes).
  5. Enabled the duplication function (and waited for 30 minutes).
  6. Restarted the two replica servers.
  7. Set the meta level to lively, initiating the balancing process.

What did you see instead?

The server crashed with the following coredump info:

#0  dsn::zlock::lock (this=this@entry=0x98)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/zlocks.cpp:89
89      /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/zlocks.cpp: No such file or directory.
(gdb) #0  dsn::zlock::lock (this=this@entry=0x98)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/zlocks.cpp:89
#1  0x00007f7d063eb9af in zauto_lock (lock=..., this=<synthetic pointer>)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/include/dsn/tool-api/zlocks.h:121
#2  dsn::replication::mutation_log::max_commit_on_disk (this=this@entry=0x0)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/replica/mutation_log.cpp:877
#3  0x00007f7d064bcf1b in dsn::replication::load_mutation::run (
    this=0x2fe2ce380)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/replica/duplication/duplication_pipeline.cpp:45
#4  0x00007f7d06625631 in dsn::task::exec_internal (
    this=this@entry=0xcc81a0870)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/task/task.cpp:176
#5  0x00007f7d0663ace2 in dsn::task_worker::loop (this=0x2e2e580)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/task/task_worker.cpp:224
#6  0x00007f7d0663ae60 in dsn::task_worker::run_internal (this=0x2e2e580)
    at /home/jiashuo1/work/incubator-pegasus/src/rdsn/src/runtime/task/task_worker.cpp:204
#7  0x00007f7d052b9a2f in execute_native_thread_routine ()
   from /home/work/app/pegasus/alsgsrv-monetization-master/replica/package/bin/libdsn_utils.so
#8  0x00007f7d030c4dc5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f7d015c373d in clone () from /lib64/libc.so.6
(gdb) quit
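
In the backtrace, frame #2 shows `mutation_log::max_commit_on_disk` called with `this=0x0`, and frame #0 shows `zlock::lock` with `this=0x98`. That pattern is consistent with a member function being called through a null log pointer, so that the lock member's address equals its offset inside the object. The following is a minimal C++ sketch of that pattern and of a possible null guard. `fake_mutation_log`, its padding size, and `safe_max_commit_on_disk` are hypothetical names and values chosen for illustration; they are not the real Pegasus types or layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <optional>

// Hypothetical stand-in for a log object. This is NOT the real
// dsn::replication::mutation_log; the padding is an assumption chosen so
// that the lock member lands at offset 0x98, matching the trace.
struct fake_mutation_log {
    char other_state[0x98];
    std::mutex lock;
    int64_t max_commit = 0;

    // Takes the lock before reading, like the frame #1 -> #0 sequence.
    int64_t max_commit_on_disk() {
        std::lock_guard<std::mutex> guard(lock);
        return max_commit;
    }
};

// If the object pointer were null, the lock would be addressed at
// (0 + offset of lock), which is the small `this` value seen in frame #0.
inline std::size_t lock_address_for_null_object() {
    return offsetof(fake_mutation_log, lock);
}

// Hypothetical guard: return no value instead of dereferencing a null log.
inline std::optional<int64_t>
safe_max_commit_on_disk(const std::shared_ptr<fake_mutation_log> &log) {
    if (!log) {
        return std::nullopt;
    }
    return log->max_commit_on_disk();
}
```

With such a guard, a caller in the duplication pipeline could skip or reschedule the load step when the log is not yet available, rather than crashing. Whether that is the right fix for this issue depends on why the log pointer is null during balancing, which the trace alone does not show.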

What version of Pegasus are you using?

Pegasus 2.4
