Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Highly concurrent add/drop of logger sinks cause the server to crash #4

Open
F1F88 opened this issue Nov 6, 2024 · 4 comments · May be fixed by #13
Open

Highly concurrent add/drop of logger sinks cause the server to crash #4

F1F88 opened this issue Nov 6, 2024 · 4 comments · May be fixed by #13
Assignees
Labels
bug Something isn't working

Comments

@F1F88
Copy link
Owner

F1F88 commented Nov 6, 2024

现象

如下复现代码,执行 Crash() 时有较高几率出现服务器崩溃

原因

ref: spdlog-wiki

logger->sinks() 不是线程安全的,在高并发情况下修改可能会出现空指针异常,从而导致服务器崩溃。

影响范围

Logger.AddSink(), Logger.DropSink() 添加自 v1.0.0

异步(多线程) 添加自 v1.2.0

复现代码

#include <sourcemod>
#include <log4sp>

public void OnPluginStart()
{
    PrintToServer("****************** Log4sp Crash ******************");
    RegConsoleCmd("sm_log4sp_crash", CB_CommandCrash);
    RegConsoleCmd("sm_log4sp_no_crash", CB_CommandNoCrash);
}

Action CB_CommandCrash(int client, int args)
{
    RequestFrame(Crash);
    return Plugin_Handled;
}

void Crash()
{
    PrintToServer("========== Crash Start ==========");
    Logger logger = Logger.CreateServerConsoleLogger("logger-test-crash", true);

    for (int i = 0; i < 10; i++)
    {
        Sink sink = new ServerConsoleSinkMT();
        logger.AddSink(sink);
        logger.InfoAmxTpl("Test Server Console MT Log - Crash - %d", i);
        logger.DropSink(sink);
        delete sink;
    }

    delete logger;
    PrintToServer("========== Crash End ==========");
}

Action CB_CommandNoCrash(int client, int args)
{
    RequestFrame(NoCrash);
    return Plugin_Handled;
}

void NoCrash()
{
    PrintToServer("========== NoCrash Start ==========");

    for (int i = 0; i < 1000; i++)
    {
        Sink sinks[2];
        sinks[0] = new ServerConsoleSinkMT();
        sinks[1] = new ServerConsoleSinkMT();

        Logger logger = new Logger("logger-test-no-crash", sinks, 2, true);
        logger.InfoAmxTpl("Test Server Console MT Log - NoCrash - %d", i);

        delete sinks[0];
        delete sinks[1];
        delete logger;
    }

    PrintToServer("========== NoCrash End ==========");
}

解决方案

A. 移除多线程

理由

  • 单线程的性能很好,每秒能写入上百万条(实测约为 300 万)日志数据到文件里

  • 在性能实测中,多线程因为锁的原因,大部分情况都要更慢

  • 多线程唯一比单线程快,在于多线程输出到控制台,且队列已满时丢弃

。单线程即使最慢的控制台,每秒也能输出十多万条数据

  • 单线程更简单,代码简洁清晰

  • 影响范围较大,且将失去异步记录日志的特性

B. 移除 Logger.AddSink(), Logger.DropSink()

  • 移除后仍有 new Logger() 的方案创建含有多 Sink 的 Logger

  • 涉及的改动相对较少,影响范围略大(多 Sink 的 Logger 只有 2 种创建方式,而这移除的是其一)

C. 解决并发问题

  • 在修改 Logger->sinks() 时添加锁

  • 影响最小

  • 难度,复杂度,耗时可能偏高

D. 用户自己控制

  • 模仿 spdlog,由用户自己避开高并发调用 AddSink(), DropSink()
@F1F88
Copy link
Owner Author

F1F88 commented Nov 9, 2024

崩溃条件

  1. 创建异步 Logger
  2. 创建多线程 Sink
  3. 使用 Logger.AddSink() 将多线程 Sink 添加到异步 Logger
  4. 使用异步 Logger 打印一条日志消息
  5. 使用 Logger.DropSink() 将多线程 Sink 移除出异步 Logger
  6. 使用 delete Sink; 将多线程 Sink 从注册器中移除

假设线程 A 执行到步骤 4 时,代码进入到了遍历所有 sinks 打印日志,但没有遍历到步骤 3 添加的多线程 Sink 来输出;

接着线程 B 执行了步骤 5 和步骤 6,使得内存里的多线程 Sink 引用数归零,从而被智能指针删除;

最后线程 A 轮到继续执行,准备调用多线程 Sink 来输出消息时,由于线程 B 导致多线程 Sink 成了野指针,就会造成服务器崩溃。


虽然 Logger.AddSink() 也存在线程安全问题,但不会导致服务器崩溃

#include <sourcemod>
#include <log4sp>

public void OnPluginStart()
{
    RegConsoleCmd("sm_log4sp_no_crash", CB_CommandNoCrash);
}

Action CB_CommandNoCrash(int client, int args)
{
    RequestFrame(NoCrash);
    return Plugin_Handled;
}

void NoCrash()
{
    PrintToServer("========== NoCrash Start ==========");

    for (int i = 0; i < 50000; i++)
    {
        Logger logger = Logger.CreateServerConsoleLogger("logger-test-crash", true);
        Sink sink = new ServerConsoleSinkMT();
        logger.AddSink(sink);
        logger.InfoAmxTpl("Test Server Console MT Log - NoCrash - %d", i);
        delete sink;
        delete logger;
    }

    PrintToServer("========== NoCrash End ==========");
}

@F1F88 F1F88 pinned this issue Nov 9, 2024
@F1F88 F1F88 unpinned this issue Nov 9, 2024
@F1F88 F1F88 closed this as completed in 2c41f98 Nov 9, 2024
@F1F88
Copy link
Owner Author

F1F88 commented Nov 20, 2024

线程安全文档:https://github.com/gabime/spdlog/wiki/1.1.-Thread-Safety#non-thread-safe-functions

动态添加 sink 的讨论:gabime/spdlog#674

@F1F88 F1F88 reopened this Nov 22, 2024
@F1F88
Copy link
Owner Author

F1F88 commented Nov 22, 2024

Since random crashes occurred again during high-concurrency testing, this issue is reopened.
The cause of the crash is likely caused by AddSink(). I think it is still necessary to add a lock mechanism or use a thread-safe method (such as dist_sink) to fix this problem.


由于高并发测试时再次出现了随机崩溃,重新打开此Issue。
崩溃的原因很可能是AddSink()造成的,我认为还是需要增加锁机制或者使用线程安全的方法(如:dist_sink)来修复这个问题。

@F1F88 F1F88 changed the title 高并发修改 logger 的 sinks 可能会导致服务器崩溃 Highly concurrent add/drop of logger sinks cause the server to crash Nov 22, 2024
@F1F88
Copy link
Owner Author

F1F88 commented Nov 22, 2024

保存测试并发问题的 cpp 代码

#include <iostream>
using namespace std;

#include "spdlog/async.h"
#include "spdlog/spdlog.h"
#include "spdlog/sinks/stdout_sinks.h"
#include "spdlog/sinks/dist_sink.h"

std::shared_ptr<spdlog::logger> logger;

int number = 10000;

void noCrash()
{
    for (int i = 0; i < 10000; ++i)
    {
        auto dist_sink = std::dynamic_pointer_cast<spdlog::sinks::dist_sink_mt>(logger->sinks().front());
        auto sink = std::make_shared<spdlog::sinks::stdout_sink_mt>();
        dist_sink->add_sink(sink);
        logger->info("===== nihao ===== - {:10d} {:10d} {:10d} {:10d} {:10d} {:10d}=====", i, i, i, i, i, number--);
        dist_sink->remove_sink(sink);
    }
}

int main()
{
    spdlog::init_thread_pool(32, 10);

    auto sink = std::make_shared<spdlog::sinks::stdout_sink_mt>();
    auto sinks = std::vector<spdlog::sink_ptr>{sink};
    auto dist_sink = std::make_shared<spdlog::sinks::dist_sink_mt>(sinks);
    logger = std::make_shared<spdlog::async_logger>("test-no-crash", dist_sink, spdlog::thread_pool());

    std::thread t1(noCrash);
    std::thread t2(noCrash);
    std::thread t3(noCrash);
    std::thread t4(noCrash);

    t1.join();
    t2.join();
    t3.join();
    t4.join();

    spdlog::drop("test-no-crash");
    spdlog::shutdown();
    return 0;
}
#include <iostream>
using namespace std;

#include "spdlog/async.h"
#include "spdlog/spdlog.h"
#include "spdlog/sinks/stdout_sinks.h"
#include "spdlog/sinks/dist_sink.h"

std::shared_ptr<spdlog::logger> logger;

int number = 10000;

void crash()
{
    for (int i = 0; i < 10000; ++i)
    {
        logger->sinks().push_back(std::make_shared<spdlog::sinks::stdout_sink_mt>());
        logger->info("===== nihao ===== A - {:10d} {:10d} {:10d} {:10d} {:10d} {:10d}=====", i, i, i, i, i, number--);
        logger->sinks().pop_back();
    }
}

int main()
{
    spdlog::init_thread_pool(32, 10);

    auto sink = std::make_shared<spdlog::sinks::stdout_sink_mt>();
    logger = std::make_shared<spdlog::async_logger>("test-crash", sink, spdlog::thread_pool());

    std::thread t1(crash);
    std::thread t2(crash);
    std::thread t3(crash);
    std::thread t4(crash);

    t1.join();
    t2.join();
    t3.join();
    t4.join();

    spdlog::drop("test-crash");
    spdlog::shutdown();
    return 0;
}

@F1F88 F1F88 added the bug Something isn't working label Nov 28, 2024
@F1F88 F1F88 self-assigned this Nov 28, 2024
F1F88 added a commit that referenced this issue Dec 3, 2024
@F1F88 F1F88 linked a pull request Dec 9, 2024 that will close this issue
@F1F88 F1F88 linked a pull request Dec 11, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant