EditorHelp documentation generation thread can cause a deadlock during project load #83677

Open
bitsawer opened this issue Oct 20, 2023 · 1 comment

Comments

@bitsawer
Member

Godot version

4.2 custom dev (multiple versions)

System information

Windows 10

Issue description

Creating this issue as discussed on RocketChat.

Sometimes, when opening a project in the Godot editor, the editor hangs while loading the project. The process CPU usage is 0, and when the process is inspected with a C++ debugger, the main thread and the EditorHelp documentation loading thread (EditorHelp::_load_doc_thread()) appear to be deadlocked, both waiting for a lock. This happens very rarely: I have seen it probably fewer than 5 times during several months of testing, so the issue is likely very sensitive to thread timing and project structure. The first time I remember this happening was at least a few months ago.

This might be related to #82817. As discussed on RocketChat, the error message about editor_doc_cache.res being in use might also be related, since this is a file that EditorHelp creates and updates.

Call stack from the main thread, which appears to be deadlocked waiting for a write lock (built and run using llvm-mingw on Windows):

NtWaitForAlertByThreadId (@NtWaitForAlertByThreadId:8)
RtlSleepConditionVariableSRW (@RtlSleepConditionVariableSRW:86)
SleepConditionVariableSRW (@SleepConditionVariableSRW:14)
std::__1::__libcpp_condvar_wait(void**, void**) (@std::__1::__libcpp_condvar_wait(void**, void**):7)
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (@std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&):8)
std::__1::shared_timed_mutex::lock() (@std::__1::shared_timed_mutex::lock():34)
RWLock::write_lock() (e:\Repositories\godot\core\os\rw_lock.h:59)
RWLockWrite::RWLockWrite(RWLock&) (e:\Repositories\godot\core\os\rw_lock.h:92)
ClassDB::add_signal(StringName const&, MethodInfo const&) (e:\Repositories\godot\core\object\class_db.cpp:853)
DependencyRemoveDialog::_bind_methods() (e:\Repositories\godot\editor\dependency_editor.cpp:660)
DependencyRemoveDialog::initialize_class() (e:\Repositories\godot\editor\dependency_editor.h:97)
DependencyRemoveDialog::_initialize_classv() (e:\Repositories\godot\editor\dependency_editor.h:97)
Object::_postinitialize() (e:\Repositories\godot\core\object\object.cpp:210)
postinitialize_handler(Object*) (e:\Repositories\godot\core\object\object.cpp:1896)
DependencyRemoveDialog* _post_initialize<DependencyRemoveDialog>(DependencyRemoveDialog*) (e:\Repositories\godot\core\os\memory.h:90)
EditorFileDialog::EditorFileDialog() (e:\Repositories\godot\editor\gui\editor_file_dialog.cpp:1941)
ProjectExportDialog::ProjectExportDialog() (e:\Repositories\godot\editor\export\project_export.cpp:1351)
EditorNode::EditorNode() (e:\Repositories\godot\editor\editor_node.cpp:7393)
Main::start() (e:\Repositories\godot\main\main.cpp:3103)
widechar_main(int, wchar_t**) (e:\Repositories\godot\platform\windows\godot_windows.cpp:179)
_main() (e:\Repositories\godot\platform\windows\godot_windows.cpp:204)
main (e:\Repositories\godot\platform\windows\godot_windows.cpp:223)
__tmainCRTStartup (@__tmainCRTStartup:108)
WinMainCRTStartup (@.l_startw:6)
BaseThreadInitThunk (@BaseThreadInitThunk:9)
RtlUserThreadStart (@RtlUserThreadStart:12)

And the call stack of the EditorHelp loading thread, deadlocked waiting for a read lock:

NtWaitForAlertByThreadId (@NtWaitForAlertByThreadId:8)
RtlSleepConditionVariableSRW (@RtlSleepConditionVariableSRW:86)
SleepConditionVariableSRW (@SleepConditionVariableSRW:14)
std::__1::__libcpp_condvar_wait(void**, void**) (@std::__1::__libcpp_condvar_wait(void**, void**):7)
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (@std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&):8)
std::__1::shared_timed_mutex::lock_shared() (@std::__1::shared_timed_mutex::lock_shared():20)
RWLock::read_lock() const (e:\Repositories\godot\core\os\rw_lock.h:44)
RWLockRead::RWLockRead(RWLock const&) (e:\Repositories\godot\core\os\rw_lock.h:79)
ClassDB::get_class_list(List<StringName, DefaultAllocator>*) (e:\Repositories\godot\core\object\class_db.cpp:94)
ClassDB::get_api_hash(ClassDB::APIType) (e:\Repositories\godot\core\object\class_db.cpp:177)
_compute_doc_version_hash() (e:\Repositories\godot\editor\editor_help.cpp:2320)
EditorHelp::_load_doc_thread(void*) (e:\Repositories\godot\editor\editor_help.cpp:2326)
Thread::callback(unsigned long long, Thread::Settings const&, void (*)(void*), void*) (e:\Repositories\godot\core\os\thread.cpp:61)
decltype(std::declval<void (*)(unsigned long long, Thread::Settings const&, void (*)(void*), void*)>()(std::declval<unsigned long long>(), std::declval<Thread::Settings>(), std::declval<void (*)(void*)>(), std::declval<void*>())) std::__1::__invoke[abi:v160005]<void (*)(unsigned long long, Thread::Settings const&, void (*)(void*), void*), unsigned long long, Thread::Settings, void (*)(void*), void*>(void (*&&)(unsigned long long, Thread::Settings const&, void (*)(void*), void*), unsigned long long&&, Thread::Settings&&, void (*&&)(void*), void*&&) (c:\llvm-mingw-20230603-ucrt-x86_64\include\c++\v1\__functional\invoke.h:394)
void std::__1::__thread_execute[abi:v160005]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (*)(unsigned long long, Thread::Settings const&, void (*)(void*), void*), unsigned long long, Thread::Settings, void (*)(void*), void*, 2ull, 3ull, 4ull, 5ull>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (*)(unsigned long long, Thread::Settings const&, void (*)(void*), void*), unsigned long long, Thread::Settings, void (*)(void*), void*>&, std::__1::__tuple_indices<2ull, 3ull, 4ull, 5ull>) (c:\llvm-mingw-20230603-ucrt-x86_64\include\c++\v1\thread:282)
void* std::__1::__thread_proxy[abi:v160005]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (*)(unsigned long long, Thread::Settings const&, void (*)(void*), void*), unsigned long long, Thread::Settings, void (*)(void*), void*>>(void*) (c:\llvm-mingw-20230603-ucrt-x86_64\include\c++\v1\thread:293)
_configthreadlocale (@_configthreadlocale:58)
BaseThreadInitThunk (@BaseThreadInitThunk:9)
RtlUserThreadStart (@RtlUserThreadStart:12)
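
One pattern that would produce this pair of stacks, offered as a hypothesis rather than a confirmed diagnosis: the doc thread already holds a read lock (for example, if ClassDB::get_api_hash() takes one before calling ClassDB::get_class_list(), which the traces alone do not confirm), the main thread then queues for the write lock, and the doc thread's nested read lock is refused because a writer is waiting. The sketch below reproduces that sequence with a hypothetical writer-preferring lock (WriterPreferringLock and run_nested_read_demo are illustrative names, not Godot code). It uses a timed nested read so the demo terminates instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical single-writer, writer-preferring read/write lock.
// It is NOT Godot's RWLock; it only mimics the policy "a queued writer blocks new readers".
class WriterPreferringLock {
    std::mutex m;
    std::condition_variable cv;
    int readers = 0;
    bool writer_waiting = false;
    bool writer_active = false;

public:
    // Timed read lock: returns false if a writer is queued or active for the whole timeout.
    bool try_read_lock_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(m);
        bool ok = cv.wait_for(lk, timeout, [this] { return !writer_waiting && !writer_active; });
        if (ok) ++readers;
        return ok;
    }
    void read_unlock() {
        std::lock_guard<std::mutex> lk(m);
        assert(readers > 0);
        --readers;
        cv.notify_all();
    }
    // Blocking write lock: announces itself first, then waits for readers to drain.
    void write_lock() {
        std::unique_lock<std::mutex> lk(m);
        writer_waiting = true; // from now on, new readers are refused
        cv.notify_all();       // wake wait_for_queued_writer()
        cv.wait(lk, [this] { return readers == 0 && !writer_active; });
        writer_waiting = false;
        writer_active = true;
    }
    void write_unlock() {
        std::lock_guard<std::mutex> lk(m);
        writer_active = false;
        cv.notify_all();
    }
    // Demo helper: block until a writer has queued up.
    void wait_for_queued_writer() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return writer_waiting; });
    }
};

struct Outcome {
    bool nested_read_acquired;
    bool writer_finished;
};

Outcome run_nested_read_demo() {
    using namespace std::chrono_literals;
    WriterPreferringLock lock;
    bool writer_finished = false;

    bool outer = lock.try_read_lock_for(1000ms); // outer read lock (the "doc thread")
    assert(outer);
    (void)outer;

    std::thread writer([&] { // the "main thread" registering something
        lock.write_lock();
        writer_finished = true;
        lock.write_unlock();
    });
    lock.wait_for_queued_writer();

    // Nested read while the writer is queued: refused for as long as the outer read is held.
    bool nested = lock.try_read_lock_for(50ms);
    if (nested) lock.read_unlock();

    lock.read_unlock(); // a blocking nested read would never get this far
    writer.join();
    return {nested, writer_finished};
}
```

With blocking locks in place of the timed attempt, the nested read would wait for the writer and the writer would wait for the outer read, which is the shape of the two stacks above.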

Steps to reproduce

Load any project and the loading can hang forever. However, this happens very rarely.

Minimal reproduction project

Any project will probably do. However, since this might be related to documentation generation, a bigger project with more classes etc. could affect how often this triggers.

@YuriSizov
Contributor

YuriSizov commented Oct 20, 2023

So I know for a fact that the order of EditorNode initialization and documentation generation is not deterministic (see #80377). This means that sometimes docs are generated before, and sometimes after, EditorNode has finished setting up itself and everything in it.

From your logs, this seems to be the case here as well: we are attempting to generate documentation before the editor is done setting itself up. It seems that while this happens, some classes can still be registered with ClassDB, which causes the conflict.

Now the question is: is this preventable? I suppose if classes can be initialized as late as when they are requested by another object's constructor, then ClassDB never has stable information in it? Exposed classes are registered ahead of time, but unexposed classes can still be initialized afterwards.

Edit:

So, classes are initialized (initialize_class()) in two situations: when you register them, and when they are allocated for the first time. I guess for the purposes of the documentation we don't care about the latter; we only need registered classes. We just need to make sure docs can do what they need to do without interrupting anything else.
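
One way to read "without interrupting anything else" is for the doc thread to copy what it needs under a single short read lock and then do the slow work on the copy. A minimal sketch under that assumption, using std::shared_timed_mutex as seen in the traces; ClassRegistry, snapshot() and hash_snapshot() are hypothetical names, not ClassDB's API:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a class registry guarded by a read/write lock.
class ClassRegistry {
    mutable std::shared_timed_mutex lock;
    std::vector<std::string> names;

public:
    void register_class(const std::string &name) {
        std::unique_lock<std::shared_timed_mutex> write_guard(lock);
        names.push_back(name);
    }
    // Takes the read lock exactly once and releases it before returning the copy.
    std::vector<std::string> snapshot() const {
        std::shared_lock<std::shared_timed_mutex> read_guard(lock);
        return names;
    }
};

// Works on the copy only, so no lock is held (or re-taken) while hashing.
std::size_t hash_snapshot(const std::vector<std::string> &names) {
    std::size_t h = 0;
    for (const std::string &n : names) {
        h = h * 31 + std::hash<std::string>{}(n);
    }
    return h;
}
```

The trade-off is that the hash describes the registry as it was at snapshot time; classes registered afterwards are not included.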

Since deadlocks like this don't normally happen when accessing ClassDB, there is probably something particular about this situation. For example, an obvious hint is that we call EditorHelp::generate_doc(); in the EditorNode constructor, just before we start adding nodes to the tree (which causes allocations, which causes the deadlock). So this could probably be adjusted in some way.
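
One possible adjustment, sketched here with standard C++ only: keep the doc thread behind a gate that the main thread opens once its setup is finished, so the two never touch the registry at the same time. run_startup_with_gate and the event strings are hypothetical stand-ins, not Godot code.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Returns the order in which the two phases ran.
std::vector<std::string> run_startup_with_gate() {
    std::vector<std::string> events;
    std::promise<void> setup_done;
    std::future<void> gate = setup_done.get_future();
    assert(gate.valid());

    // Doc thread: started early, but parked until the gate opens.
    std::thread docs([&events, gate = std::move(gate)] {
        gate.wait();
        events.push_back("docs: read class list");
    });

    // Main thread: setup that may still initialize classes.
    events.push_back("main: construct editor nodes");

    // Opening the gate synchronizes with gate.wait(), so the push_back above
    // is visible to the doc thread and the two writes to `events` never overlap.
    setup_done.set_value();
    docs.join();
    return events;
}
```

This makes the ordering deterministic at the cost of delaying documentation work until setup completes.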

Not sure how to test this though.
