Improve threading in ClassDB and EditorHelp #83695
Merged
Maybe. Related to #83677.
After talking with @RandomShaper and staring at the code for a few hours, I'm not at all confident about what's going on. There is clearly an issue with `ClassDB` locking up, but why exactly that happens is hard to deduce because the issue is not easily reproducible (I've never had this kind of deadlock myself).

Analyzing the code doesn't reveal an obvious problem, but Pedro suggested that one of the issues may be in `ClassDB::get_api_hash`. This method holds a read lock, but also calls `ClassDB::get_class_list`, which attempts to take a read lock as well. This may be undefined behavior. There is also another problem with this method: it sometimes writes data rather than just reading it. So a read lock is not sufficient; we need an exclusive one.

At first I tried to juggle read and write locks by selectively locking and unlocking the mutex within `ClassDB::get_api_hash`, but that created new deadlocks under some circumstances. So I approached it another way: I extracted the relevant bits of `ClassDB::get_class_list` back into `ClassDB::get_api_hash` (the method is trivial, it's just one loop) and added a write lock for the entire duration of it.

I also noticed a couple of methods that didn't have (correct) locks, but I think should have. `ClassDB::_bind_method_custom` is used by GDExtension and doesn't have a write lock. `ClassDB::set_method_error_return_values` mutates the state but has a read lock, and its getter counterpart has no lock at all. I addressed those issues as well.

At the same time, I was looking at the docs generator for a potential issue. I tried to move the generation to be a bit later so that, hopefully, classes initialized by `EditorNode` would not get in the way. This isn't a proper solution, because more classes can be initialized at any moment afterwards (and indeed they are), but it doesn't seem to hurt either. I only had to adjust `CreateDialog` to not rely on the documentation always being available (which is a good thing to do anyway).

I also changed `EditorHelp::_compute_doc_version_hash()` to only be called once, before we attempt to generate anything. This means it no longer runs in a thread, but it also means it only runs once, which should be great, since it's expensive. In fact, based on the internal benchmark I could notice a 0.5-1.0 second reduction in startup times, though YMMV. I also organized related variables and methods a bit in `EditorHelp`.

Now, whether this helps with #83677 or not, I cannot claim. I think this is better, since the locks in `ClassDB` appear to be a pretty clear inconsistency/logical issue. But I can't measure how much better (or indeed worse) this is in actuality. If this is deemed safe, it should be okay for 4.2, but this is too far outside my comfort zone for me to recommend it.