Flush message queue by double-buffer flipping #89464

Open
wants to merge 1 commit into base: master
Conversation

@SlugFiller (Contributor) commented Mar 13, 2024

Uses a double buffer for message queues, so that calling call_deferred during a message queue flush adds to a second buffer. After the current buffer finishes processing, the queue flips the buffers, deallocates the processed buffer, and continues flushing the second buffer. This repeats until both buffers are empty.

This reduces lock contention during a message queue flush, and ensures queue memory usage does not grow unbounded when deferred calls add to the queue. It does not, however, allow rendering a frame or handling events between the processing of each buffer; that only happens once both buffers are clear.
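For illustration, here is a minimal sketch of the flipping scheme described above; the class, the names, and the use of std::function are stand-ins for this explanation, not the PR's actual CallQueue code:

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Illustrative stand-in for a queued deferred call.
using Message = std::function<void()>;

class DoubleBufferedQueue {
public:
	// May be called at any time, including from inside a message being flushed.
	void push(Message m) {
		std::lock_guard<std::mutex> lock(mutex);
		buffers[write_index].push_back(std::move(m));
	}

	// Assumed to be called from a single flushing thread.
	void flush() {
		std::unique_lock<std::mutex> lock(mutex);
		while (!buffers[write_index].empty()) {
			const int read_index = write_index;
			write_index = 1 - write_index; // Flip: new pushes go to the other buffer.
			lock.unlock();

			for (Message &m : buffers[read_index]) {
				m(); // May push() more messages; they land in the flipped buffer.
			}
			buffers[read_index].clear();
			buffers[read_index].shrink_to_fit(); // Release the processed buffer's memory.

			lock.lock();
		}
		// Returns only once both buffers are empty, matching the PR description.
	}

private:
	std::mutex mutex;
	std::vector<Message> buffers[2];
	int write_index = 0;
};
```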

@SlugFiller requested a review from a team as a code owner March 13, 2024 23:51
@KoBeWi added this to the 4.x milestone Mar 14, 2024
@AThousandShips (Member) commented Mar 14, 2024

The issue is this:

  • This will mask invalid behaviour and fail silently (i.e. lock up) instead of, in my opinion, correctly erroring.
  • The worst part, to me, is that people generally assume a deferred call fires on the next frame. This change keeps that behaviour just as bad, and will cause the same confusion and frustration the original crash did, but now with no error telling them something is wrong, just a game that freezes forever.
  • This is, in my opinion, unlikely to produce any memory savings and will just add memory usage; the vast majority of deferred calls don't make further deferred calls, so the double buffer won't matter.

So I'd say this just adds complexity and memory use for a feature that isn't needed, and makes the code less safe. But those are just my two cents.

Have you hit a situation where the message queue runs out of memory in normal, non-destructive use? As I recall from testing my own attempts to fix this with infinite self-calls, it takes quite a while: several seconds of freezing before the out-of-memory message fires. So I don't see any normal use of the engine ever exhausting the queue space. Have you had cases where this was actually a limitation for your usage?

Edit: Removed the reference to the bug report, as this does not fix it. That bug was due to a misunderstanding of how the deferred mechanism works: the OP thought (as many do, including me originally) that the call fires on the next frame. That was addressed by clarifying in the documentation that an unconditional deferred call will loop infinitely. This PR doesn't make the OP's use case work; it just changes it from a crash to a freeze.

Edit 2: I'm sorry if I'm coming across as, or being, too contrarian here, but this is a very sensitive and critical part of the engine, used everywhere and essential to correct functioning and performance, so it needs to be lean and robust. If there's a legitimate pain point that this solves, I'm all in favour, but I'm unconvinced this is a real limitation for users. I don't see the message queue being exhausted without stuttering that is already beyond reasonable, or that reasonable real-world use would benefit from this.

@SlugFiller (Contributor, Author) commented
Out of memory crashes do not produce an error. They just make the editor/game mysteriously "go away", which should never happen. Message queues are generally expected to act as a trampoline, specifically for working around the limitations of regular recursion (stack overflow). Having bounded memory usage is imperative.

Even if the target machine has plenty of memory: since each page is 4 KiB, allocating 8 KiB for two pages is still better than the megabyte that a few hundred cycles with a few parameters each (stored internally as Variants) could add up to.
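As a rough back-of-the-envelope illustration of that comparison (every number below is an assumption chosen for illustration, not measured from the engine):

```cpp
#include <cstddef>

// Assumed, illustrative sizes; the real message layout differs.
constexpr size_t variant_size = 24;     // Approximate size of a Variant on a 64-bit build.
constexpr size_t message_overhead = 64; // Assumed per-message header, Callable, padding.
constexpr size_t params_per_call = 4;   // "a few parameters"
constexpr size_t calls_per_cycle = 20;  // Assumed deferred calls queued each cycle.
constexpr size_t cycles = 500;          // "a few hundred cycles"

constexpr size_t bytes_per_call = message_overhead + params_per_call * variant_size; // 160
constexpr size_t total = cycles * calls_per_cycle * bytes_per_call;
// total == 500 * 20 * 160 == 1,600,000 bytes, on the order of a megabyte,
// versus the two resident 4 KiB pages (8 KiB) the double buffer keeps.
```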

P.S. I want to put on the record something I mentioned in chat. A bigger issue with the message queue currently is that it uses LocalVector to store an array of pointers to pages. The pages can be allocated by a custom allocator, but the LocalVector, naturally, cannot, and uses its own allocation. In fact, it completely defeats the purpose of having paged allocation, since it needs to reallocate as it grows, and can easily cause memory fragmentation. If the purpose is to be lean and robust, this is definitely going in the opposite direction.

The proper way to handle it would be to have each page start with a pointer to the next page, and to hold only the first and last page pointers in the CallQueue class. The queue has no need for random access, so this structure makes more sense. However, I felt a change like that would be out of scope for this specific PR.
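A minimal sketch of that layout, with hypothetical type and field names rather than the existing CallQueue code:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical page size matching the 4 KiB pages mentioned above.
constexpr size_t PAGE_SIZE = 4096;

// Each page begins with a pointer to the next page; the rest holds message data.
struct QueuePage {
	QueuePage *next = nullptr;
	size_t used = 0;
	uint8_t data[PAGE_SIZE - sizeof(QueuePage *) - sizeof(size_t)];
};

// Sketch of a queue that only tracks the first and last page. Pages are walked
// front to back during a flush, so no random access (and no LocalVector of
// page pointers) is needed. Freeing consumed pages is omitted for brevity.
class PagedQueue {
public:
	// Assumes p_size fits within a single page.
	void push_bytes(const uint8_t *p_data, size_t p_size) {
		if (!tail || tail->used + p_size > sizeof(tail->data)) {
			QueuePage *page = new QueuePage; // Would come from the paged allocator.
			if (tail) {
				tail->next = page;
			} else {
				head = page;
			}
			tail = page;
		}
		memcpy(tail->data + tail->used, p_data, p_size);
		tail->used += p_size;
	}

private:
	QueuePage *head = nullptr;
	QueuePage *tail = nullptr;
};
```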

@AThousandShips (Member) commented
> Out of memory crashes do not produce an error.

They do, right here:

```cpp
fprintf(stderr, "Failed method: %s. Message queue out of memory. %s\n", String(p_callable).utf8().get_data(), error_text.utf8().get_data());
```
