Make RID_Owner lock-free for fetching. #86333

Closed
wants to merge 1 commit

Conversation

@reduz (Member) commented Dec 19, 2023

This PR makes RID_Owner lock-free for fetching values, which should give a very significant performance boost where used; a rough sketch of the fetch path follows the list of considerations below.

Some considerations:

  • A maximum number of elements to allocate must be given (200k by default).
  • The validators and the data are merged into a single struct, to make access more cache-friendly.
  • A mutex is used for thread safety on addition/removal, since spinlocks are discouraged nowadays.
  • Access to the RID structure is still safe, given that fetches are independent of additions/removals.
  • RID access was never really thread-safe in the sense that the contents of the data are not protected anyway; each server needs to implement locking as it sees fit.
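A minimal sketch of that fetch path, assuming a chunk-pointer table sized up-front for the maximum element count; LockFreeOwnerSketch, ELEMENTS_IN_CHUNK and get_or_null are illustrative names, not the PR's actual code:

// Minimal sketch of the fetch path, not the PR's actual code.
// Assumes a chunk-pointer table sized up-front for max_elements, so no
// memrealloc() can ever move data a reader might be touching.
#include <cstdint>
#include <mutex>

template <typename T>
class LockFreeOwnerSketch {
	struct Chunk {
		uint32_t validator; // 0 means "free slot"
		T data;             // validator and data merged for cache friendliness
	};

	static const uint32_t ELEMENTS_IN_CHUNK = 1024; // illustrative value
	Chunk **chunks = nullptr; // fixed-size table of chunk pointers
	uint32_t max_elements = 0;
	std::mutex alloc_mutex; // guards additions/removals only, never fetches

public:
	// Lock-free fetch: the chunk table never moves, so reading an
	// already-allocated chunk is safe while other threads allocate.
	T *get_or_null(uint64_t rid_id) {
		uint32_t index = uint32_t(rid_id & 0xFFFFFFFF);
		uint32_t validator = uint32_t(rid_id >> 32);
		if (index >= max_elements) {
			return nullptr;
		}
		Chunk &c = chunks[index / ELEMENTS_IN_CHUNK][index % ELEMENTS_IN_CHUNK];
		if (c.validator != validator) {
			return nullptr; // freed or stale RID
		}
		return &c.data;
	}
};

Because the chunk table is sized once and chunks never move, a reader only dereferences memory that was fully set up before the RID was handed out; the mutex only serializes the writers.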

@reduz reduz requested a review from a team as a code owner December 19, 2023 16:37
@Calinou Calinou added this to the 4.3 milestone Dec 19, 2023
@reduz reduz force-pushed the lock-free-rid branch 2 times, most recently from 6b35284 to 89db4ba on December 19, 2023 16:51

@RandomShaper (Member) left a comment

If I understand correctly, the thread-safe version has to pre-allocate because otherwise memrealloc() can change the base pointer of the data. I'm wondering if that could be solved by using a PagedAllocator (UPDATE: or PagedArray, which may be what I mean). The downside is probably that element access needs a bit of extra arithmetic and, more prominently, an indirection to reach the item on the right page. Can you confirm?
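A small sketch of the trade-off in question, assuming a paged layout similar in spirit to PagedAllocator/PagedArray; PagedLookupSketch and PAGE_SIZE are made-up names, not Godot's API:

// Paged storage keeps element addresses stable without pre-allocating
// everything, at the cost of one extra indirection per access.
#include <cstdint>

template <typename T, uint32_t PAGE_SIZE = 4096> // PAGE_SIZE is an assumption
struct PagedLookupSketch {
	T **pages = nullptr; // each page is allocated once and never moves

	// Flat array:  base[index]                        -> one load
	// Paged array: pages[index / PAGE][index % PAGE]  -> two loads + arithmetic
	T &get(uint32_t index) {
		return pages[index / PAGE_SIZE][index % PAGE_SIZE];
	}
};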

		}

		if (alloc_count == max_alloc) {
			//allocate a new chunk
			uint32_t chunk_count = alloc_count == 0 ? 0 : (max_alloc / elements_in_chunk);
			if (THREAD_SAFE && chunk_count == chunk_limit) {

Suggested change
-			if (THREAD_SAFE && chunk_count == chunk_limit) {
+			if (unlikely(THREAD_SAFE && chunk_count == chunk_limit)) {

@RandomShaper (Member) commented Dec 21, 2023

As an aside, given there's no need for thread safety on the data itself (it requires external sync anyway), I'm wondering if we could go fully lock-free, this way (sketched below):

  • If thread-safe, the validator becomes an std::atomic<uint32_t>, with load-acquire and store-release used on it.
  • If not thread-safe, it stays a raw uint32_t.

PS: For the record, there's an idea I don't want to forget, so I'm writing it here: in case we still want locking, the SpinLock may still be advantageous, as its sync variable could be used even for non-locking operations, e.g. to acquire-load on them. To avoid some of its downsides, we'd want to polish #85167. But, again, this is not relevant at the current stage of the discussion.
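A minimal sketch of that atomic-validator idea, illustrative only and not code from the PR; SlotSketch, publish() and is_valid() are assumed names:

#include <atomic>
#include <cstdint>

struct SlotSketch {
	std::atomic<uint32_t> validator{0}; // 0 = free slot
	// ... element data lives right next to the validator ...
};

// Writer (still under the allocation mutex): write the data first, then publish.
inline void publish(SlotSketch &slot, uint32_t new_validator) {
	// All data writes made before this store become visible to any reader
	// whose acquire-load observes new_validator.
	slot.validator.store(new_validator, std::memory_order_release);
}

// Reader (lock-free fetch): the acquire pairs with the writer's release.
inline bool is_valid(const SlotSketch &slot, uint32_t expected_validator) {
	return slot.validator.load(std::memory_order_acquire) == expected_validator;
}

The release store pairs with the acquire load, so a reader that sees the new validator value is also guaranteed to see the data written before it was published.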

@reduz (Member, Author) commented Dec 21, 2023

@RandomShaper I am wondering if making the validator an atomic really changes anything; my fear is that load-acquire still forces the other CPUs to finalize other memory writes.

@RandomShaper (Member)

Acquire-release atomics would at least be superior to either Mutex or SpinLock. The problem is thus with the functions that this PR makes lock-less even in the thread-safe version. If we are fine with readers potentially seeing old data, those reads could be relaxed, but they have to be atomic in any case (on most or all relevant architectures that will boil down to plain reads, but the code would be more future-proof that way).
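A sketch of that relaxed variant, under the assumption that stale validator values are acceptable to readers; is_valid_relaxed is an illustrative name, not the PR's code:

// On most architectures this compiles to a plain load, but unlike a raw
// uint32_t read it is not a data race in the C++ memory model.
#include <atomic>
#include <cstdint>

inline bool is_valid_relaxed(const std::atomic<uint32_t> &validator, uint32_t expected) {
	return validator.load(std::memory_order_relaxed) == expected;
}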

@reduz (Member, Author) commented Dec 21, 2023

@RandomShaper I think in theory it should be fine if they see old data in this case: the data will not be accessible anyway after the mutex releases the lock, and it will be updated on removal, again after the mutex releases the lock. In the meantime, nothing should touch the data, so I think it should be OK. The data itself is not thread-safe anyway, even now, so it's not related to the actual RID access per se.

@Calinou (Member) left a comment

Tested locally (rebased on top of master 10e1114 + #86333), it works as expected on Linux + NVIDIA.

On https://github.com/godotengine/tps-demo, it loads a full second faster than on master with neither PR applied.

		}
	}

	void set_description(const char *p_descrption) {
		description = p_descrption;
	}

-	RID_Alloc(uint32_t p_target_chunk_byte_size = 65536) {
+	RID_Alloc(uint32_t p_target_chunk_byte_size = 65536, uint32_t p_maximum_amount_of_elements = 200000) {

Out of curiosity, shouldn't the default value for p_maximum_amount_of_elements be a multiple of 65536?

@reduz (Member, Author) replied

Yeah, that's a good point.
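For illustration, this is the kind of rounding involved; the element size below is an assumption, and only the two defaults (65536 and 200000) come from the diff above:

#include <cstdint>
#include <cstdio>

int main() {
	const uint32_t chunk_byte_size = 65536; // default p_target_chunk_byte_size
	const uint32_t element_size = 64;       // hypothetical sizeof(validator + data)
	const uint32_t max_elements = 200000;   // default p_maximum_amount_of_elements

	uint32_t elements_in_chunk = chunk_byte_size / element_size; // 1024 here
	// Rounding up means the last chunk is only partially used whenever
	// max_elements is not a multiple of elements_in_chunk.
	uint32_t chunk_limit = (max_elements + elements_in_chunk - 1) / elements_in_chunk;

	printf("elements per chunk: %u, chunk limit: %u, capacity: %u\n",
			elements_in_chunk, chunk_limit, chunk_limit * elements_in_chunk);
	return 0;
}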

@akien-mga akien-mga changed the title Make RID_Owner lock-free for fetching. Make RID_Owner lock-free for fetching. Feb 3, 2024
@clayjohn (Member) commented Feb 16, 2024

For context, we need to merge this before merging #90400, but we are in no rush to merge #90400 for now, so no pressure

@akien-mga (Member)

Superseded by #97465.

@akien-mga akien-mga closed this Sep 25, 2024
@akien-mga akien-mga removed this from the 4.4 milestone Sep 25, 2024