Make RID_Owner lock-free for fetching. #86333
This PR makes RID_Owner lock-free for fetching values, which should give a very significant performance boost where used. Some considerations:
* A maximum number of elements to allocate must be given (200k by default).
* Access to the RID structure is still safe, given that fetches are independent from additions/removals.
* RID access was never really thread-safe, in the sense that the contents of the data are not protected anyway. Each server needs to implement locking as it sees fit.
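To make the first consideration concrete, here is a minimal sketch of why a fixed maximum helps. It is not the PR's code: `FixedChunkPool`, `ELEMENTS_IN_CHUNK`, and `INVALID_INDEX` are hypothetical names, and the chunk size is tiny on purpose. The assumption is that the table of chunk pointers is sized once up front, so no reallocation can ever move it under a reader; writers still take a mutex, readers do not.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical sketch; none of these names are Godot's RID_Alloc API.
constexpr uint32_t ELEMENTS_IN_CHUNK = 4; // tiny, for illustration only
constexpr uint32_t INVALID_INDEX = UINT32_MAX;

template <typename T>
class FixedChunkPool {
	const uint32_t chunk_limit;
	// Sized once in the constructor and never resized, so readers can
	// index it while a writer is adding chunks.
	std::vector<std::atomic<T *>> chunks;
	std::atomic<uint32_t> count{ 0 };
	std::mutex write_mutex; // serializes writers only

public:
	explicit FixedChunkPool(uint32_t p_max_elements) :
			chunk_limit((p_max_elements + ELEMENTS_IN_CHUNK - 1) / ELEMENTS_IN_CHUNK),
			chunks(chunk_limit) {}

	~FixedChunkPool() {
		for (auto &chunk : chunks) {
			delete[] chunk.load();
		}
	}

	// Appends a value; returns INVALID_INDEX once the fixed capacity is used up.
	uint32_t add(const T &p_value) {
		std::lock_guard<std::mutex> lock(write_mutex);
		const uint32_t index = count.load(std::memory_order_relaxed);
		const uint32_t chunk = index / ELEMENTS_IN_CHUNK;
		if (chunk == chunk_limit) {
			return INVALID_INDEX;
		}
		if (index % ELEMENTS_IN_CHUNK == 0) {
			// First element of a new chunk: allocate it.
			chunks[chunk].store(new T[ELEMENTS_IN_CHUNK](), std::memory_order_relaxed);
		}
		chunks[chunk].load(std::memory_order_relaxed)[index % ELEMENTS_IN_CHUNK] = p_value;
		// Release: everything written above is visible to a reader that
		// observes the new count with an acquire load.
		count.store(index + 1, std::memory_order_release);
		return index;
	}

	// Lock-free fetch: no mutex, only an acquire load of the element count.
	T *get(uint32_t p_index) {
		if (p_index >= count.load(std::memory_order_acquire)) {
			return nullptr;
		}
		T *chunk = chunks[p_index / ELEMENTS_IN_CHUNK].load(std::memory_order_relaxed);
		return &chunk[p_index % ELEMENTS_IN_CHUNK];
	}
};
```

As in the third consideration, the returned pointer only guarantees that the slot exists; protecting the element's contents is left to the caller.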
If I understand correctly, the thread-safe version has to pre-allocate because otherwise memrealloc() can change the base pointer of the data. I'm wondering if that could be solved by using a PagedAllocator (UPDATE: or PagedArray, maybe that's what I mean). Maybe the downside is that element access needs a bit of extra arithmetic and, more prominently, an indirection to reach the item on the right page. Can you confirm?
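The "extra arithmetic and indirection" mentioned above can be sketched as follows. This is a generic illustration of paged indexing, not Godot's PagedArray implementation: `PAGE_SHIFT`, `PageSlot`, `locate`, and `paged_at` are hypothetical names, and a power-of-two page size is assumed so that the division and modulo reduce to a shift and a mask.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size: 4096 elements (2^12). Purely illustrative.
constexpr uint32_t PAGE_SHIFT = 12;
constexpr uint32_t PAGE_MASK = (1u << PAGE_SHIFT) - 1;

struct PageSlot {
	uint32_t page; // which page holds the element
	uint32_t offset; // position inside that page
};

// The extra arithmetic: split a flat index into page and offset.
constexpr PageSlot locate(uint32_t p_index) {
	return PageSlot{ p_index >> PAGE_SHIFT, p_index & PAGE_MASK };
}

// The extra indirection: one load for the page pointer, then the element.
template <typename T>
T &paged_at(T *const *p_pages, uint32_t p_index) {
	const PageSlot slot = locate(p_index);
	return p_pages[slot.page][slot.offset];
}
```

Compared with a single contiguous buffer, each access pays one additional dependent load (the page pointer), which is the cost the question is asking about.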
		}

		if (alloc_count == max_alloc) {
			//allocate a new chunk
			uint32_t chunk_count = alloc_count == 0 ? 0 : (max_alloc / elements_in_chunk);
			if (THREAD_SAFE && chunk_count == chunk_limit) {
- if (THREAD_SAFE && chunk_count == chunk_limit) {
+ if (unlikely(THREAD_SAFE && chunk_count == chunk_limit)) {
Aside, given there's no need for thread safety on the data itself (requires external sync anyway), I'm wondering if we could go fully lock-free, this way:
PS: For the record, there's an idea I don't want to forget, so I'm writing it here: in case we still want locking, the
@RandomShaper I am wondering if making the validator an atomic really changes anything; my fear is that load-acquire still forces the other CPUs to finalize other memory writes.
Acquire-release atomics would at least be superior to either
@RandomShaper I think in theory it should be fine if they see old data in this case: the data will not be accessible anyway until the mutex is released, and it will be updated on removal, also after the mutex is released. In the meantime, nothing should interfere with the data access, so I think it should be OK. The data itself is not thread-safe anyway, even now, so it's not related to the actual RID access per se.
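For readers unfamiliar with the terms in this exchange, the following sketch shows what an acquire-release validator check could look like. It only illustrates the memory-ordering idea being debated and is not the code proposed in this PR: `Slot`, `publish`, `invalidate`, `fetch`, and `FREED` are hypothetical names, and the payload is a plain `int` for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical marker for a slot that holds no live element.
constexpr uint32_t FREED = 0xFFFFFFFFu;

struct Slot {
	int payload = 0; // the data itself is NOT protected here
	std::atomic<uint32_t> validator{ FREED };
};

// Writer side (assumed to run under the owner's mutex): fill the payload
// first, then publish the validator with release ordering.
inline void publish(Slot &r_slot, int p_payload, uint32_t p_validator) {
	r_slot.payload = p_payload;
	r_slot.validator.store(p_validator, std::memory_order_release);
}

inline void invalidate(Slot &r_slot) {
	r_slot.validator.store(FREED, std::memory_order_release);
}

// Reader side, no lock: an acquire load pairs with the release store above,
// so a reader that sees the expected validator also sees the payload that
// was written before it.
inline int *fetch(Slot &r_slot, uint32_t p_expected) {
	if (r_slot.validator.load(std::memory_order_acquire) != p_expected) {
		return nullptr;
	}
	return &r_slot.payload;
}
```

A reader that loads a stale validator simply fails the comparison and gets `nullptr`, which is the "seeing old data is fine" case described above. Whether the acquire load is cheap enough on a given CPU is exactly the open question in the thread.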
Tested locally (rebased on top of master 10e1114 + #86333), it works as expected on Linux + NVIDIA.
On https://github.com/godotengine/tps-demo, it loads a full second faster than on master with neither PR applied.
		}
	}

	void set_description(const char *p_descrption) {
		description = p_descrption;
	}

- 	RID_Alloc(uint32_t p_target_chunk_byte_size = 65536) {
+ 	RID_Alloc(uint32_t p_target_chunk_byte_size = 65536, uint32_t p_maximum_amount_of_elements = 200000) {
Out of curiosity, shouldn't the default value for p_maximum_amount_of_elements be a multiple of 65536?
Yeah, that's a good point.
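The arithmetic behind this question can be sketched as below. The helper names are hypothetical, and the assumption (not confirmed in this thread) is that the element limit is rounded up to whole chunks, with the number of elements per chunk derived from the target chunk byte size and the element size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many elements fit in one chunk of the target size.
constexpr uint32_t elements_per_chunk(uint32_t p_chunk_bytes, uint32_t p_element_bytes) {
	return p_element_bytes > p_chunk_bytes ? 1 : p_chunk_bytes / p_element_bytes;
}

// Hypothetical helper: chunks needed to hold at least p_max_elements.
constexpr uint32_t chunks_needed(uint32_t p_max_elements, uint32_t p_per_chunk) {
	return (p_max_elements + p_per_chunk - 1) / p_per_chunk;
}

// Capacity actually reachable once the limit is rounded up to whole chunks.
constexpr uint32_t effective_capacity(uint32_t p_max_elements, uint32_t p_per_chunk) {
	return chunks_needed(p_max_elements, p_per_chunk) * p_per_chunk;
}
```

Under these assumptions, a 64-byte element gives 1024 elements per 65536-byte chunk, so a limit of 200000 rounds up to 196 chunks (200704 elements). Note that 65536 is a byte size, so whether a given element count fills chunks exactly depends on the element size as well.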
Superseded by #97465.