Improve field count typical case performance #179

Closed

Conversation

rressi-at-globus
Description

The tightest upper bound one can specify on the number of fields in a struct is `sizeof(type) * CHAR_BIT`, so this was previously used when performing a binary search for the field count. This upper bound is extremely loose for a typical large struct, which is more likely to contain a relatively small number of relatively large fields than the other way around. A binary search range multiple orders of magnitude larger than necessary wouldn't be a significant issue if each test were cheap, but it isn't: testing a field count of N costs O(N) memory and time. As a result, the initial few steps of the binary search may be prohibitively expensive.
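
To make that cost concrete, here is a minimal sketch of how a field count of N is typically probed in this kind of library. The names `any_type`, `probe`, and `fields_at_least` are illustrative, not the PR's actual identifiers; the point is that each probe must materialize N placeholder arguments, hence the O(N) cost per test:

```cpp
#include <cstddef>
#include <utility>

// Illustrative placeholder implicitly convertible to anything. It is only
// used in unevaluated contexts, so the conversion operator is never defined.
struct any_type {
    template <class T>
    constexpr operator T() const noexcept;
};

// SFINAE probe: well-formed only if T is aggregate-initializable from N
// arguments. The compiler materializes N placeholders, so each probe is O(N).
template <class T, std::size_t... I>
constexpr auto probe(std::index_sequence<I...>)
    -> decltype(T{(static_cast<void>(I), any_type{})...}, true) {
    return true;
}

// Fallback selected when the probe above is ill-formed.
template <class T>
constexpr bool probe(...) { return false; }

// True iff a plain aggregate T has at least N fields.
template <class T, std::size_t N>
constexpr bool fields_at_least = probe<T>(std::make_index_sequence<N>{});
```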

The primary optimization introduced by these changes is to use unbounded binary search, a.k.a. exponential search, in place of the binary search over the typically very loose bound. This produces a tight upper bound (within 2x of the true field count) for the subsequent binary search to work with.
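
A minimal sketch of the two-phase search, built on the `fields_at_least` probe above (again with illustrative names, not the PR's actual implementation):

```cpp
// Phase 1: exponential search. Double N until the probe fails; the first
// failing N bounds the true field count within [N / 2, N).
template <class T, std::size_t N = 1>
constexpr std::size_t upper_bound() {
    if constexpr (fields_at_least<T, N>) {
        return upper_bound<T, N * 2>();
    } else {
        return N;
    }
}

// Phase 2: ordinary binary search over the now-tight range.
// Invariant: the field count is at least Lo and less than Hi.
template <class T, std::size_t Lo, std::size_t Hi>
constexpr std::size_t search() {
    if constexpr (Lo + 1 >= Hi) {
        return Lo;
    } else if constexpr (fields_at_least<T, (Lo + Hi) / 2>) {
        return search<T, (Lo + Hi) / 2, Hi>();
    } else {
        return search<T, Lo, (Lo + Hi) / 2>();
    }
}

template <class T>
constexpr std::size_t field_count = search<T, 0, upper_bound<T>()>();

// Usage: struct point { int x; int y; };
//        static_assert(field_count<point> == 2);
```

With this scheme, a struct with M fields needs O(log M) probes of O(M) cost each, rather than probes sized by the `sizeof(type) * CHAR_BIT` worst case.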

As an upside of this change, the compiler-specific cap previously placed on the field-count upper bound to stay within compiler limits could be removed.

Notes

This PR was obtained by resolving merge conflicts from this older PR:

runer112 and others added 10 commits January 18, 2023 17:39
In the last CI run, 15 tasks failed with a "compiler is out of heap space" error.
With the jobs running in parallel, it's hard to determine which tasks failed due
to their own excessive memory usage and which were well-behaved but fell victim
to another task consuming all the available memory.
This could happen for a type with a constructor accepting a parameter
pack.

This also prevents unbounded growth in case something goes wrong with
the logic and the search should have already stopped (or never started).
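
For illustration, this is the kind of type that motivates keeping a hard cap. The example is hypothetical, not taken from the PR: a constructor taking a parameter pack makes every initializability probe succeed, so the doubling phase would never terminate without a sanity limit such as `sizeof(type) * CHAR_BIT`:

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>

// Hypothetical pathological type: constructible from any number of arguments,
// so an "initializable from N arguments" probe succeeds for every N.
struct accepts_anything {
    template <class... Args>
    accepts_anything(Args&&...) {}
};

// Without a cap, the doubling phase would recurse forever on such a type.
// A sanity limit like this one stops it (a sketch, not the PR's actual code):
template <class T>
constexpr std::size_t max_fields = sizeof(T) * CHAR_BIT;
```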
@rressi-at-globus
Author

Given that the original PR has had its merge conflicts resolved, and that @runer112 has done further investigation, I can close this PR, which is now redundant:
