Improve field count typical case performance #179

Closed

Conversation

rressi-at-globus
Description

The tightest upper bound one can specify on the number of fields in a struct is `sizeof(type) * CHAR_BIT`, so this was previously used when performing a binary search for the field count. This upper bound is extremely loose for a typical large struct, which is more likely to contain a relatively small number of relatively large fields than the other way around. A binary search range multiple orders of magnitude larger than necessary wouldn't be a significant issue if each test were cheap, but it isn't: testing a field count of N costs O(N) memory and time. As a result, the initial few steps of the binary search may be prohibitively expensive.
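
To make that cost concrete, here is a minimal sketch of how a field count of N is typically probed in this kind of library. The names `any_type`, `probe`, and `fields_at_least` are illustrative, not the PR's actual identifiers; the point is that each probe must materialize N placeholder arguments, hence the O(N) cost per test:

```cpp
#include <cstddef>
#include <utility>

// Illustrative placeholder implicitly convertible to anything. It is only
// used in unevaluated contexts, so the conversion operator is never defined.
struct any_type {
    template <class T>
    constexpr operator T() const noexcept;
};

// SFINAE probe: well-formed only if T is aggregate-initializable from N
// arguments. The compiler materializes N placeholders, so each probe is O(N).
template <class T, std::size_t... I>
constexpr auto probe(std::index_sequence<I...>)
    -> decltype(T{(static_cast<void>(I), any_type{})...}, true) {
    return true;
}

// Fallback selected when the probe above is ill-formed.
template <class T>
constexpr bool probe(...) { return false; }

// True iff a plain aggregate T has at least N fields.
template <class T, std::size_t N>
constexpr bool fields_at_least = probe<T>(std::make_index_sequence<N>{});
```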

The primary optimization introduced by these changes is to use unbounded binary search, a.k.a. exponential search, in place of the binary search over the typically very loose bound. This produces a tight upper bound (within 2x of the true field count) for the subsequent binary search to work with.
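
A minimal sketch of the two-phase search, built on the `fields_at_least` probe above (again with illustrative names, not the PR's actual implementation):

```cpp
// Phase 1: exponential search. Double N until the probe fails; the first
// failing N bounds the true field count within [N / 2, N).
template <class T, std::size_t N = 1>
constexpr std::size_t upper_bound() {
    if constexpr (fields_at_least<T, N>) {
        return upper_bound<T, N * 2>();
    } else {
        return N;
    }
}

// Phase 2: ordinary binary search over the now-tight range.
// Invariant: the field count is at least Lo and less than Hi.
template <class T, std::size_t Lo, std::size_t Hi>
constexpr std::size_t search() {
    if constexpr (Lo + 1 >= Hi) {
        return Lo;
    } else if constexpr (fields_at_least<T, (Lo + Hi) / 2>) {
        return search<T, (Lo + Hi) / 2, Hi>();
    } else {
        return search<T, Lo, (Lo + Hi) / 2>();
    }
}

template <class T>
constexpr std::size_t field_count = search<T, 0, upper_bound<T>()>();

// Usage: struct point { int x; int y; };
//        static_assert(field_count<point> == 2);
```

With this scheme, a struct with M fields needs O(log M) probes of O(M) cost each, rather than probes sized by the `sizeof(type) * CHAR_BIT` worst case.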

As an upside of this change, the compiler-specific cap previously placed on the field-count upper bound to stay within compiler limits could be removed.

Notes

This PR was obtained by resolving merge conflicts from this older PR:

runer112 and others added 10 commits January 18, 2023 17:39
In the last CI run, 15 tasks failed with a "compiler is out of heap space" error.
With the jobs running in parallel, it's hard to determine which tasks failed due
to their own excessive memory usage and which were well-behaved but fell victim
to another task consuming all the available memory.
This could happen for a type with a constructor accepting a parameter
pack.

This also prevents unbounded growth in case something goes wrong with
the logic and the search should have already stopped (or never started).
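
For illustration, this is the kind of type that motivates keeping a hard cap. The example is hypothetical, not taken from the PR: a constructor taking a parameter pack makes every initializability probe succeed, so the doubling phase would never terminate without a sanity limit such as `sizeof(type) * CHAR_BIT`:

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>

// Hypothetical pathological type: constructible from any number of arguments,
// so an "initializable from N arguments" probe succeeds for every N.
struct accepts_anything {
    template <class... Args>
    accepts_anything(Args&&...) {}
};

// Without a cap, the doubling phase would recurse forever on such a type.
// A sanity limit like this one stops it (a sketch, not the PR's actual code):
template <class T>
constexpr std::size_t max_fields = sizeof(T) * CHAR_BIT;
```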
@rressi-at-globus
Author

Given that the original PR has had its merge conflicts resolved, and that @runer112 has done further investigation, I can close this PR, which is now redundant:
