Optimise the fast path for alloc and dealloc #62

Merged
merged 28 commits into from
Jul 2, 2019
Conversation

mjp41
Member

@mjp41 mjp41 commented Jul 1, 2019

This seems to drastically improve speed on allocation-heavy benchmarks. It brings snmalloc to approximately the same performance as mimalloc on the cfrac benchmark.

mjp41 and others added 12 commits July 1, 2019 14:24
This massively speeds up TLS access, at the expense of the library
no longer being dynamically loadable.
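The commit message does not say which mechanism is used. One common way to get this trade-off with GCC or Clang on ELF platforms is the initial-exec TLS model, sketched below. The type, variable, function, and macro names (`ThreadAlloc`, `per_thread_alloc`, `count_allocation`, `TLS_FAST`) are hypothetical and are not taken from the PR.

```cpp
#include <cstddef>

// Hypothetical per-thread allocator state; not snmalloc's actual type.
struct ThreadAlloc
{
  std::size_t allocations = 0;
};

// Assumption: GCC or Clang. With the initial-exec model the compiler can
// address the variable at a fixed offset from the thread pointer instead of
// calling a runtime lookup such as __tls_get_addr. The cost is that a shared
// library built this way may fail to load via dlopen.
#if defined(__GNUC__)
#  define TLS_FAST __attribute__((tls_model("initial-exec")))
#else
#  define TLS_FAST
#endif

static thread_local ThreadAlloc per_thread_alloc TLS_FAST;

// Touches the thread-local state; this is the access that gets cheaper.
std::size_t count_allocation()
{
  return ++per_thread_alloc.allocations;
}
```

The same effect can usually be requested for a whole translation unit with the `-ftls-model=initial-exec` compiler flag rather than a per-variable attribute.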
This is useful because codegen is nicer if we use size_t, but the
semantics are those of uint8_t, and the value is stored as uint8_t in many
places in the metadata. Ultimately we should introduce a wrapper to check
this invariant.
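A wrapper of the kind this commit message suggests might look roughly like the following. This is only a sketch under assumptions: the class name `SizeClass` and its members are hypothetical, and the PR itself does not add such a wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: keeps the value in a size_t so index arithmetic uses
// full register width, while checking that it always fits in a uint8_t.
class SizeClass
{
  std::size_t value;

public:
  explicit SizeClass(std::size_t v) : value(v)
  {
    // The invariant from the commit message: semantically a uint8_t.
    assert(v <= UINT8_MAX);
  }

  // Wide form, for table lookups and other arithmetic.
  std::size_t index() const
  {
    return value;
  }

  // Compact form, for storing in metadata.
  std::uint8_t stored() const
  {
    return static_cast<std::uint8_t>(value);
  }
};
```

For example, `SizeClass sc(42);` yields `sc.index() == 42` as a `size_t` and `sc.stored() == 42` as a single byte.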
This change introduces a free list per small sizeclass, which gives
access to the free objects for that sizeclass with minimal calculation.

It also switches to a partial bump pointer: we bump allocate a whole OS
page's worth of objects at a time, so we switch less frequently
between bump and free list allocation.

The code for the fast paths has been restructured to minimise the
work required in the common case, which is now fully inlined.

Allocating a zero sized object is moved off the fast path.  Ask for 1
byte if you want to be fast.
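The scheme described in this commit message can be illustrated with a toy allocator. This is a simplified sketch, not snmalloc's implementation: every name and constant below (`ToyAlloc`, `PAGE_SIZE`, `SLAB_SIZE`, the four toy size classes) is hypothetical, slabs come from `std::malloc` and are never returned, and the caller passes the size to `dealloc`.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical constants for illustration only.
constexpr std::size_t PAGE_SIZE = 4096;
constexpr std::size_t SLAB_SIZE = 16 * PAGE_SIZE;
constexpr std::size_t NUM_SIZECLASSES = 4;

// Toy size classes: 16, 32, 64 and 128 bytes.
inline std::size_t sizeclass_to_size(std::size_t sc)
{
  return std::size_t(16) << sc;
}

inline std::size_t size_to_sizeclass(std::size_t size)
{
  std::size_t sc = 0;
  while (sizeclass_to_size(sc) < size)
    sc++;
  return sc;
}

struct FreeObject
{
  FreeObject* next;
};

struct ToyAlloc
{
  FreeObject* free_list[NUM_SIZECLASSES] = {}; // one list per size class
  char* bump[NUM_SIZECLASSES] = {};            // next unused page in the slab
  char* bump_end[NUM_SIZECLASSES] = {};

  // Fast path: one range check, one load, one store. For size == 0 the
  // subtraction wraps around, so zero-sized requests fail the range check
  // and fall through to the slow path.
  void* alloc(std::size_t size)
  {
    if (size - 1 < sizeclass_to_size(NUM_SIZECLASSES - 1))
    {
      std::size_t sc = size_to_sizeclass(size);
      FreeObject* head = free_list[sc];
      if (head != nullptr)
      {
        free_list[sc] = head->next;
        return head;
      }
      return refill(sc);
    }
    return alloc_slow(size);
  }

  // Partial bump allocation: carve one whole page into objects, return the
  // first and push the rest onto the free list for this size class.
  void* refill(std::size_t sc)
  {
    if (bump[sc] == bump_end[sc])
    {
      bump[sc] = static_cast<char*>(std::malloc(SLAB_SIZE));
      bump_end[sc] = (bump[sc] == nullptr) ? nullptr : bump[sc] + SLAB_SIZE;
      if (bump[sc] == nullptr)
        return nullptr;
    }
    std::size_t size = sizeclass_to_size(sc);
    char* page = bump[sc];
    bump[sc] += PAGE_SIZE;
    // Push from the end of the page so the list ends up in address order.
    for (std::size_t off = PAGE_SIZE - size; off >= size; off -= size)
    {
      FreeObject* obj = reinterpret_cast<FreeObject*>(page + off);
      obj->next = free_list[sc];
      free_list[sc] = obj;
    }
    return page;
  }

  // Slow path: zero-sized and large requests.
  void* alloc_slow(std::size_t size)
  {
    if (size == 0)
      return alloc(1);
    return std::malloc(size);
  }

  void dealloc(void* p, std::size_t size)
  {
    if (size == 0)
      size = 1;
    if (size > sizeclass_to_size(NUM_SIZECLASSES - 1))
    {
      std::free(p);
      return;
    }
    std::size_t sc = size_to_sizeclass(size);
    FreeObject* obj = static_cast<FreeObject*>(p);
    obj->next = free_list[sc];
    free_list[sc] = obj;
  }
};
```

In this sketch, two consecutive 24-byte requests land 32 bytes apart in the same page, and freeing the second one makes it the next object returned for that size class. A request for 0 bytes still succeeds, but only after failing the fast-path check, which is the behaviour the last paragraph of the commit message describes.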
Fixes an incorrect GCC warning by using an ASSUME.

Made fast path and slow path macros so we can add additional attributes.
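As a rough illustration of the last two commit messages, macros of this kind are often defined along the following lines. The macro names and definitions here are assumptions for the sketch and may differ from what the PR actually adds; `round_up` and `round_up_slow` are hypothetical helpers.

```cpp
#include <cstddef>

// Hypothetical definitions; the real macros may carry different attributes.
#if defined(_MSC_VER) && !defined(__clang__)
#  define FAST_PATH __forceinline
#  define SLOW_PATH __declspec(noinline)
#  define ASSUME(x) __assume(x)
#else
#  define FAST_PATH inline __attribute__((always_inline))
#  define SLOW_PATH __attribute__((noinline))
// Tells the optimiser that x holds; reaching this with x false is undefined.
#  define ASSUME(x) \
    do \
    { \
      if (!(x)) \
        __builtin_unreachable(); \
    } while (0)
#endif

// Kept out of line so the rare case does not bloat every call site.
SLOW_PATH std::size_t round_up_slow(std::size_t size)
{
  // Only ever called for zero-sized requests in this sketch.
  ASSUME(size == 0);
  return 16;
}

// Forced inline so the common case costs a compare, an add and a mask.
FAST_PATH std::size_t round_up(std::size_t size)
{
  if (size != 0)
    return (size + 15) & ~std::size_t(15);
  return round_up_slow(size);
}
```

Putting the attributes behind macros means a later change, such as adding a cold or hot hint, only has to touch the macro definitions.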
Member Author

mjp41 commented Jul 1, 2019

This addresses #58

CMakeLists.txt (outdated, resolved)
src/ds/bits.h (outdated, resolved)
src/ds/bits.h (outdated, resolved)
  bool valid_head(bool is_short)
  {
    size_t size = sizeclass_to_size(sizeclass);
-   size_t slab_start = get_initial_link(sizeclass, is_short);
+   size_t slab_start = get_initial_offset(sizeclass, is_short);
    size_t all_high_bits = ~static_cast<size_t>(1);
Collaborator

Didn't we have a constant for this expression?

src/mem/metaslab.h (resolved)
src/mem/pagemap.h (outdated, resolved)
src/mem/sizeclass.h (resolved)
src/mem/sizeclass.h (outdated, resolved)
src/mem/sizeclass.h (resolved)
src/mem/sizeclasstable.h (resolved)
Member Author

@mjp41 mjp41 left a comment

Just pushed changes that I think address @davidchisnall's comments.

CMakeLists.txt (outdated, resolved)
src/mem/metaslab.h (resolved)
src/mem/sizeclass.h (resolved)
src/mem/sizeclasstable.h (resolved)
@mjp41 mjp41 merged commit 38f0b62 into master Jul 2, 2019
@mjp41 mjp41 deleted the opt branch July 2, 2019 16:06
@mjp41 mjp41 mentioned this pull request Jul 9, 2019

2 participants