Optimise the fast path for alloc and dealloc #62

mjp41 · 2019-07-01T13:45:45Z

This seems to give a substantial speedup on allocation-heavy benchmarks. It brings snmalloc to approximately the same performance as mimalloc on the cfrac benchmark.

Making TLS initial-exec massively improves TLS access, at the expense of the library not being dynamically loadable.
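For readers unfamiliar with TLS models, here is a minimal sketch of how the initial-exec model can be requested on GCC or Clang. The macro `TLS_INITIAL_EXEC`, the variable, and the function are hypothetical names for illustration, not snmalloc's actual code; the only compiler feature assumed is the `tls_model` attribute.

```cpp
#include <cstddef>

// Hypothetical helper macro: request the initial-exec TLS model where the
// compiler supports the attribute, and fall back to the default elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#  define TLS_INITIAL_EXEC __attribute__((tls_model("initial-exec")))
#else
#  define TLS_INITIAL_EXEC
#endif

// Hypothetical per-thread state. With initial-exec the variable is read at
// an offset from the thread pointer rather than through a runtime lookup.
static thread_local TLS_INITIAL_EXEC std::size_t tls_alloc_count = 0;

// Increments and returns the calling thread's counter.
std::size_t count_alloc()
{
  return ++tls_alloc_count;
}
```

The trade-off is the one stated above: the cheaper access relies on the TLS block being laid out at program start, which is why a library built this way may not be loadable later at run time.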

Introduces a sizeclass_t to wrap the sizeclass. This is useful because codegen is nicer if we use size_t, but the semantics are those of uint8_t, and it is stored as that in many places in the metadata. Ultimately we should introduce a wrapper to check this invariant.
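A minimal sketch of the idea, assuming `sizeclass_t` (the name used in this PR's commits) is a plain alias for `size_t`. The `SlabMeta` struct and both helper functions are hypothetical and only show where the uint8_t invariant would be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: a register-width alias, so indexing and arithmetic on a
// sizeclass need no widening from a narrow type.
using sizeclass_t = std::size_t;

// Hypothetical metadata record that stores the sizeclass compactly.
struct SlabMeta
{
  std::uint8_t sizeclass;
};

// Narrow on store. The assert is the invariant a checking wrapper would
// enforce: every sizeclass value fits in a uint8_t.
inline void set_sizeclass(SlabMeta& meta, sizeclass_t sc)
{
  assert(sc <= UINT8_MAX);
  meta.sizeclass = static_cast<std::uint8_t>(sc);
}

// Widen on load, so callers work with the register-width type.
inline sizeclass_t get_sizeclass(const SlabMeta& meta)
{
  return meta.sizeclass;
}
```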

This change introduces a free list per small sizeclass, which can be used to access the free objects for that sizeclass with minimal calculation. Allocation also changes to a partial bump pointer: we bump allocate a whole OS page worth of objects in one go, so we don't switch as frequently between bump and free list allocation. The code for the fast paths has been restructured to minimise the work required in the common case, and it is all inlined for that case. Allocating a zero-sized object is moved off the fast path; ask for 1 byte if you want to be fast.
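The combination of a per-sizeclass free list and page-at-a-time bump allocation can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the class sizes, the 4096-byte page, and every name here are hypothetical, pages come from `malloc` and are never returned, and the caller passes the sizeclass directly.

```cpp
#include <cstddef>
#include <cstdlib>

constexpr std::size_t SKETCH_PAGE_SIZE = 4096; // assumed OS page size
constexpr std::size_t NUM_SMALL_CLASSES = 4;   // hypothetical class count
constexpr std::size_t CLASS_SIZE[NUM_SMALL_CLASSES] = {16, 32, 64, 128};

// A free object stores the link to the next free object in its own memory.
struct FreeObject
{
  FreeObject* next;
};

struct SketchAllocator
{
  // One free list head per small sizeclass.
  FreeObject* free_list[NUM_SMALL_CLASSES] = {};

  // Fast path: pop the head of the list for this sizeclass.
  void* alloc(std::size_t sc)
  {
    FreeObject* head = free_list[sc];
    if (head != nullptr)
    {
      free_list[sc] = head->next;
      return head;
    }
    return alloc_slow(sc);
  }

  // Fast path: push the object back onto its sizeclass list.
  void dealloc(void* p, std::size_t sc)
  {
    auto* obj = static_cast<FreeObject*>(p);
    obj->next = free_list[sc];
    free_list[sc] = obj;
  }

  // Slow path: carve a whole page into objects in one go. The first object
  // is returned; the rest are pushed from the end backwards so that later
  // fast-path allocations come out in address order.
  void* alloc_slow(std::size_t sc)
  {
    char* page = static_cast<char*>(std::malloc(SKETCH_PAGE_SIZE));
    if (page == nullptr)
      return nullptr;
    const std::size_t size = CLASS_SIZE[sc];
    const std::size_t count = SKETCH_PAGE_SIZE / size;
    for (std::size_t i = count - 1; i >= 1; i--)
      dealloc(page + i * size, sc);
    return page;
  }
};
```

After one slow-path call, every further allocation of that sizeclass is a single list pop until the page's objects are used up, which is the reduction in switching described above.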

Fixes an incorrect GCC warning by using an ASSUME. Made fast path and slow path macros so we can add additional attributes.
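As a rough sketch of what such macros can look like: the names `ASSUME`, `FAST_PATH` and `SLOW_PATH` come from this PR, but the expansions below are assumptions built from standard GCC/Clang and MSVC features, and the three functions are hypothetical.

```cpp
#include <cstddef>

#if defined(_MSC_VER) && !defined(__clang__)
#  define FAST_PATH __forceinline
#  define SLOW_PATH __declspec(noinline)
#  define ASSUME(x) __assume(x)
#else
#  define FAST_PATH inline __attribute__((always_inline))
#  define SLOW_PATH __attribute__((noinline))
// Tells the optimiser the condition holds; behaviour is undefined if not.
#  define ASSUME(x) \
    do \
    { \
      if (!(x)) \
        __builtin_unreachable(); \
    } while (0)
#endif

constexpr std::size_t SKETCH_SIZES[4] = {16, 32, 64, 128};

// Kept out of line so the rare case does not bloat callers.
SLOW_PATH std::size_t zero_size_slow()
{
  return 1;
}

// Common case is a compare and return; zero goes to the slow path.
FAST_PATH std::size_t request_size(std::size_t size)
{
  if (size != 0)
    return size;
  return zero_size_slow();
}

// ASSUME records a bound the compiler cannot prove for itself, which is
// one way a spurious bounds warning can be silenced.
FAST_PATH std::size_t sketch_sizeclass_to_size(std::size_t sc)
{
  ASSUME(sc < 4);
  return SKETCH_SIZES[sc];
}
```

Wrapping the attributes in macros means further attributes can be added in one place later without touching each function.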

mjp41 · 2019-07-01T14:02:52Z

This addresses #58


davidchisnall · 2019-07-02T08:10:07Z

src/mem/metaslab.h

    bool valid_head(bool is_short)
    {
      size_t size = sizeclass_to_size(sizeclass);
-      size_t slab_start = get_initial_link(sizeclass, is_short);
+      size_t slab_start = get_initial_offset(sizeclass, is_short);
      size_t all_high_bits = ~static_cast<size_t>(1);


Didn't we have a constant for this expression?


Removed stub from message queue, and use an actual allocation.

mjp41

Just pushed changes that I think address @davidchisnall's comments.


mjp41 and others added 12 commits July 1, 2019 14:24

Made TLS initial Exec.

90a0274

This massively improves the TLS access at the expense of not being dynamically loadable.

Add a couple of likely annotations.

830b06a

Made a sizeclass_t to wrap the sizeclass

7a8eaec

This is useful because codegen is nicer if we use size_t, but the semantics are those of uint8_t, and it is stored as that in many places in the metadata. Ultimately we should introduce a wrapper to check this invariant.

Made a size to sizeclass table.

7f7704b
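A size-to-sizeclass table replaces per-call computation with a single indexed load. The sketch below is an assumption about the general technique, not the table built in this commit: it uses hypothetical power-of-two classes from 16 to 256 bytes and hypothetical names throughout.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t MIN_ALLOC = 16;  // hypothetical smallest class
constexpr std::size_t MAX_SMALL = 256; // hypothetical largest small class
constexpr std::size_t TABLE_ENTRIES = MAX_SMALL / MIN_ALLOC;

// Slow reference computation: index of the smallest power-of-two class
// (16, 32, 64, 128, 256) that can hold `size` bytes.
constexpr std::uint8_t compute_sizeclass(std::size_t size)
{
  std::uint8_t sc = 0;
  for (std::size_t s = MIN_ALLOC; s < size; s <<= 1)
    sc++;
  return sc;
}

// Built at compile time: entry i covers sizes i*16+1 .. (i+1)*16.
struct SizeClassTable
{
  std::uint8_t entries[TABLE_ENTRIES] = {};

  constexpr SizeClassTable()
  {
    for (std::size_t i = 0; i < TABLE_ENTRIES; i++)
      entries[i] = compute_sizeclass((i + 1) * MIN_ALLOC);
  }
};

constexpr SizeClassTable sizeclass_table{};

// Precondition: 1 <= size <= MAX_SMALL. A size of zero would wrap around
// in (size - 1), so it has to be handled before reaching this lookup.
inline std::size_t size_to_sizeclass(std::size_t size)
{
  return sizeclass_table.entries[(size - 1) / MIN_ALLOC];
}
```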

Improved fast path for pagemap.

b0c1531

Add an assert.

fb825dd

[FreeBSD] Fix a warning with GCC.

c2780f9

Add macro for ASSUME and FAST_PATH/SLOW_PATH

3c7d122

Fixes an incorrect GCC warning by using an ASSUME. Made fast path and slow path macros so we can add additional attributes.

Clang Tidy and Warnings

187a016

Only add Stats if passed to CMake.

9098b1c

Clang tidy.

84c0117

mjp41 requested review from davidchisnall and sylvanc July 1, 2019 13:45

mjp41 added 4 commits July 1, 2019 15:11

Updated differences

1b26c6d

Typo

0453e43

Minor changes to pagemap codegen

86fde10

Minor alterations to slow path, and when to handle messages.

0dbd10b

davidchisnall reviewed Jul 2, 2019


mjp41 added 6 commits July 2, 2019 10:46

Fix clang format check.

b14735f

Clang format.

621b7e6

Minor changes to mpscq fast path.

d4e94d9

Remove constexpr steps.

daebe3f

Add prefetch to mpscq.

fdcbcf7

CR Feedback

eb4e28e

Removed stub from message queue, and use an actual allocation.

mjp41 commented Jul 2, 2019


mjp41 added 3 commits July 2, 2019 14:58

Fixes

b74116e

Make GCC happy with inline

54879fb

Ensure id set on dummy deallocation.

57e94b5

mjp41 added 3 commits July 2, 2019 15:44

Clangformat

8970e70

Fix for zero size allocations.

ea39966

Clangformat.

4cf19f3

mjp41 merged commit 38f0b62 into master Jul 2, 2019

mjp41 deleted the opt branch July 2, 2019 16:06

mjp41 mentioned this pull request Jul 9, 2019

Benchmarking and mimalloc #67

Open
