
Use memoryview in unpack_frames #3980

Merged: 2 commits merged into dask:master from use_memoryview_unpack_frames on Jul 22, 2020

Conversation

@jakirkham (Member) commented on Jul 21, 2020:

As part of unpack_frames, we slice out each frame we'd like to extract (see code snippet below).

frame = b[start:end]

However, this causes a copy, which increases memory usage and creates a notable bottleneck when unpacking frames. Closer inspection of unpack_frames shows that this slicing dominates the time of that function and accounts for roughly half of the time spent in deserialize_bytes. Also, since deserialize_bytes typically works with a bytes object, these frames end up being bytes objects themselves, which we then need to copy later to produce mutable frames (see PR #3967 and related context). In other words, the copy performed in unpack_frames is wasted effort.
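
For illustration (this snippet is not from the PR itself): slicing a bytes object copies the selected range, while slicing a memoryview over the same buffer only creates a new view.

b = bytes(80_000_000)          # ~80 MB of zeros
mv = memoryview(b)

frame_copy = b[8:40_000_008]   # allocates and copies ~40 MB
frame_view = mv[8:40_000_008]  # constant time; no data copied

assert frame_view.obj is b     # the view still references the original bytes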

To fix this issue, we coerce the input of unpack_frames to a memoryview. Slicing then merely produces views onto the underlying data, which is essentially free; this avoids the copy and alleviates the bottleneck. It also just works with most Python APIs (like struct.unpack_from), since they accept any bytes-like object, including memoryviews. The details can be seen in the benchmark below, which uses deserialize_bytes, part of the unspilling code path that calls into unpack_frames. This roughly halves the time spent in the unspilling code path.
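
As a rough sketch of the idea (simplified, and assuming frame lengths are stored as a 64-bit frame count followed by 64-bit per-frame lengths; the real unpack_frames in distributed.protocol is the authoritative version):

import struct

def unpack_frames_sketch(b):
    # Coerce once; every slice below is then a zero-copy view.
    b = memoryview(b)

    # struct.unpack_from accepts any bytes-like object, including memoryviews.
    (n_frames,) = struct.unpack_from("Q", b)
    lengths = struct.unpack_from("%dQ" % n_frames, b, 8)

    frames = []
    start = 8 + 8 * n_frames
    for length in lengths:
        end = start + length
        frames.append(b[start:end])  # a view, not a copy
        start = end
    return frames

Since each returned view keeps the original buffer alive, nothing is copied until a consumer actually needs a private, mutable buffer.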



Before:

In [1]: import numpy 
   ...: import pandas 
   ...: from distributed.protocol import serialize_bytelist, deserialize_bytes                                                 

In [2]: df = pandas.DataFrame({ 
   ...:     k: numpy.random.random(1_000_000) 
   ...:     for i, k in enumerate(map(chr, range(ord("A"), ord("K")))) 
   ...: })                                                                                                                     

In [3]: b = b"".join(serialize_bytelist(df))                                                                                   

In [4]: %timeit deserialize_bytes(b)                                                                                           
37.5 ms ± 243 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

After:

In [1]: import numpy 
   ...: import pandas 
   ...: from distributed.protocol import serialize_bytelist, deserialize_bytes                                                 

In [2]: df = pandas.DataFrame({ 
   ...:     k: numpy.random.random(1_000_000) 
   ...:     for i, k in enumerate(map(chr, range(ord("A"), ord("K")))) 
   ...: })                                                                                                                     

In [3]: b = b"".join(serialize_bytelist(df))                                                                                   

In [4]: %timeit deserialize_bytes(b)                                                                                           
19.2 ms ± 188 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Selecting out each frame from the input causes a copy, which increases
memory usage and slows down `unpack_frames`. To fix this, coerce the
input to a `memoryview`. This way slices into the `memoryview` only take
a view onto the underlying data, which is quite fast and doesn't result
in additional memory usage.
@mrocklin (Member):
Nice. +1

@quasiben (Member):
That is so cool! Thanks @jakirkham! And again, thank you for providing timing reports as well.

@quasiben quasiben merged commit eb10a53 into dask:master Jul 22, 2020
@jakirkham jakirkham deleted the use_memoryview_unpack_frames branch July 22, 2020 00:22
@jakirkham (Member, Author):
Thanks all! 😄
