perf: Improve materialisation performance of SortPreservingMergeExec #691

Merged
merged 4 commits into from
Jul 8, 2021

Commits on Jul 7, 2021

  1. 63b8cd3
  2. perf: minimise array data extend calls

    The `SortPreservingMergeStream` operator merges two streams together by creating an output record batch that is built from the contents of the inputs. Previously each input row was pushed into the output sink individually, even though the API supports pushing batches of rows.
    
    This commit implements the logic to push batches of rows from inputs where possible.
    
    Performance benchmarks show an improvement of 3-12%.
    
    ```
    group                               master                                 pr
    -----                               ------                                 --
    interleave_batches                  1.04   637.5±51.84µs        ? ?/sec    1.00   615.5±12.13µs        ? ?/sec
    merge_batches_no_overlap_large      1.12    454.9±2.90µs        ? ?/sec    1.00   404.9±10.94µs        ? ?/sec
    merge_batches_no_overlap_small      1.14    485.1±6.67µs        ? ?/sec    1.00    425.7±9.33µs        ? ?/sec
    merge_batches_small_into_large      1.14    263.0±8.85µs        ? ?/sec    1.00    229.7±5.23µs        ? ?/sec
    merge_batches_some_overlap_large    1.05    532.5±8.33µs        ? ?/sec    1.00   508.3±14.24µs        ? ?/sec
    merge_batches_some_overlap_small    1.06   546.9±12.82µs        ? ?/sec    1.00   516.9±13.20µs        ? ?/sec
    ```
    e-dard committed Jul 7, 2021 (8f285c7)
  3. test: more test coverage

    e-dard committed Jul 7, 2021 (665a059)
  4. refactor: update batch size

    e-dard committed Jul 7, 2021 (5f905c2)
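
The batching idea in the "minimise array data extend calls" commit can be illustrated with a small standalone sketch. This is not the code from the PR: `RowIndex`, `Run`, `coalesce_runs` and `materialise` are hypothetical names, and plain `Vec<i64>` columns stand in for Arrow arrays. The assumption is that the merge has already decided the output order as a list of (input, row) pairs, and that the sink accepts a contiguous range from one input, similar in shape to an `extend(index, start, end)` call.

```rust
/// One row chosen by the merge: which input it came from and its row offset.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct RowIndex {
    input: usize,
    row: usize,
}

/// A contiguous run of rows `start..end` taken from a single input.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct Run {
    input: usize,
    start: usize,
    end: usize,
}

/// Collapse the merged row order into runs, so that consecutive rows from the
/// same input can be copied with one call instead of one call per row.
fn coalesce_runs(rows: &[RowIndex]) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for r in rows {
        if let Some(last) = runs.last_mut() {
            // Extend the current run only if this row comes from the same
            // input and directly follows the previous row.
            if last.input == r.input && last.end == r.row {
                last.end += 1;
                continue;
            }
        }
        runs.push(Run { input: r.input, start: r.row, end: r.row + 1 });
    }
    runs
}

/// Build the output column from the inputs, copying one run at a time.
/// Returns the output and the number of copy calls that were needed.
fn materialise(inputs: &[Vec<i64>], rows: &[RowIndex]) -> (Vec<i64>, usize) {
    let runs = coalesce_runs(rows);
    let mut out = Vec::with_capacity(rows.len());
    for run in &runs {
        // Stand-in for a batched extend on the real output sink.
        out.extend_from_slice(&inputs[run.input][run.start..run.end]);
    }
    (out, runs.len())
}
```

With inputs `[1, 2, 3, 7]` and `[4, 5, 6, 8]`, the merged order takes three rows from the first input, three from the second, then one from each, so `materialise` needs 4 copy calls rather than 8. When rows from the inputs strictly alternate, every run has length one and nothing is saved, which is consistent with `interleave_batches` showing the smallest gain in the benchmark table.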