Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PERF] use numexpr in dispatch_to_series #22284

Merged
merged 7 commits into from
Sep 8, 2018

Conversation

jbrockmendel
Copy link
Member

Also avoid copy/cast in mixed-dtype case for _combine_match_index

Lets us use dispatch_to_series and get rid of DataFrame._compare_frame.

asv continuous -f 1.1 -E virtualenv master HEAD -b binary_ops
[...]
    before     after       ratio
  [03707400] [0cf9fa00]
-  105.09ms    94.39ms      0.90  binary_ops.Ops.time_frame_multi_and(True, 'default')
-   11.67ms    10.41ms      0.89  binary_ops.Ops.time_frame_mult(True, 1)
-   32.72ms    29.03ms      0.89  binary_ops.Timeseries.time_timestamp_ops_diff(None)
-   34.11ms    29.24ms      0.86  binary_ops.Ops.time_frame_mult(True, 'default')
-   20.06ms    15.78ms      0.79  binary_ops.Ops.time_frame_comparison(True, 'default')
-   37.93ms    28.27ms      0.75  binary_ops.AddOverflowArray.time_add_overflow_arr_mask_nan

asv continuous -f 1.1 -E virtualenv master HEAD -b binary_ops
[...]
    before     after       ratio
  [03707400] [0cf9fa00]
+   34.44ms    45.40ms      1.32  binary_ops.Ops2.time_frame_int_div_by_zero
+   32.34ms    42.38ms      1.31  binary_ops.Ops2.time_frame_float_mod
+   11.65ms    15.03ms      1.29  binary_ops.Ops.time_frame_mult(False, 'default')
+    8.01ms     9.76ms      1.22  binary_ops.Timeseries.time_series_timestamp_compare('US/Eastern')
+   48.31ms    58.72ms      1.22  binary_ops.Ops2.time_frame_int_mod
+    7.36ms     8.41ms      1.14  binary_ops.Timeseries.time_timestamp_series_compare('US/Eastern')
+    7.11ms     8.09ms      1.14  binary_ops.Timeseries.time_series_timestamp_compare(None)
+   67.32ms    76.35ms      1.13  binary_ops.Ops2.time_frame_float_div
+   30.83ms    34.92ms      1.13  binary_ops.Timeseries.time_timestamp_ops_diff_with_shift(None)
+    7.12ms     8.06ms      1.13  binary_ops.Timeseries.time_timestamp_series_compare(None)
+  104.54ms   116.93ms      1.12  binary_ops.Ops2.time_frame_float_floor_by_zero
+   32.93ms    36.62ms      1.11  binary_ops.Timeseries.time_timestamp_ops_diff(None)
+   40.88ms    45.45ms      1.11  binary_ops.Timeseries.time_timestamp_ops_diff_with_shift('US/Eastern')
-  122.79ms   111.19ms      0.91  binary_ops.Ops.time_frame_comparison(False, 1)
-   11.82ms    10.58ms      0.89  binary_ops.Ops.time_frame_add(True, 1)
-   25.92ms    22.90ms      0.88  binary_ops.AddOverflowArray.time_add_overflow_b_mask_nan
-   11.36ms     9.82ms      0.86  binary_ops.Ops.time_frame_add(False, 1)
-   10.63ms     8.88ms      0.84  binary_ops.AddOverflowScalar.time_add_overflow_scalar(0)
-   10.50ms     8.62ms      0.82  binary_ops.AddOverflowScalar.time_add_overflow_scalar(1)
-   11.55ms     9.43ms      0.82  binary_ops.Ops.time_frame_add(False, 'default')
-   10.47ms     8.41ms      0.80  binary_ops.AddOverflowScalar.time_add_overflow_scalar(-1)
-   28.92ms    22.54ms      0.78  binary_ops.AddOverflowArray.time_add_overflow_both_arg_nan
-   15.50ms    11.39ms      0.73  binary_ops.AddOverflowArray.time_add_overflow_arr_rev
-   43.70ms    25.47ms      0.58  binary_ops.AddOverflowArray.time_add_overflow_arr_mask_nan

asv continuous -f 1.1 -E virtualenv master HEAD -b binary_ops
[...]
    before     after       ratio
  [03707400] [0cf9fa00]
+    6.10ms     6.87ms      1.13  binary_ops.AddOverflowScalar.time_add_overflow_scalar(-1)
-    5.14ms     4.59ms      0.89  binary_ops.Ops.time_frame_comparison(True, 1)
-    4.80ms     4.21ms      0.88  binary_ops.Timeseries.time_series_timestamp_compare(None)
-    4.81ms     4.17ms      0.87  binary_ops.Timeseries.time_timestamp_series_compare('US/Eastern')
-   22.17ms    19.05ms      0.86  binary_ops.Timeseries.time_timestamp_ops_diff(None)
-   21.63ms    18.29ms      0.85  binary_ops.Timeseries.time_timestamp_ops_diff_with_shift(None)
-    5.23ms     4.42ms      0.85  binary_ops.Timeseries.time_timestamp_series_compare(None)
-   52.74ms    42.41ms      0.80  binary_ops.Ops.time_frame_multi_and(False, 1)

asv continuous -f 1.1 -E virtualenv master HEAD -b binary_ops
[...]
    before     after       ratio
  [03707400] [0cf9fa00]
-    4.85ms     4.40ms      0.91  binary_ops.Timeseries.time_series_timestamp_compare(None)
-   17.56ms    15.43ms      0.88  binary_ops.AddOverflowArray.time_add_overflow_arr_mask_nan
-   29.55ms    25.75ms      0.87  binary_ops.Timeseries.time_timestamp_ops_diff_with_shift('US/Eastern')
-    5.33ms     4.59ms      0.86  binary_ops.Timeseries.time_timestamp_series_compare(None)
-   68.42ms    58.38ms      0.85  binary_ops.Ops2.time_frame_float_floor_by_zero
-    5.05ms     4.23ms      0.84  binary_ops.Timeseries.time_series_timestamp_compare('US/Eastern')
-   22.39ms    18.35ms      0.82  binary_ops.Ops2.time_frame_float_mod
-   79.24ms    63.43ms      0.80  binary_ops.Ops.time_frame_comparison(False, 1)
-    5.38ms     4.18ms      0.78  binary_ops.Timeseries.time_timestamp_series_compare('US/Eastern')

@codecov
Copy link

codecov bot commented Aug 11, 2018

Codecov Report

Merging #22284 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22284      +/-   ##
==========================================
+ Coverage   92.05%   92.05%   +<.01%     
==========================================
  Files         169      169              
  Lines       50783    50787       +4     
==========================================
+ Hits        46749    46753       +4     
  Misses       4034     4034
Flag Coverage Δ
#multiple 90.46% <100%> (ø) ⬆️
#single 42.3% <26.31%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/ops.py 96.94% <100%> (+0.03%) ⬆️
pandas/core/frame.py 97.19% <100%> (-0.01%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 70c9003...776ea34. Read the comment docs.

@gfyoung gfyoung added Performance Memory or execution speed performance Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff labels Aug 13, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need a specific asv that tests this (e.g. ops on a big frame that is mixed and one that is single dtype). the perf numbers look about the same.

@jbrockmendel
Copy link
Member Author

the perf numbers look about the same.

Yah, this is actually a win though. A previous attempt to use dispatch_to_series instead of _compare_frame had a 15x slowdown. This gets us to a .8x speedup. But the real motivation is to avoid the redundant dispatch code in _compare_frame

BTW for triage purposes, getting #22163 through will make it easier to do a bunch of test cleanup/parametrization. (and, you know, fix a bunch of bugs)

@jbrockmendel
Copy link
Member Author

The perf part of this PR is really secondary. The main goal is to unify DataFrame ops going through dispatch_to_series. The numexpr part is necessary to avoid a perf hit in frame/frame comparisons.

@pep8speaks
Copy link

Hello @jbrockmendel! Thanks for updating the PR.

@jreback jreback added this to the 0.24.0 milestone Sep 8, 2018
@jreback jreback merged commit 16a6e9c into pandas-dev:master Sep 8, 2018
@jreback
Copy link
Contributor

jreback commented Sep 8, 2018

thanks!

@jbrockmendel jbrockmendel deleted the dispatch2 branch September 8, 2018 03:11
aeltanawy pushed a commit to aeltanawy/pandas that referenced this pull request Sep 20, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants