PERF: Regressions between 0.20.1 and 0.20.x #16584

Closed
TomAugspurger opened this issue Jun 2, 2017 · 9 comments

@TomAugspurger
Contributor

TomAugspurger commented Jun 2, 2017

I'll add more later. I'm worried my local benchmarks aren't reliable, but I think these are real.

Interpolate

       before           after         ratio
     [20baf972]       [eefbaf71]
+        50.6±2ms            1.48s    29.27  frame_methods.Interpolate.time_interpolate
+     1.19±0.01ms       14.9±0.7ms    12.53  frame_methods.Interpolate.time_interpolate_some_good
+      2.45±0.1ms       13.6±0.8ms     5.57  frame_methods.Interpolate.time_interpolate_some_good_infer

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.

groupby

       before           after         ratio
     [20baf972]       [98162e04]
+     1.45±0.03ms      2.49±0.04ms     1.72  groupby.groupby_datetimetz.time_groupby_sum
+     2.71±0.03ms      4.53±0.09ms     1.67  groupby.groupby_nth.time_groupby_series_nth_any
+     1.59±0.02ms      2.51±0.04ms     1.58  groupby.groupby_nth.time_groupby_series_nth_none
+     3.69±0.03ms      4.95±0.09ms     1.34  groupby.groupby_multi.time_groupby_series_simple_cython
+     1.41±0.04ms      1.69±0.04ms     1.19  groupby.groupby_datetime.time_groupby_sum
+     1.91±0.03ms      2.20±0.01ms     1.15  groupby.GroupBySuite.time_cumprod('int', 10000)
+     1.61±0.03ms      1.80±0.02ms     1.12  groupby.GroupBySuite.time_count('int', 10000)
+      9.83±0.2ms      10.9±0.09ms     1.11  groupby.GroupBySuite.time_all('int', 100)
+     1.74±0.03ms      1.93±0.02ms     1.11  groupby.GroupBySuite.time_cummin('int', 10000)
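As a sanity check on the tables above, the ratio column can be recomputed from the printed timings. This is a minimal sketch, not part of asv: `parse_timing` and `slowdown` are hypothetical helpers, and the only assumption is that each timing is a number, an optional `±` spread, and a unit suffix. The recomputed values can differ slightly from the printed ratios (29.25 vs 29.27 for `time_interpolate`), presumably because asv works from unrounded measurements.

```python
import re

# Multipliers to seconds for the unit suffixes seen in the output above.
_UNITS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "μs": 1e-6, "ns": 1e-9}


def parse_timing(text):
    """Convert a timing such as '50.6±2ms' or '1.48s' to seconds.

    Hypothetical helper: only the central value is used, the ± spread is ignored.
    """
    match = re.fullmatch(r"([\d.]+)(?:±[\d.]+)?(s|ms|us|μs|ns)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised timing: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def slowdown(before, after):
    """Return after / before as a plain ratio (above 1 means slower)."""
    return parse_timing(after) / parse_timing(before)


# A few rows copied from the tables above.
rows = [
    ("50.6±2ms", "1.48s", "frame_methods.Interpolate.time_interpolate"),
    ("1.45±0.03ms", "2.49±0.04ms", "groupby.groupby_datetimetz.time_groupby_sum"),
    ("3.69±0.03ms", "4.95±0.09ms", "groupby.groupby_multi.time_groupby_series_simple_cython"),
]

for before, after, name in rows:
    print(f"{slowdown(before, after):6.2f}x  {name}")
```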
@TomAugspurger TomAugspurger added the Performance Memory or execution speed performance label Jun 2, 2017
@TomAugspurger TomAugspurger added this to the 0.20.2 milestone Jun 2, 2017
@TomAugspurger
Contributor Author

cc @WBare for the interpolate regression; #16429 was the commit. Do you have a chance to see if there are any easy optimizations to make? If not, I can take a look later.

@jreback
Contributor

jreback commented Jun 2, 2017

what causes the interpolate ones?

@TomAugspurger
Contributor Author

TomAugspurger commented Jun 2, 2017 via email

@TomAugspurger
Contributor Author

@jreback the groupby slowdown

+     3.47±0.08ms       5.30±0.2ms     1.53  groupby.groupby_multi.time_groupby_series_simple_cython

points to #16413

Thoughts? Does the increased perf elsewhere make up for it?

@jreback
Contributor

jreback commented Jun 4, 2017

not sure what you mean
this is a perf improvement vs previous versions for other benchmarks

i would remeasure this

@TomAugspurger
Contributor Author

It affected time_groupby_series_simple_cython negatively

$ asv continuous d5a681bfa~1 d5a681bfa -b groupby.groupby_multi.time_groupby_series_simple_cython 
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Installing into conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..
· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[  0.00%] · For pandas commit hash d5a681bf:
[  0.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..........................................................................................
[  0.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ··· Running groupby.groupby_multi.time_groupby_series_simple_cython    5.04±0.09ms
[ 50.00%] · For pandas commit hash 5fe042f5:
[ 50.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..........................................................................................
[ 50.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[100.00%] ··· Running groupby.groupby_multi.time_groupby_series_simple_cython    3.79±0.09ms
       before           after         ratio
     [5fe042f5]       [d5a681bf]
+     3.79±0.09ms      5.04±0.09ms     1.33  groupby.groupby_multi.time_groupby_series_simple_cython

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
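One rough way to judge whether a remeasured difference like this is more than noise is to check whether the two `±` ranges overlap. The sketch below is only a crude check under an assumption: it treats the `±` figure as a simple symmetric interval around the central value, which may not match how asv derives it. `intervals_overlap` is a hypothetical helper, not an asv function.

```python
def intervals_overlap(before, before_err, after, after_err):
    """Return True if [before ± before_err] and [after ± after_err] overlap.

    Hypothetical helper; all four values must share the same unit.
    """
    return (before + before_err >= after - after_err
            and after + after_err >= before - before_err)


# Values in ms, copied from the asv continuous run above.
before, before_err = 3.79, 0.09
after, after_err = 5.04, 0.09

print(intervals_overlap(before, before_err, after, after_err))  # False
print(round(after / before, 2))  # 1.33
```

With these numbers the ranges are well separated, which is consistent with the slowdown on `time_groupby_series_simple_cython` being real rather than measurement jitter.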

@jreback
Contributor

jreback commented Jun 4, 2017

ok create an issue and i will look
don't block for the release

@TomAugspurger
Contributor Author

Sounds good. Agreed that it's not worth blocking the release over.

@TomAugspurger
Contributor Author

Closed by #16592
