-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: ASV Algorithms benchmark #18423
Conversation
Codecov Report
@@ Coverage Diff @@
## master #18423 +/- ##
==========================================
- Coverage 91.35% 91.31% -0.05%
==========================================
Files 163 163
Lines 49714 49714
==========================================
- Hits 45415 45394 -21
- Misses 4299 4320 +21
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18423 +/- ##
==========================================
- Coverage 91.33% 91.32% -0.02%
==========================================
Files 163 163
Lines 49752 49717 -35
==========================================
- Hits 45443 45404 -39
- Misses 4309 4313 +4
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment. ping on green.
asv_bench/benchmarks/algorithms.py
Outdated
|
||
def time_duplicated_float(self): | ||
self.float.duplicated() | ||
params = [1, -1, 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add params_names
a7858e3
to
0b9a3e2
Compare
asv_bench/benchmarks/algorithms.py
Outdated
np.random.seed(1234) | ||
self.int_idx = pd.Int64Index(np.arange(N).repeat(5)) | ||
self.float_idx = pd.Float64Index(np.random.randn(N).repeat(5)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breaking this into two classes it duplicating the setup, and the 'Factorize' and 'Duplicated' name is also already in the time_..
method names. So not sure it is necessarily a net improvement
(for this specific case I actually also think we could add a time_duplicated_string
)
0b9a3e2
to
c8124ff
Compare
@jorisvandenbossche for |
asv_bench/benchmarks/algorithms.py
Outdated
|
||
def time_add_overflow_zero_scalar(self): | ||
self.checked_add(self.arr, 0) | ||
class AddOverflowArray(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this to binary_ops.py (AddOverflow)
Try setup_cache Add base ASV class and test Hashing rework algorithms benchmarks improve algorithms benchmark Benchmarks working! cleanup
c8124ff
to
35c7af4
Compare
thanks @mroeschke |
High Level changes:
The
Algorithms
class benchmark was getting large so I broke it into several smaller classes, avoiding creating data structures in eachsetup
call that would only be used for 1 benchmark.Added
np.random.seed(1234)
insetup
classes where random data is created xref BENCH: put in np.random.seed on vbenches #8144Utilized
params
andsetup_cache
where applicable.Added additional hashing benchmarks
The benchmarks should be equivalent to what existed before. If the diff is too large I can break it up into smaller PRs. You can find 3 asv runs below.