-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: optimize is_scalar, is_iterator #31294
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
or is_interval(val) | ||
or util.is_offset_object(val)) | ||
|
||
|
||
def is_iterator(obj: object) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could arguably just use PyIter_Check
where needed now and avoid overhead of def call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so far no usage of this in the cython code, it is just replacing the version in core.dtypes.inference
sure |
i was expecting you to be more excited about this! |
lol, after going thru lots of PRs i type shortest things! this is good |
…ndexing-1row-df * upstream/master: (194 commits) DOC Remove Python 2 specific comments from documentation (pandas-dev#31198) Follow up PR: pandas-dev#28097 Simplify branch statement (pandas-dev#29243) BUG: DatetimeIndex.snap incorrectly setting freq (pandas-dev#31188) Move DataFrame.info() to live with similar functions (pandas-dev#31317) ENH: accept a dictionary in plot colors (pandas-dev#31071) PERF: add shortcut to Timestamp constructor (pandas-dev#30676) CLN/MAINT: Clean and annotate stata reader and writers (pandas-dev#31072) REF: define _get_slice_axis in correct classes (pandas-dev#31304) BUG: DataFrame.floordiv(ser, axis=0) not matching column-wise bheavior (pandas-dev#31271) PERF: optimize is_scalar, is_iterator (pandas-dev#31294) BUG: Series rolling count ignores min_periods (pandas-dev#30923) xfail sparse warning; closes pandas-dev#31310 (pandas-dev#31311) REF: DatetimeIndex.get_value wrap DTI.get_loc (pandas-dev#31314) CLN: internals.managers (pandas-dev#31316) PERF: avoid copies if possible in fill_binop (pandas-dev#31300) Add test for multiindex json (pandas-dev#31307) BUG: passing TDA and wrong freq to TimedeltaIndex (pandas-dev#31268) BUG: inconsistency between PeriodIndex.get_value vs get_loc (pandas-dev#31172) CLN: remove _set_subtyp (pandas-dev#31301) CI: Updated version of macos image (pandas-dev#31292) ...
While working on #30349 I noticed that
is_scalar
is pretty slow for non-scalar args, turns out we can get a 9-19x improvement for listlike cases, 5-10x improvement for Decimal/Period/DateOffset/Interval, with small improvements in nearly every other case while we're at it (the fractions.Fraction object is the only one where i found a small decrease in perf, within the margin of error)xref #31291, assuming this is the desired behavior for is_iterator, this closes that.
Setup:
Non-Scalar Cases
Non-Built-In Scalar Cases
Built-In Scalar Cases
is_iterator check