Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: Use generators instead of lists in built-in Python functions #18276

Merged
merged 2 commits into from
Nov 15, 2017

Conversation

mroeschke
Copy link
Member

  • tests passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff

Pass generators in built-in Python functions instead of passing in a list, e.g any([value for value in iterator]) --> any(value for value in iterator)

@codecov
Copy link

codecov bot commented Nov 14, 2017

Codecov Report

Merging #18276 into master will decrease coverage by 0.01%.
The diff coverage is 87.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18276      +/-   ##
==========================================
- Coverage    91.4%   91.38%   -0.02%     
==========================================
  Files         164      164              
  Lines       49878    49878              
==========================================
- Hits        45590    45581       -9     
- Misses       4288     4297       +9
Flag Coverage Δ
#multiple 89.19% <85.41%> (ø) ⬆️
#single 39.41% <22.91%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/util/testing.py 100% <ø> (ø) ⬆️
pandas/util/_doctools.py 0% <0%> (ø) ⬆️
pandas/core/frame.py 97.8% <100%> (-0.1%) ⬇️
pandas/io/parsers.py 95.59% <100%> (ø) ⬆️
pandas/core/indexes/base.py 96.42% <100%> (ø) ⬆️
pandas/io/json/normalize.py 96.93% <100%> (ø) ⬆️
pandas/core/internals.py 94.54% <100%> (ø) ⬆️
pandas/io/pytables.py 92.84% <100%> (ø) ⬆️
pandas/core/indexing.py 92.8% <100%> (ø) ⬆️
pandas/core/indexes/range.py 95.66% <100%> (ø) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 69472f9...6402974. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 14, 2017

Codecov Report

Merging #18276 into master will decrease coverage by 0.01%.
The diff coverage is 87.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18276      +/-   ##
==========================================
- Coverage    91.4%   91.38%   -0.02%     
==========================================
  Files         164      164              
  Lines       49878    49878              
==========================================
- Hits        45590    45581       -9     
- Misses       4288     4297       +9
Flag Coverage Δ
#multiple 89.19% <85.41%> (ø) ⬆️
#single 39.41% <22.91%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/util/testing.py 100% <ø> (ø) ⬆️
pandas/util/_doctools.py 0% <0%> (ø) ⬆️
pandas/core/internals.py 94.54% <100%> (ø) ⬆️
pandas/io/pytables.py 92.84% <100%> (ø) ⬆️
pandas/core/sparse/frame.py 94.78% <100%> (ø) ⬆️
pandas/core/indexes/base.py 96.42% <100%> (ø) ⬆️
pandas/core/frame.py 97.8% <100%> (-0.1%) ⬇️
pandas/core/indexes/range.py 95.66% <100%> (ø) ⬆️
pandas/io/html.py 85.98% <100%> (ø) ⬆️
pandas/core/indexing.py 92.8% <100%> (ø) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 69472f9...6402974. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Nov 14, 2017

@mroeschke thanks.

can you add a linting check in ci/lint.sh to search for cases like this?

@jreback jreback added the Code Style Code style, linting, code_checks label Nov 14, 2017
@jreback
Copy link
Contributor

jreback commented Nov 14, 2017

does this have any perf impact ?

@mroeschke
Copy link
Member Author

I ran one full asv and two asvs for benchmarks/frame_methods.py and got pretty inconsistent results. I can rerun if needed.

Sure I can include this check in ci/lint.sh. I'll be available to add it later in the week.

@jreback jreback added this to the 0.22.0 milestone Nov 15, 2017
@jreback jreback merged commit 9c799e2 into pandas-dev:master Nov 15, 2017
@jreback
Copy link
Contributor

jreback commented Nov 15, 2017

thanks

Sure I can include this check in ci/lint.sh. I'll be available to add it later in the week.

would be great

@jorisvandenbossche
Copy link
Member

I don't thing you would expect to see any difference in asv. I think this is mainly about code style / redundancy. Certainly given that, apparently (just tried it out, didn't expect this), with small lists it can actually be a bit faster using a list comprehension instead of generator (for bigger ones there is a clear difference though):

In [168]: l = [True, False]*2

In [169]: %timeit any(i for i in l)
511 ns ± 7.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [170]: %timeit any([i for i in l])
448 ns ± 11.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [171]: l = [True, False]*100

In [172]: %timeit any(i for i in l)
514 ns ± 7.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [173]: %timeit any([i for i in l])
5.7 µs ± 39.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

(and the cases in our code typically are small numbers, eg looping of the axes of a DataFrame (thus 2))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants