Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: add tests for keeping dtype in Series.update #23604

Merged
merged 19 commits into from
Nov 20, 2018

Conversation

h-vetinari
Copy link
Contributor

@pep8speaks
Copy link

pep8speaks commented Nov 9, 2018

Hello @h-vetinari! Thanks for updating the PR.

Comment last updated on November 18, 2018 at 16:17 Hours UTC

@h-vetinari h-vetinari mentioned this pull request Nov 9, 2018
4 tasks
@gfyoung gfyoung added Testing pandas testing functions or related to the test suite Dtype Conversions Unexpected or buggy dtype conversions labels Nov 11, 2018
@gfyoung gfyoung requested a review from jreback November 11, 2018 01:02
@codecov
Copy link

codecov bot commented Nov 11, 2018

Codecov Report

Merging #23604 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23604   +/-   ##
=======================================
  Coverage   92.28%   92.28%           
=======================================
  Files         161      161           
  Lines       51451    51451           
=======================================
  Hits        47483    47483           
  Misses       3968     3968
Flag Coverage Δ
#multiple 90.68% <ø> (ø) ⬆️
#single 42.28% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update deb7b4d...0f50a25. Read the comment docs.

@h-vetinari
Copy link
Contributor Author

@jreback
Incorporated your feedback, and it's green.

@h-vetinari
Copy link
Contributor Author

h-vetinari commented Nov 12, 2018

@jreback
Incorporated feedback; plus, it's green and ready to go.

@h-vetinari
Copy link
Contributor Author

@jreback
Please don't forget this one. :)

Copy link
Contributor Author

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

pandas/tests/series/test_combine_concat.py Outdated Show resolved Hide resolved
pandas/tests/series/test_combine_concat.py Outdated Show resolved Hide resolved

expected_values = [caller_values[0], other_values[0], caller_values[2]]
try:
# we keep original dtype whenever possible
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls don't use a try except inside the test itself. you should explicity use pytest.raises if you want to catch an exception

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback
I know, but this is fully intended (I've made the same argument to @gfyoung in a conversation he marked "resolved" below).

The thing is that Series.update will maintain the dtype of the caller as much as it can. This means, when calling from an int Series, floats get cast to int, if possible. Only if that doesn't work, is the dtype changed to whatever the updated values require.

If I were to pass expected results through the parametrization, I wouldn't be able to parametrize separately, and would therefore need 12 expected results. This is neither easy to read nor extensible.

Please have another look at this test, I think it's quite clear that the try-catch is not about skipping, but about different expected values (which I explained with the comment).

Copy link
Contributor Author

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL


expected_values = [caller_values[0], other_values[0], caller_values[2]]
try:
# we keep original dtype whenever possible
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback
I know, but this is fully intended (I've made the same argument to @gfyoung in a conversation he marked "resolved" below).

The thing is that Series.update will maintain the dtype of the caller as much as it can. This means, when calling from an int Series, floats get cast to int, if possible. Only if that doesn't work, is the dtype changed to whatever the updated values require.

If I were to pass expected results through the parametrization, I wouldn't be able to parametrize separately, and would therefore need 12 expected results. This is neither easy to read nor extensible.

Please have another look at this test, I think it's quite clear that the try-catch is not about skipping, but about different expected values (which I explained with the comment).

@h-vetinari
Copy link
Contributor Author

@jreback
I tried to circumvent the try-except again, but it's pretty pointless (not even .astype(errors='ignore') helps).

To demonstrate the difference, I've added an explicitly parametrized test with the expected results, and without using a try-catch. You'll see that, in this format, it's actually much much harder to see the underlying logic.

And it's just not (reasonably) extensible. I found some minor differences between dtypes, and can easily extend the test with the try-catch (1-line-diff commit coming momentarily). In the demo for the case you want, this would multiply the number of expected results I have to explicitly pass, without a reader being able to grok what's actually happening.

([(61,), (63,)], 'int64', pd.Series([10, (61,), 12])),
([(61,), (63,)], float, pd.Series([10., (61,), 12.])),
([(61,), (63,)], object, pd.Series([10, (61,), 12]))
])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback, this PR is not meant to add the same test twice in different ways, I'm just showing what your review requirement (to avoid try-except) would mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a MUCH better test. It is very explicit. remove the other, rename and ping.


assert_series_equal(s, expected)

@pytest.mark.parametrize('other_values, caller_dtype, expected', [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call this other, dtype, expected

@h-vetinari
Copy link
Contributor Author

remove the other, rename and ping.

Ping, as requested.

@jreback jreback added this to the 0.24.0 milestone Nov 20, 2018
@jreback jreback merged commit 295c278 into pandas-dev:master Nov 20, 2018
@jreback
Copy link
Contributor

jreback commented Nov 20, 2018

thanks!

thoo added a commit to thoo/pandas that referenced this pull request Nov 20, 2018
…fixed

* upstream/master:
  DOC: more consistent flake8-commands in contributing.rst (pandas-dev#23724)
  DOC: Fixed the doctsring for _set_axis_name (GH 22895) (pandas-dev#22969)
  DOC: Improve GL03 message re: blank lines at end of docstrings. (pandas-dev#23649)
  TST: add tests for keeping dtype in Series.update (pandas-dev#23604)
  TST: For GH4861, Period and datetime in multiindex (pandas-dev#23776)
  TST: move .str-test to strings.py & parametrize it; precursor to pandas-dev#23582 (pandas-dev#23777)
  STY: isort tests/scalar, tests/tslibs, import libwindow instead of _window (pandas-dev#23787)
  BUG: fixed .str.contains(..., na=False) for categorical series (pandas-dev#22170)
  BUG: Maintain column order with groupby.nth (pandas-dev#22811)
  API/DEPR: replace kwarg "pat" with "sep" in str.[r]partition (pandas-dev#23767)
  CLN: Finish isort core (pandas-dev#23765)
  TST: Mark test_pct_max_many_rows with pytest.mark.single (pandas-dev#23799)
@h-vetinari h-vetinari deleted the tst_series_update_dtype branch November 20, 2018 06:53
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
@h-vetinari h-vetinari mentioned this pull request Sep 22, 2019
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants