BUG: regression in DataFrame.combine_first with integer columns (GH14687) #14886

Merged
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.19.2.txt
@@ -78,7 +78,7 @@ Bug Fixes
- Bug in clipboard functions on linux with python2 with unicode and separators (:issue:`13747`)
- Bug in clipboard functions on Windows 10 and python 3 (:issue:`14362`, :issue:`12807`)
- Bug in ``.to_clipboard()`` and Excel compat (:issue:`12529`)

- Bug in ``DataFrame.combine_first()`` for integer columns (:issue:`14687`).

- Bug in ``pd.read_csv()`` in which the ``dtype`` parameter was not being respected for empty data (:issue:`14712`)
- Bug in ``pd.read_csv()`` in which the ``nrows`` parameter was not being respected for large input when using the C engine for parsing (:issue:`7626`)
6 changes: 2 additions & 4 deletions pandas/core/frame.py
@@ -3665,10 +3665,8 @@ def combine(self, other, func, fill_value=None, overwrite=True):
                 otherSeries[other_mask] = fill_value

             # if we have different dtypes, possibily promote
-            if notnull(series).all():
-                new_dtype = this_dtype
-                otherSeries = otherSeries.astype(new_dtype)
-            else:
+            new_dtype = this_dtype
+            if not is_dtype_equal(this_dtype, other_dtype):
                 new_dtype = _find_common_type([this_dtype, other_dtype])
                 if not is_dtype_equal(this_dtype, new_dtype):
                     series = series.astype(new_dtype)
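
The revised branch only looks for a common dtype when the two column dtypes differ, and otherwise keeps `this_dtype` unchanged. A minimal sketch of that decision follows, assuming NumPy's public `np.promote_types` as a stand-in for the internal `_find_common_type` helper (the two are not guaranteed to agree for every pandas dtype); `promote_pair` is a hypothetical name used only for illustration.

```python
import numpy as np


def promote_pair(this_dtype, other_dtype):
    """Return the dtype both columns would be cast to (illustrative only)."""
    this_dtype = np.dtype(this_dtype)
    other_dtype = np.dtype(other_dtype)

    # Default: keep the dtype of the calling frame's column.
    new_dtype = this_dtype
    if this_dtype != other_dtype:
        # Assumption: np.promote_types approximates _find_common_type
        # for plain NumPy dtypes.
        new_dtype = np.promote_types(this_dtype, other_dtype)
    return new_dtype


# Equal integer dtypes need no promotion; mixed int/float widens to float.
print(promote_pair("int64", "int64"))
print(promote_pair("int64", "float64"))
```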
10 changes: 10 additions & 0 deletions pandas/tests/frame/test_combine_concat.py
@@ -725,3 +725,13 @@ def test_combine_first_period(self):
         exp = pd.DataFrame({'P': exp_dts}, index=[1, 2, 3, 4, 5, 7])
         tm.assert_frame_equal(res, exp)
         self.assertEqual(res['P'].dtype, 'object')
+
+    def test_combine_first_int(self):
+        # GH14687 - integer series that do no align exactly
+
+        df1 = pd.DataFrame({'a': [0, 1, 3, 5]}, dtype='int64')
+        df2 = pd.DataFrame({'a': [1, 4]}, dtype='int64')
+
+        res = df1.combine_first(df2)
+        tm.assert_frame_equal(res, df1)
+        self.assertEqual(res['a'].dtype, 'int64')
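
For trying the fix outside the test suite, the new test can be run as a standalone script. The sketch below assumes a pandas release that includes this change. The second case (a float column with a missing value, using the illustrative names `df3` and `filled`) is not part of the PR and is added only to contrast a situation where a value from `other` is actually needed.

```python
import numpy as np
import pandas as pd

# Case from the new test: both frames are int64 and df1 has no holes,
# so the result should equal df1 and stay int64.
df1 = pd.DataFrame({"a": [0, 1, 3, 5]}, dtype="int64")
df2 = pd.DataFrame({"a": [1, 4]}, dtype="int64")
res = df1.combine_first(df2)
print(res["a"].tolist(), res["a"].dtype)

# Contrast case (not in the PR): df3 has a missing value at index 1,
# so that slot is filled from df2 and the column stays float64.
df3 = pd.DataFrame({"a": [0.0, np.nan]})
filled = df3.combine_first(df2)
print(filled["a"].tolist(), filled["a"].dtype)
```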