-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: .iloc and .loc behavior not consistent on empty dataframe #9983
Conversation
1089142
to
06dd4d8
Compare
What other changes is this making? (there are a lot of edge cases to empty frames!) For instance the tests in test_groupby looked correct before... |
The problem is that due to the fact that
Where on PR this assert fails. So a couple of tests that compare empty dataframes with different dtypes need to be patched. |
@@ -324,12 +324,12 @@ def test_frame_to_json_except(self): | |||
def test_frame_empty(self): | |||
df = DataFrame(columns=['jim', 'joe']) | |||
self.assertFalse(df._is_mixed_type) | |||
assert_frame_equal(read_json(df.to_json()), df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this seems bad that this test passes at all. df.to_json() == {}
here!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe df.to_json() == '{"jim":{},"joe":{}}'
. It is checking that the number / names of columns is correct.
Please test the corrected behaviour of assert_frame_equal in test_testing. IIUC This means previously fleshed out behaviour for empty edge case dtypes has been broken for some time? |
OK, added test to Yes, I think assert_frame_equal was broken for empty dataframes. It would still compare number/names of columns, but not their dtypes. |
@@ -4422,6 +4436,14 @@ def test_indexing_assignment_dict_already_exists(self): | |||
expected.loc[5] = [9, 99] | |||
tm.assert_frame_equal(df, expected) | |||
|
|||
def test_indexing_dtypes_on_empty(self): | |||
df = DataFrame({'a':[1,2,3],'b':['b','b2','b3']}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a comment with the issue number
6fe6eae
to
0245bfd
Compare
@jreback Made changes, except for moving test out of test_testing --- are you sure about that? |
df1["col1"] = df1["col1"].astype('int64') | ||
df2=pd.DataFrame(columns=["col1","col2"]) | ||
self._assert_equal(df1, df2, check_dtype=False) | ||
self._assert_not_equal(df1, df2, check_dtype=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok then. I would actually use assert_frame_equal
then, otherwise you maybe subtely testing something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's in class TestAssertFrameEqual
--- all the tests are clearly for testing assert_frame_equal
.
@jreback OK, fixed |
@artemyk ok looks good. ping on green (if you want to rebase on master, prob a release note conflict, ok, otherwise will do it on merging). |
Tests Fix Test reorder Doc update Tests fix Tests fix SQL tests fix Testing update Fixes Testing fix Test fix
@jreback Ready to go. |
BUG: .iloc and .loc behavior not consistent on empty dataframe
@artemyk thanks! proof that sometimes trivial looking things (in indexing), actually can have lots of cases / need lots of testing |
Fixes #9964 .
Notice that
assert_frame_equal
now fails for empty dataframes with different dtypes (as, I think, it should). However, this means some tests need to be patched now.