fix the function find_common_types bug #25320
`types[0]` can raise a `KeyError` when `types` is a `pd.Series`. See issue #25270.
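A minimal sketch of the failure mode described here, using a plain `pd.Series` as a stand-in for `dfs.dtypes` (an assumption: `SparseDataFrame` is not needed to show the lookup behaviour). On a Series with an integer index, `[0]` is a label lookup, so it fails when no label `0` exists.

```python
import numpy as np
import pandas as pd

# Stand-in for `dfs.dtypes`: one dtype per column, labelled by column name.
dtypes = pd.Series([np.dtype("int64")] * 4, index=[1, 2, 3, 4])

try:
    dtypes[0]  # label lookup: there is no column labelled 0
    raised = False
except KeyError:
    raised = True

# With labels 0..3 the same expression happens to succeed.
relabelled = pd.Series([np.dtype("int64")] * 4, index=[0, 1, 2, 3])
first = relabelled[0]
```

Here `raised` ends up `True` for the labels 1..4, while the relabelled Series returns its first dtype.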
Haven't reviewed all of the failures but this doesn't seem right given this is a very generic function. Does the error affect things outside of SparseDataFrame? If not then seems like the issue needs to be addressed directly there
Also please add test(s) - should be the first part to any PR
Ee, how to add test(s)? 😄
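As a rough sketch of what such a test might look like: the block below does not call pandas' own `find_common_type`. Instead, `first_of` is a hypothetical stand-in for the line being changed, so the check can run outside the pandas test suite. In pandas itself the test would presumably live somewhere under `pandas/tests/dtypes/` and call the real function.

```python
import numpy as np
import pandas as pd


def first_of(types):
    """Hypothetical helper: return the first element by position."""
    if isinstance(types, pd.Series):
        return types.iloc[0]  # positional, independent of index labels
    return types[0]


def test_first_of_ignores_index_labels():
    int64 = np.dtype("int64")
    cases = [
        [int64, int64],                                # plain list
        pd.Series([int64, int64], index=[1, 2]),       # integer labels without a 0
        pd.Series([int64, int64], index=["a", "b"]),   # string labels
    ]
    for types in cases:
        assert first_of(types) == int64


test_first_of_ignores_index_labels()
```

The second case is the one that fails with a plain `types[0]`, which is what makes it a useful regression test.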
@@ -1075,7 +1075,7 @@ def find_common_type(types):

     Parameters
     ----------
-    types : list of dtypes
+    types : list_like
list-like
pandas/core/dtypes/cast.py
Outdated
@@ -1090,7 +1090,7 @@ def find_common_type(types):
     if len(types) == 0:
         raise ValueError('no types given')

-    first = types[0]
+    first = types[:1]
If you are changing this, you must have a failing test case. Can you please add it?
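For context on the `types[:1]` change in the diff above: a slice avoids the `KeyError`, but it returns a one-element container rather than the element itself. The sketch below only illustrates that difference. It makes no claim about how the rest of `find_common_type` uses `first`, though any later code expecting a single dtype would receive a list or Series instead.

```python
import numpy as np
import pandas as pd

int64 = np.dtype("int64")

# On a list: indexing gives the dtype, slicing gives a list holding it.
types = [int64, int64]
element = types[0]
one_item = types[:1]

# On a Series without a label 0: slicing works, but yields a Series.
series_types = pd.Series([int64, int64], index=[1, 2])
series_slice = series_types[:1]

# Positional access returns the element itself, whatever the labels are.
series_first = series_types.iloc[0]
```

Positional access with `.iloc[0]` is one way to get the element itself without depending on the labels.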
find_common_types bug
The last modification couldn't pass the tests, so this fixes it; it passes the tests now.
Codecov Report
@@ Coverage Diff @@
## master #25320 +/- ##
==========================================
- Coverage 91.72% 41.72% -50%
==========================================
Files 173 173
Lines 52831 52831
==========================================
- Hits 48457 22042 -26415
- Misses 4374 30789 +26415
Continue to review full report at Codecov.
In issue #25270, @rasbt raised this question. He thinks there is a problem with the pandas SparseDataFrame method `to_coo`. Yes, he is right. I then tried to track down the cause.

# example from @rasbt
import pandas as pd
import numpy as np
ary = np.array([[1, 0, 0, 3],
                [1, 0, 2, 0],
                [0, 4, 0, 0]])
df = pd.DataFrame(ary)
df.columns = [1, 2, 3, 4]
dfs = pd.SparseDataFrame(df,
                         default_fill_value=0)
# DOES NOT WORK:
dfs.to_coo()  # raises KeyError: 0

Now, if we check:

In [12]: dfs.dtypes
Out[12]:
1 int64
2 int64
3 int64
4 int64
dtype: object
In [13]: type(dfs.dtypes)
Out[13]: pandas.core.series.Series

As we see, `dfs.dtypes` is a `pd.Series`.

# pandas/core/dtypes/cast.py, in find_common_type(types), at about line 1093
def find_common_type(types):
    """
    Find a common data type among the given dtypes.

    Parameters
    ----------
    types : list of dtypes

    Returns
    -------
    pandas extension or numpy dtype

    See Also
    --------
    numpy.find_common_type
    """
    if len(types) == 0:
        raise ValueError('no types given')

    first = types[0]  # a list is fine, but a pd.Series may raise an error

We check this statement:

In [20]: dfs.dtypes[0]
---------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-20-4d14dd9f5c73> in <module>()
----> 1 dfs.dtypes[0]
~/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in __getitem__(self, key)
765 key = com._apply_if_callable(key, self)
766 try:
--> 767 result = self.index.get_value(self, key)
768
769 if not is_scalar(result):
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
3116 try:
3117 return self._engine.get_value(s, k,
-> 3118 tz=getattr(series.dtype, 'tz', None))
3119 except KeyError as e1:
3120 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0

Yes, it raises a `KeyError`.

In @rasbt's example there are two cases that do work:

# WORKS (1)
dfs2 = dfs.copy()
dfs2.columns = [0, 1, 2, 3]
dfs2.to_coo()
# WORKS (2)
dfs3 = dfs.copy()
dfs3.columns = [str(i) for i in dfs3.columns]
dfs3.to_coo()

In fact:

In [10]: dfs.dtypes.index
Out[10]: Int64Index([1, 2, 3, 4], dtype='int64')
In [11]: dfs2.dtypes.index
Out[11]: Int64Index([0, 1, 2, 3], dtype='int64')
In [12]: dfs3.dtypes.index
Out[12]: Index(['1', '2', '3', '4'], dtype='object')

Using ... So ... Of course, ... But after committing, some checks were not successful; the newest update passes the tests.
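One way to sidestep the label lookup entirely is to normalise `types` to a plain list before indexing. Iterating over a Series yields its values in order, whatever the index labels are. The sketch below is an illustration only: `common_numpy_dtype` is a hypothetical helper, not the pandas implementation, and it handles NumPy dtypes only (no pandas extension dtypes), relying on `np.result_type`.

```python
import numpy as np
import pandas as pd


def common_numpy_dtype(types):
    """Hypothetical helper: common NumPy dtype of a list-like of dtypes."""
    types = list(types)  # iterating a Series yields values, not labels
    if len(types) == 0:
        raise ValueError("no types given")
    return np.result_type(*types)


# Column labels 1 and 2: there is no column labelled 0.
df = pd.DataFrame({1: [1, 0], 2: [0.5, 2.0]})
common = common_numpy_dtype(df.dtypes)
```

With one `int64` and one `float64` column, `common` is `float64`. The same conversion could also be done at the call site, which would leave the generic function untouched.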
Closing as stale. If you want to keep working, merge master and ping.
`types[0]` can raise a `KeyError` when `types` is a `pd.Series`. See issue #25270.
git diff upstream/master -u -- "*.py" | flake8 --diff