BUG: groupby().tranforms return ValueError #40102
Labels: Closing Candidate (may be closeable, needs more eyeballs)
Comments
yllgl added the Bug and Needs Triage labels on Feb 27, 2021
yllgl changed the title from "BUG: groupb().tranforms return ValueError" to "BUG: groupby().tranforms return ValueError" on Feb 27, 2021
Hi @yllgl - you probably want df.groupby(['F'], dropna=False)['A'].transform(lambda x: x.fillna(x.mean())). Does that work for you?
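To illustrate what dropna=False changes, here is a minimal sketch. It assumes a small frame like the one used in the reproduction further down this thread (the reporter's own data is not shown), and it only counts groups rather than running the transform.

```python
import numpy as np
import pandas as pd

# Assumed toy frame: every key is "foo" except row 1, whose key is missing.
df = pd.DataFrame({"A": [1, 3, 5, np.nan, 6, 8], "F": "foo"})
df.loc[1, "F"] = np.nan

# Default behaviour (dropna=True): rows with a missing key form no group.
default_groups = df.groupby(["F"]).ngroups

# dropna=False: the missing key is kept as a group of its own.
keepna_groups = df.groupby(["F"], dropna=False).ngroups

print(default_groups, keepna_groups)
```

With the default, the row whose key is NaN belongs to no group at all, which is why dropna=False is suggested when every row should take part in the transform.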
MarcoGorelli added the Closing Candidate label and removed the Bug and Needs Triage labels on Feb 27, 2021
A new error occurs.
In [12]:
df['A']=df.groupby(['F'],dropna=False)['A'].transform(lambda x: x.fillna(x.mean()))
Out[12]:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-120-b63e4d7a1ecf> in <module>
----> 1 df['A']=df.groupby(['F'],dropna=False)['A'].transform(lambda x: x.fillna(x.mean()))
d:\python36\lib\site-packages\pandas\core\groupby\generic.py in transform(self, func, engine, engine_kwargs, *args, **kwargs)
492 if not isinstance(func, str):
493 return self._transform_general(
--> 494 func, *args, engine=engine, engine_kwargs=engine_kwargs, **kwargs
495 )
496
d:\python36\lib\site-packages\pandas\core\groupby\generic.py in _transform_general(self, func, engine, engine_kwargs, *args, **kwargs)
541
542 indexer = self._get_index(name)
--> 543 ser = klass(res, indexer)
544 results.append(ser)
545
d:\python36\lib\site-packages\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
312 if len(index) != len(data):
313 raise ValueError(
--> 314 f"Length of passed values is {len(data)}, "
315 f"index implies {len(index)}."
316 )
ValueError: Length of passed values is 1, index implies 0.
Works for me:
>>> import pandas as pd
>>> import numpy as np
>>> s = pd.Series([1, 3, 5, np.nan, 6, 8])
>>> df = pd.DataFrame({'A': s,'F': 'foo'})
>>> df.loc[1,'F']=np.nan
>>> df.groupby(['F'],dropna=False)['A'].transform(lambda x: x.fillna(x.mean()))
0 1.0
1 3.0
2 5.0
3 5.0
4 6.0
5 8.0
Name: A, dtype: float64
You're on an old version of pandas though, can you try updating?
Indeed, I can reproduce the bug on v1.1.5. Will do a git bisect.
OK, this was fixed in #36842
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Here's the code:
Problem description
If I change column F's dtype to str, everything works as expected.
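The reporter's code is not included above, so the following is only a sketch of a comparable workaround, using the toy frame from the maintainer's reproduction. It assumes that casting to str helps because missing keys become ordinary text and no group key is null any more. Instead of relying on astype(str), it fills missing keys with an explicit sentinel; the name "<missing>" is hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed toy frame (not the reporter's data): row 1 has a missing key.
df = pd.DataFrame({"A": [1, 3, 5, np.nan, 6, 8], "F": "foo"})
df.loc[1, "F"] = np.nan

# Replace missing keys with a hypothetical sentinel so that no key is null.
keys = df["F"].fillna("<missing>")

# Group column A by the cleaned keys and fill each gap with its group mean.
filled = df["A"].groupby(keys).transform(lambda x: x.fillna(x.mean()))

print(filled.tolist())
```

Because no key is null, the default dropna=True setting no longer excludes any row, so this does not depend on dropna=False being handled correctly.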
Output of pd.show_versions()
INSTALLED VERSIONS
commit : b5958ee
python : 3.6.8.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 1.1.5
numpy : 1.17.2
pytz : 2018.9
dateutil : 2.7.5
pip : 19.3.1
setuptools : 41.4.0
Cython : 0.29.3
pytest : None
hypothesis : None
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.4
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.2.0
pandas_datareader: None
bs4 : 4.9.0
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.0.2
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.4
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.11.2
xlrd : None
xlwt : None
numba : 0.42.0