BUG: different apply function behavior when columns with type Timestamp present #17602

Closed
xvwei1989 opened this issue Sep 20, 2017 · 1 comment · Fixed by #18577
Labels: Apply, Duplicate Report, Reshaping

@xvwei1989

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame([[1, 2], [1, 2]], columns=['a', 'b'])
# Row-wise apply (axis=1) with a function that returns a dict
print(df.apply(lambda x: {'s': x['a'] + x['b']}, axis=1))
################
# (AS EXPECTED)
# output: 
# 0    {u's': 3}
# 1    {u's': 3}
# dtype: object
################

# add one new column with type Timestamp
df['tm'] = [pd.Timestamp('2017-05-01 00:00:00'), pd.Timestamp('2017-05-02 00:00:00')]
print(df.apply(lambda x: {'s': x['a'] + x['b']}, axis=1))

################
#(WRONG OUTPUT)
# output: 
#       a  b   tm
# 0   NaN NaN NaN
# 1   NaN NaN NaN
################

Problem description

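When the function passed to apply returns a dict, adding a column of type Timestamp to the DataFrame changes the result unexpectedly, even though the applied function itself is unchanged: instead of a Series of dicts, the output is a DataFrame with the original columns filled with NaN.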

Output of pd.show_versions()

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: zh_CN.UTF-8
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.0.1
Cython: 0.25.2
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.4.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.6
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.14
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

@jreback
Contributor

jreback commented Sep 20, 2017

duplicate of #16353 and #15628

.apply infers the output dimension based on what you are returning, which looks exactly like a Series. This is not idiomatic pandas, not to mention non-performant.
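To illustrate the point about idiomatic usage, here is a minimal sketch of alternatives that avoid returning a dict from apply. It reuses the reporter's frame; the names `total`, `expanded`, and `records` are illustrative only and do not come from this thread. It assumes a reasonably recent pandas, and behavior on 0.20.3 has not been checked here.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 2]], columns=["a", "b"])
df["tm"] = [pd.Timestamp("2017-05-01"), pd.Timestamp("2017-05-02")]

# Idiomatic and fast: plain column arithmetic, no per-row Python call.
total = df["a"] + df["b"]

# If apply is required, return a Series so the output shape is explicit:
# each row's Series becomes one row of the resulting DataFrame.
expanded = df.apply(lambda row: pd.Series({"s": row["a"] + row["b"]}), axis=1)

# If per-row dicts are really the goal, build them from the vectorized result.
records = [{"s": value} for value in total.tolist()]
```

The vectorized form sidesteps the shape inference entirely, which is why it does not depend on what other columns the frame holds.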

jreback closed this as completed Sep 20, 2017
jreback added the Duplicate Report, Apply, and Reshaping labels Sep 20, 2017
jreback added this to the No action milestone Sep 20, 2017
jreback modified the milestones: No action, 0.22.0 Nov 30, 2017
jorisvandenbossche pushed a commit that referenced this issue Feb 7, 2018