Adding element to empty object Series changes dtype #19576

Closed
toobaz opened this issue Feb 7, 2018 · 5 comments
Labels
Dtype Conversions: Unexpected or buggy dtype conversions
Indexing: Related to indexing on series/frames, not to indexes themselves

Comments

Member

toobaz commented Feb 7, 2018

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: s = pd.Series(dtype='object')

In [3]: s.loc[0] = 1

In [4]: s
Out[4]: 
0    1
dtype: int64

In [5]: s = pd.Series(dtype='object')

In [6]: s.loc[0] = 1.

In [7]: s
Out[7]: 
0    1.0
dtype: float64

Compare with:

In [8]: s = pd.Series({1 : 2}, dtype='object')

In [9]: s.loc[0] = 1

In [10]: s
Out[10]: 
1    2
0    1
dtype: object

In [11]: s = pd.Series({1 : 2.}, dtype='object')

In [12]: s.loc[0] = 1.

In [13]: s
Out[13]: 
1    2
0    1
dtype: object

Problem description

The dtype of a Series should change, on addition of new elements, only when it has to; as a corollary, it should never change when it is object.

Vaguely related to #17261, maybe to #8902.

Expected Output

Always object dtype.
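
The comparison in the code sample can be condensed into a small helper for checking a given pandas installation. This is only a sketch: `dtype_after_first_insert` is a hypothetical name, not a pandas function, and the result for the empty case depends on the pandas version in use (the session above shows int64 and float64 on 0.23.0.dev0).

```python
import pandas as pd


def dtype_after_first_insert(value, initial=None):
    """Return the dtype name of an object Series after one .loc insertion.

    `initial` is an optional dict of starting data; None gives an empty Series.
    (Hypothetical helper, written for illustration only.)
    """
    s = pd.Series(initial, dtype="object")
    s.loc[0] = value  # label 0 is new, so this enlarges the Series
    return str(s.dtype)


# Empty Series: the result depends on the pandas version in use.
print("empty + int:      ", dtype_after_first_insert(1))
print("empty + float:    ", dtype_after_first_insert(1.0))
# Non-empty object Series, as in the comparison above.
print("non-empty + int:  ", dtype_after_first_insert(1, {1: 2}))
print("non-empty + float:", dtype_after_first_insert(1.0, {1: 2.0}))
```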

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-5-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.23.0.dev0+229.gd5a7e7c94
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.7.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

Contributor

jreback commented Feb 7, 2018

see #6485
and #6942

this is actually a feature

Member Author

toobaz commented Feb 7, 2018

I don't think it is related: #6485 is an unwanted upcasting (while I'm seeing an unwanted downcasting), and the example in #6942 is incomplete but seems to be specific to datetimelike data.

Most importantly, this one affects empty Series only. Are you referring to this when you say it is a feature? (It might be, but I think it goes against the principle of least surprise, because in the examples above I explicitly passed dtype=object, which is not being respected.)

Contributor

jreback commented Feb 7, 2018

Yes, the point is that a common use case is to create an empty object and iteratively fill it. Not a great pattern, but one which is supported. Inference on first expansion is ingrained.
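
For readers who would rather not rely on inference while filling iteratively, one alternative is to collect the values in a plain Python dict and build the Series once, passing the dtype explicitly. This is a minimal sketch and not something proposed in this thread; the `collected` name and the sample values are illustrative assumptions.

```python
import pandas as pd

# Accumulate values in a plain dict instead of growing an empty Series.
collected = {}
for label, value in [(0, 1), (1, 2)]:
    collected[label] = value

# Build the Series once; the explicit dtype is applied at construction,
# so no inference on first expansion takes place.
s = pd.Series(collected, dtype="object")
print(s.dtype)
```

Building once also avoids reallocating the underlying data on every insertion, which growing a Series element by element does.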

jreback closed this as completed Feb 10, 2018
jreback added the Indexing and Dtype Conversions labels Feb 10, 2018
jreback added this to the No action milestone Feb 10, 2018
Member Author

toobaz commented Feb 10, 2018

> is to create an empty object and iteratively fill it

Ideally, we would then have the possibility of dtype-less empty objects. So that if you do pass a dtype, it's not overridden.
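
Until something like that exists, user code can restore the requested dtype after the insertion. The following is a hedged sketch under that assumption: `set_keeping_dtype` is a hypothetical helper, not part of pandas, and it simply casts back with `astype` when the dtype has changed.

```python
import pandas as pd


def set_keeping_dtype(s, label, value):
    """Set s.loc[label] = value, then cast back to the dtype s had before.

    Hypothetical helper for illustration; it works on a copy and returns it.
    """
    original_dtype = s.dtype
    s = s.copy()
    s.loc[label] = value
    # If pandas inferred a new dtype during enlargement, restore the original.
    if s.dtype != original_dtype:
        s = s.astype(original_dtype)
    return s


s = set_keeping_dtype(pd.Series(dtype="object"), 0, 1)
print(s.dtype)
```

Casting back is always possible for object dtype. For other dtypes the `astype` call can raise if the new value does not fit, so the helper is only a safe default for the object case discussed here.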

Contributor

jreback commented Feb 10, 2018

I agree; in theory we should have an Any type which does exactly this.
