Series.sum has inconsistent return type #9733

Closed
remiremi opened this issue Mar 26, 2015 · 5 comments
Labels: Dtype Conversions (unexpected or buggy dtype conversions), good first issue

Comments

@remiremi

Series.sum returns a numpy type, except when it's empty, in which case it returns a python int of value "0":

In [2]: type(pd.Series([0]).sum())
Out[2]: numpy.int64

In [3]: type(pd.Series().sum())
Out[3]: int

This poses a problem when I do 1 / myserie.sum(), because I expect to obtain np.inf rather than a division-by-zero exception.
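A minimal sketch of why the return type matters here, using plain scalars rather than pandas itself. The variable names are illustrative, and the two zeros simply stand in for the two return types shown above:

```python
import numpy as np

# Stand-ins for the two return types reported above (illustrative only).
zero_numpy = np.int64(0)   # type returned by a non-empty integer Series.sum()
zero_python = 0            # type returned by the empty Series.sum() in 0.16.0

# Dividing by a NumPy zero yields inf (NumPy would normally emit a
# RuntimeWarning; it is silenced here to keep the output clean).
with np.errstate(divide="ignore"):
    numpy_result = 1 / zero_numpy

# Dividing by a Python int zero raises instead.
try:
    1 / zero_python
    python_raised = False
except ZeroDivisionError:
    python_raised = True

print(numpy_result, python_raised)
```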

I think the return type of Series.sum() for an empty series should be inferred from the Series's dtype. This way, Series([], dtype='str').sum() would return an empty string, and Series([]).sum() would return np.float64(0), since an empty series' default dtype seems to be float64.
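A rough sketch of the proposed behaviour for numeric dtypes, written as a user-side workaround. `typed_sum` is a hypothetical helper, not a pandas function and not the code from any patch; it assumes the Series has a NumPy numeric dtype and an explicit dtype is passed for the empty case:

```python
import numpy as np
import pandas as pd

def typed_sum(series):
    """Hypothetical helper: sum a Series and coerce the scalar to its dtype."""
    result = series.sum()
    # Only coerce for NumPy numeric dtypes; anything else is returned as-is.
    if np.issubdtype(series.dtype, np.number):
        return series.dtype.type(result)
    return result

empty_float = pd.Series([], dtype="float64")
empty_int = pd.Series([], dtype="int64")

print(type(typed_sum(empty_float)))
print(type(typed_sum(empty_int)))
```

With this wrapper, 1 / typed_sum(empty_float) goes through NumPy's division and gives inf (with a RuntimeWarning) instead of raising.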

Tested with Pandas 0.16.0

@shoyer
Member

shoyer commented Mar 26, 2015

We don't support numeric methods on strings, but otherwise this seems reasonable to me.

Want to give putting together a patch a try?

@jreback
Contributor

jreback commented Mar 27, 2015

@remiremi yeah I don't think we coerce scalars to anything. This could be made more consistent. We should in general NOT be returning python scalars.

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions API Design Compat pandas objects compatability with Numpy or Python functions labels Mar 27, 2015
@jreback jreback added this to the Next Major Release milestone Mar 27, 2015
remiremi added a commit to remiremi/pandas that referenced this issue Apr 7, 2015
@remiremi
Author

remiremi commented Apr 7, 2015

@shoyer @jreback I created a pull request which adds a fix and a relevant test: #9829
Let me know if I'm missing anything.

@jreback jreback modified the milestones: 0.16.1, Next Major Release Apr 7, 2015
remiremi added a commit to remiremi/pandas that referenced this issue Apr 17, 2015
@jreback jreback modified the milestones: 0.17.0, 0.16.1 Apr 23, 2015
remiremi added a commit to remiremi/pandas that referenced this issue Aug 21, 2015
@jreback jreback modified the milestones: Next Major Release, 0.17.0 Aug 31, 2015
@jorisvandenbossche
Member

This now returns correctly a numpy type:

In [22]: type(pd.Series().sum())
Out[22]: float

So repurpose this issue to add a test to confirm this.
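One possible shape for such a test, as a sketch only. The test name is made up, and it deliberately checks `float` rather than a specific NumPy type: `np.float64` subclasses Python's `float`, so the assertion holds for either return type. An explicit dtype is passed so the result does not depend on the default dtype of an empty Series:

```python
import pandas as pd

def test_empty_series_sum_returns_float_zero():
    # Hypothetical regression test; not taken from the pandas test suite.
    result = pd.Series([], dtype="float64").sum()
    assert result == 0
    # np.float64 is a subclass of float, so this accepts both.
    assert isinstance(result, float)

test_empty_series_sum_returns_float_zero()
```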

@jorisvandenbossche jorisvandenbossche added Difficulty Novice and removed API Design Compat pandas objects compatability with Numpy or Python functions Prio-high labels Nov 14, 2017
@jorisvandenbossche jorisvandenbossche added the Needs Tests Unit test(s) needed to prevent regressions label Jul 6, 2018
@jorisvandenbossche
Member

Closing this as we have an issue to convert the Series dtype to object for empty creation, and #19813 is there for the strange return value.

@jorisvandenbossche jorisvandenbossche modified the milestones: Contributions Welcome, No action Jul 6, 2018
@jorisvandenbossche jorisvandenbossche removed the Needs Tests Unit test(s) needed to prevent regressions label Jul 6, 2018