Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Restore original convert_objects and add _convert #11173

Merged
merged 3 commits into from
Oct 2, 2015

Conversation

bashtage
Copy link
Contributor

Restores the v0.16 behavior of convert_objects and moves the new
version of _convert
Adds to_numeric for directly converting numeric data

closes #11116
closes #11133

@bashtage
Copy link
Contributor Author

Still a few things to do:

  • Update release notes to remove new docs and add deprecation note
  • Make sure copy is passed to _possibly_convert_objects
  • Add note about bugs fixed in convert_objects
  • Restore docs for convert_objects
  • Add tests for to_numeric
  • Add docs for to_numeric
  • Move converters to a new file, convert.py

@jreback jreback added the Dtype Conversions Unexpected or buggy dtype conversions label Sep 22, 2015
@jreback jreback added this to the 0.17.0 milestone Sep 22, 2015
timedelta=True,
coerce=False,
# TODO: Remove in 0.18 or 2017, which ever is sooner
def _possibly_convert_objects(values, convert_dates=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we put _possibly_convert_objects and _soft_convert_objects in pandas/tools/convert.py (new file)
common.py is just so huge. We can eventually move more stuff their as well. It will have to import common for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this after as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just note that there is some value in keeping this function intact in core._common. Xray was using this private function internally, with a broken guard against it changing (see pydata/xarray#569 for details). Obviously, that was our fault and we've fixed it now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shoyer I don't think you should be using this internal function (I saw you removed it). and instead use the public .convert_objects

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough -- but there's not any other way to hook directly into pandas' object array of datetimes -> datetime64 conversion. Might be worth thinking about making such an API at some point.

On Wed, Sep 23, 2015 at 5:58 PM, Jeff Reback notifications@github.com
wrote:

@@ -1857,23 +1857,84 @@ def _maybe_box_datetimelike(value):

_values_from_object = lib.values_from_object

-def _possibly_convert_objects(values,

  •                          datetime=True,
    
  •                          numeric=True,
    
  •                          timedelta=True,
    
  •                          coerce=False,
    

    +# TODO: Remove in 0.18 or 2017, which ever is sooner
    +def _possibly_convert_objects(values, convert_dates=True,
    @shoyer I don't think you should be using this internal function (I saw you removed it). and instead use the public .convert_objects

    Reply to this email directly or view it on GitHub:
    https://github.com/pydata/pandas/pull/11173/files#r40276398

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean like this?

In [20]: pd.to_datetime([datetime.datetime(2012,1,1,0,0),datetime.datetime(2013,1,1,1,1)],box=False)       
Out[20]: 
array(['2011-12-31T19:00:00.000000000-0500',
       '2012-12-31T20:01:00.000000000-0500'], dtype='datetime64[ns]')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite -- I only want soft conversion (like the pd.Series constructor). I don't want to convert text that matches a regex, only an array of all datetime instances (to datetime64) or all timedelta instances (to timedelta64).

@bashtage
Copy link
Contributor Author

The important changes are:

convert in ObjectManager: https://github.com/pydata/pandas/pull/11173/files#diff-e705e723b2d6e7c0e2a0443f80916abfR1521

This change requires this to 'route' to either the old or new style converter.

to_numeric: https://github.com/pydata/pandas/pull/11173/files#diff-40d1f19921aa9f2932951124fad07cc6R53

New function.

@jorisvandenbossche
Copy link
Member

@bashtage Really thanks a lot for this!

(I am still not fully convinced the deprecation is needed, but I will try to summarize my reasons tomorrow)

@jreback jreback changed the title ENH: Resotre original convert_objects and add _convert ENH: Restore original convert_objects and add _convert Sep 24, 2015
@jreback
Copy link
Contributor

jreback commented Sep 25, 2015

@bashtage

  • pls move the 2 converting functions either to a new module, core/convert.py.
  • let's finish up the documentation changes.

pls let me know asap.

@bashtage
Copy link
Contributor Author

Still no tests for to_numeric or docs for this method.

@jreback
Copy link
Contributor

jreback commented Sep 25, 2015

looks good! ping when tests for pd.to_numeric

@jreback
Copy link
Contributor

jreback commented Sep 27, 2015

@jorisvandenbossche ok with this

@jreback jreback mentioned this pull request Sep 27, 2015
2 tasks
@jorisvandenbossche
Copy link
Member

to_numeric needs some fixes (and tests): lib is not yet imported, lib.maybe_convert_numeric expects an array so it bugs on a series as arg, and the final result needs to be boxed into a series again if the arg was a series.

@bashtage
Copy link
Contributor Author

Yes - it definitely needs work. It expects something with the buffer
interface, so I need to ensure that this works as expected.

On Sun, Sep 27, 2015 at 8:12 PM Joris Van den Bossche <
notifications@github.com> wrote:

to_numeric needs some fixes (and tests): lib is not yet imported,
lib.maybe_convert_numeric expects an array so it bugs on a series as arg,
and the final result needs to be boxed into a series again if the arg was a
series.


Reply to this email directly or view it on GitHub
#11173 (comment).

@jreback
Copy link
Contributor

jreback commented Sep 30, 2015

@bashtage did you have a chance to add some tests here?

@bashtage
Copy link
Contributor Author

Should be ready when green I think

@bashtage
Copy link
Contributor Author

Can squash commits - wanted to make it easier to see changes at each stage since close to release

@bashtage
Copy link
Contributor Author

bashtage commented Oct 1, 2015

@jreback Green - let me know if I missed anything. Only tick left is to add some docs although this can probably wait since convert_objects is still available.

@@ -48,3 +48,66 @@ def compose(*funcs):
"""Compose 2 or more callables"""
assert len(funcs) > 1, 'At least 2 callables must be passed to compose'
return reduce(_compose2, funcs)


def to_numeric(arg, errors='raise', box=True, coerce=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'coerce' shouldn't be here?

gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 4, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 4, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 4, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 6, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 7, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 9, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 9, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 11, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 12, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 13, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 14, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 15, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 16, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 17, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 21, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 21, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 23, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 26, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 28, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 29, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 4, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 4, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 5, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 7, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 7, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 8, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 8, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 10, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jul 12, 2017
Deprecated since 0.17.0

xref pandas-devgh-11173

[ci skip]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0.17 regression: convert_objects coercion with multiple dtypes
4 participants