DEPR: Changing default of str.extract(expand=False) to str.extract(expand=True) #19118

Merged (2 commits) on Feb 5, 2018

datapythonista
Member

…o str.extract(expand=True) (#6581)

@jreback added the Deprecate and Strings labels on Jan 7, 2018
@@ -314,6 +314,7 @@ Removal of prior version deprecations/changes
- The ``Panel4D`` and ``PanelND`` classes have been removed (:issue:`13776`)
- The ``Panel`` class has dropped the ``to_long`` and ``toLong`` methods (:issue:`19077`)
- The options ``display.line_width`` and ``display.height`` are removed in favor of ``display.width`` and ``display.max_rows`` respectively (:issue:`4391`, :issue:`19107`)
- The ``expand`` parameter of the :func:`str.extract` method is ``True`` by default (it was ``False``)
Contributor

Say that this has changed; the default was previously None. Add the issue number and this PR number. Move this to a sub-section (of API breaking changes) where you show an example.

@datapythonista
Member Author

I changed the section in the whatsnew, and updated the text and the issue number, but I left the default parameter as it was.

In the signature, expand had a default value of None, but in practice [1] it was False, as the docstring also said [2]. I'll change the message if you disagree, but I think it makes more sense to say that the default was False before.

  1. https://github.com/pandas-dev/pandas/blob/master/pandas/core/strings.py#L687
  2. https://github.com/pandas-dev/pandas/blob/master/pandas/core/strings.py#L613
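
To illustrate the point about a `None` default in the signature behaving as `False` in practice, here is a minimal standard-library sketch. `extract_first_group` is a hypothetical helper written for this note, not the pandas implementation, and the warning it emits is an assumption of the sketch.

```python
import re
import warnings


def extract_first_group(values, pattern, expand=None):
    """Hypothetical stand-in for illustration; not the pandas implementation."""
    if expand is None:
        # Assumption of this sketch: warn, then fall back to the old default.
        warnings.warn(
            "expand=None is treated as expand=False",
            FutureWarning,
            stacklevel=2,
        )
        expand = False
    matches = (re.search(pattern, value) for value in values)
    column = [m.group(1) if m else None for m in matches]
    # expand=True: table-like mapping of columns; expand=False: flat list.
    return {0: column} if expand else column


flat = extract_first_group(["number 10", "12 eggs"], r".*(\d\d).*", expand=False)
table = extract_first_group(["number 10", "12 eggs"], r".*(\d\d).*", expand=True)
print(flat)
print(table)
```

With this pattern, the documented default (`None`) and the effective default (`False`) differ, which is why the two descriptions of the old behavior in this thread are both defensible.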

@@ -208,6 +208,8 @@ Other Enhancements
Backwards incompatible API changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The default value of the ``expand`` parameter of :func:`str.extract` method changed from ``False`` to ``True`` (:issue:`11386`)
Contributor

Well, it changed from None, which was previously evaluated as False. So None is no longer accepted.

Contributor

This is actually a fairly big change. Show an example and how to restore the previous behavior.
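
A minimal sketch of the kind of example being asked for, assuming pandas 0.23.0 or later is installed. The sample strings in `s` are illustrative and the variable names are chosen for this note.

```python
import pandas as pd

# Illustrative data; any strings containing two adjacent digits would do.
s = pd.Series(["number 10", "12 eggs"])

# New default (expand=True): a single group still yields a DataFrame.
extracted = s.str.extract(r".*(\d\d).*")

# Restoring the previous behavior: pass expand=False explicitly.
restored = s.str.extract(r".*(\d\d).*", expand=False)

# With more than one group, expand=False still returns a DataFrame.
two_groups = s.str.extract(r"(\d)(\d)", expand=False)

print(type(extracted).__name__)
print(type(restored).__name__)
print(type(two_groups).__name__)
```

Code that relied on getting a ``Series`` back only needs the explicit `expand=False` to keep working.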

@datapythonista
Member Author

Thank you for the comments @jreback, I understand it now. If you want to take a look, I think it reads much better with the examples. Let me know if any other change is needed.

Contributor

@TomAugspurger left a comment


By default, extracting matching patterns from strings with :func:`str.extract` used to return a
``Series`` if a single group was being extracted (a ``DataFrame`` if more than one group was
extracted). Since Pandas 0.23.0 :func:`str.extract` always returns a ``DataFrame``, unless
Contributor

"Since" -> "As of"

``expand`` is set to ``False`` (:issue:`11386`).

Also, ``None`` was an accepted value for the ``expand`` parameter (which was equivalent to
``False``), but since 0.23.0 it raises a ``ValueError``.
Contributor

" but now raises a ``ValueError``.

@datapythonista
Member Author

Thanks for the comments @TomAugspurger. I made the changes. In text.rst, there was no example with expand unspecified or None, so the text was already correct. I added a quick note about the default value in the warning.

Please let me know if that looks good, or you've got a better idea.
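
For reference, the `None` handling described in the whatsnew text can be checked with a short sketch, assuming pandas 0.23.0 or later. The sample data is illustrative, and the exact error message is deliberately not asserted here.

```python
import pandas as pd

s = pd.Series(["number 10", "12 eggs"])  # illustrative data

# expand=None used to mean expand=False; it is now rejected outright.
try:
    s.str.extract(r".*(\d\d).*", expand=None)
except ValueError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```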

@datapythonista
Member Author

All the comments from the reviews were addressed, right? Just double checking that the PR doesn't need anything from my side.

@jorisvandenbossche changed the title from "DEPR: Changing default value of parameter str.extract(expand=False) t…" to "DEPR: Changing default of str.extract(expand=False) to str.extract(expand=True)" on Feb 4, 2018
Member

@jorisvandenbossche left a comment

Just added some minor comments on the whatsnew notice

Out [4]:
pandas.core.series.Series

In [5]: s.str.extract('.*(\d\d).*', expand=None)
Member

I don't think it is needed to show this one (it just draws attention away from the more important example IMO)


New Behavior:

.. code-block:: ipython
Member

Can you make this a .. ipython:: python directive (and remove the output below)?

Member Author

So I should use the .. ipython:: python directive in all 3 cases, right?

I'm not sure what you mean by "remove the output below", sorry. Can you clarify, please?

Member

To clarify: you can use it for code that can run now, so for the new behaviour. For showing the "old behaviour" you still need to use the literal code-block (there you don't need to change anything).

And by removing output I mean that in an ipython:: python block, you only include the actual code, as the output of the code is generated during the doc build. If you scroll a bit up in the file, you see some examples of that.

@datapythonista
Member Author

Thanks for the info @jorisvandenbossche, didn't know about the .. ipython:: directive.

It should be all right now.

@jorisvandenbossche merged commit ce435df into pandas-dev:master on Feb 5, 2018
@jorisvandenbossche
Member

@datapythonista Thanks!

@jreback added this to the 0.23.0 milestone on Feb 5, 2018
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018