DEPR: deprecate some top-level non-used functions #15538

jreback · 2017-03-01T16:26:41Z

pd.pnow()
pd.groupby()
pd.match()
pd.Term
pd.Expr

remove info.py

jorisvandenbossche · 2017-03-02T08:37:36Z

@jreback Thanks for the PR!

For me, the main question is, do we actually need those functions? If not, I don't think we have to recommend an alternative import path, but just say they are "deprecated and will be removed in a future version"

groupby -> referring to the method is of course OK
pnow -> you can also use Period.now() (similar to Timestamp.now), so if we do a recommendation, I would refer to that instead of tseries.periods.pnow (that would also mean the actual tseries.periods.pnow can be deprecated, instead of only the top-level one by wrapping it)
Term / Expr: I never used them, so I can't really judge: do they have a usecase? If so, it is OK to refer to a more nested path for me, but then that should be documented / added to the public API.
match: the same. Do we want to have this as a public function? If not, I would recommend the alternative import path.

jreback · 2017-03-02T13:06:01Z

pnow -> you can also use Period.now() (similar to Timestamp.now), so if we do a recommendation, I would refer to that instead of tseries.periods.pnow (that would also mean the actual tseries.periods.pnow can be deprecated, instead of only the top-level one by wrapping it)

didn't realize this, yes I will point that out as instead (and deprecate Period.pnow() as well)

Term / Expr: I never used them, so I can't really judge: do they have a usecase? If so, it is OK to refer to a more nested path for me, but then that should be documented / added to the public API.
match: the same. Do we want to have this as a public function? If not, I would recommend the alternative import path.

Expr, you never need to directly use this (its used in .eval under the hood, should not have been exposed at all). Term was needed quite a while back, but string expressions replaced this (years ago).

match: the same. Do we want to have this as a public function? If not, I would recommend the alternative import path.

This is barely tested, not documented, and AFAICT never used anywhere. So it might have utility, but certainly not as a top-level function.

codecov-io · 2017-03-02T17:42:17Z

Codecov Report

Merging #15538 into master will decrease coverage by -0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #15538      +/-   ##
==========================================
- Coverage   91.04%   91.02%   -0.02%     
==========================================
  Files         136      136              
  Lines       49088    49107      +19     
==========================================
+ Hits        44694    44702       +8     
- Misses       4394     4405      +11

Impacted Files	Coverage Δ
pandas/computation/api.py	`100% <100%> (ø)`	✅
pandas/init.py	`91.66% <100%> (ø)`	✅
pandas/tseries/period.py	`92.8% <100%> (+0.02%)`	✅
pandas/core/api.py	`100% <100%> (ø)`	✅
pandas/io/api.py	`100% <100%> (ø)`	✅
pandas/io/gbq.py	`25% <0%> (-58.34%)`	❌
pandas/tools/merge.py	`91.78% <0%> (-0.35%)`	❌
pandas/core/frame.py	`97.82% <0%> (-0.1%)`	❌
pandas/tseries/index.py	`95.32% <0%> (-0.1%)`	❌
pandas/core/common.py	`91.36% <0%> (+0.33%)`	✅

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd67704...59d2b61. Read the comment docs.

jorisvandenbossche

Thanks for the update!

I personally wouldn't even mention the alternative path for match, if you say that it is "barely tested, not documented, and AFAICT never used anywhere", we shouldn't encourage it.
If someone uses it, sees the removal message, they can always complain that it was useful and we can think about how to publicly expose it (I don't think core.algorithms should be that place).

jorisvandenbossche · 2017-03-02T08:21:33Z

doc/source/whatsnew/v0.20.0.txt

@@ -534,6 +534,12 @@ Deprecations
 - ``Series.sortlevel`` and ``DataFrame.sortlevel`` have been deprecated in favor of ``Series.sort_index`` and ``DataFrame.sort_index`` (:issue:`15099`)
 - importing ``concat`` from ``pandas.tools.merge`` has been deprecated in favor of imports from the ``pandas`` namespace. This should only affect explict imports (:issue:`15358`)
 - ``Series/DataFrame/Panel.consolidate()`` been deprecated as a public method. (:issue:`15483`)
+- The following top-level pandas functions have been deprecated and will be removed in a future version (:issue:`13790`)
+  * ``pd.now()``, replaced by a direct import from ``pandas.tseries.period``


I don't think we have to refer to tseries.period, but just remove it altogether? (or at least not recommend it)

actually you also have pd.Period.now(freq='H')

jorisvandenbossche · 2017-03-02T18:15:05Z

pandas/io/api.py

+
+    warnings.warn("pd.Term is deprecated as it is not "
+                  "applicable to user code. Instead use in-line "
+                  "string expressions in for searching HDFStore",


Can you add this sentence also to the whatsnew overview?

jorisvandenbossche · 2017-03-02T18:17:25Z

pandas/info.py

-pandas - a powerful data analysis and manipulation library for Python
-=====================================================================
-
-See http://pandas.pydata.org/ for full documentation. Otherwise, see the


Good idea to remove this file. To have the same effect (people doing pd? see some info), you can put eg what is above this comment in __init__.py as module docstring. It's not that useful, but it does also no harm to have a minimal explanation.

I could theorectically replace this with something (maybe a minimal README type of thing, so its on the actual doc-string). any thoughts?

yeah should replace with something, maybe a pointer to the docs (plus some minmal stuff). I can simply define this directly I think (in init.py). Any thoughts on what this should say?

Yes, you can just put the exact same string on the first lines of __init__.py, and it will be interpreted as the module docstring.

I would put just the same as here (title, and link to full docs). In principle we could do some clever parsing of a README file and adding that, but I wouldn't do that effort personally.

I just did this

In [1]: pd? Type: module String form: <module 'pandas' from '/Users/jreback/pandas/pandas/__init__.py'> File: ~/pandas/pandas/__init__.py Docstring: pandas - a powerful data analysis and manipulation library for Python ===================================================================== **pandas** is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, **real world** data analysis in Python. Additionally, it has the broader goal of becoming **the most powerful and flexible open source data analysis / manipulation tool available in any language**. It is already well on its way toward this goal. Main Features ------------- Here are just a few of the things that pandas does well: - Easy handling of missing data in floating point as well as non-floating point data - Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects - Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let `Series`, `DataFrame`, etc. automatically align the data for you in computations - Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data - Make it easy to convert ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects - Intelligent label-based slicing, fancy indexing, and subsetting of large data sets - Intuitive merging and joining data sets - Flexible reshaping and pivoting of data sets - Hierarchical labeling of axes (possible to have multiple labels per tick) - Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving/loading data from the ultrafast HDF5 format - Time series-specific functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging, etc.

as the README.md has all kinds of html and such (and is too long)

closes pandas-dev#13790 pd.pnow pd.groupby pd.match pd.Term pd.Expr remove info.py

jreback · 2017-03-02T18:58:51Z

so we have this mini example in

http://pandas-docs.github.io/pandas-docs-travis/comparison_with_r.html

In [1]: s = pd.Series(np.arange(5),dtype=np.float32)

In [2]: s
Out[2]: 
0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
dtype: float32

In [3]: pd.Series(pd.match(s,[2,4],np.nan))
Out[3]: 
0    NaN
1    NaN
2    0.0
3    NaN
4    1.0
dtype: float64

which is actually confusing / misleading.


In [7]: s.where(s.isin([2,4]))
Out[7]: 
0    NaN
1    NaN
2    2.0
3    NaN
4    4.0
dtype: float32

is much more useful (I don't know why you would want that kind of transformation)

though I don't use R much, so not sure what it is trying to show.

jorisvandenbossche · 2017-03-02T20:03:08Z

Hmm, I would maybe just remove that example (isin is already showed just above the match example).
match has its use (if you want exactly what is in the example), question is whether we need to support that.

jreback · 2017-03-02T20:07:01Z

@jorisvandenbossche yep sounds good. Yeah the function still exists, its just not top-level (like .unique, .factorize, .value_counts); though even the latter 2 don't really need to be exposed directly to the user (.value_counts is a method, .factorize is just an implementation detail for category).

These made much more sense when directly allowing numpy arrays was useful (but that is less and less useful). But this is another story.

ok will remove that part of the doc.

closes pandas-dev#13790 pd.pnow pd.groupby pd.match pd.Term pd.Expr remove info.py

Drops the following: * pd.pnow * pd.match * pd.groupby * pd.get_store * pd.Expr * pd.Term xref pandas-devgh-15538. xref pandas-devgh-15940.

Drops the following: * pd.pnow * pd.match * pd.groupby * pd.get_store * pd.Expr * pd.Term xref gh-15538. xref gh-15940.

Drops the following: * pd.pnow * pd.match * pd.groupby * pd.get_store * pd.Expr * pd.Term xref pandas-devgh-15538. xref pandas-devgh-15940.

jreback added the Deprecate Functionality to remove in pandas label Mar 1, 2017

jreback added this to the 0.20.0 milestone Mar 1, 2017

jsexauer mentioned this pull request Mar 1, 2017

DEPR: Clean up list of deprecations from prior versions #6581

Closed

1 task

jreback force-pushed the top_level branch from 8dcbd34 to 12a571e Compare March 1, 2017 19:19

jreback force-pushed the top_level branch 3 times, most recently from da53aac to 7c52dd3 Compare March 2, 2017 14:52

jorisvandenbossche reviewed Mar 2, 2017

View reviewed changes

jreback added 2 commits March 2, 2017 13:36

DEPR: deprecate some top-level non-used functions

a1bb502

closes pandas-dev#13790 pd.pnow pd.groupby pd.match pd.Term pd.Expr remove info.py

add doc-string to top-level

6df38ed

jreback force-pushed the top_level branch from 7c52dd3 to 6df38ed Compare March 2, 2017 18:46

revised doc-strings

90849b6

remove pd.match from docs

59d2b61

jorisvandenbossche merged commit 524a9a0 into pandas-dev:master Mar 3, 2017

AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017

DEPR: deprecate some top-level non-used functions (pandas-dev#15538)

e3ba132

closes pandas-dev#13790 pd.pnow pd.groupby pd.match pd.Term pd.Expr remove info.py

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 27, 2018

CLN: Cleanup top-level pandas namespace

67c908d

Drops the following: * pd.pnow * pd.match * pd.groupby * pd.get_store * pd.Expr * pd.Term xref pandas-devgh-15538. xref pandas-devgh-15940.

gfyoung mentioned this pull request Oct 27, 2018

CLN: Cleanup top-level pandas namespace #23380

Merged

jreback mentioned this pull request Oct 27, 2018

DEPR: deprecations log for removed issues #13777

Closed

jreback pushed a commit that referenced this pull request Oct 28, 2018

CLN: Cleanup top-level pandas namespace (#23380)

40423e3

Drops the following: * pd.pnow * pd.match * pd.groupby * pd.get_store * pd.Expr * pd.Term xref gh-15538. xref gh-15940.

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018

CLN: Cleanup top-level pandas namespace (pandas-dev#23380)

9652fc9

Drops the following: * pd.pnow * pd.match * pd.groupby * pd.get_store * pd.Expr * pd.Term xref pandas-devgh-15538. xref pandas-devgh-15940.

rwolst mentioned this pull request Feb 9, 2020

Should pd.core.algorithms.match have been removed? #31827

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DEPR: deprecate some top-level non-used functions #15538

DEPR: deprecate some top-level non-used functions #15538

jreback commented Mar 1, 2017 •

edited

Loading

jorisvandenbossche commented Mar 2, 2017

jreback commented Mar 2, 2017

codecov-io commented Mar 2, 2017 •

edited

Loading

jorisvandenbossche left a comment •

edited

Loading

jorisvandenbossche Mar 2, 2017

jorisvandenbossche Mar 2, 2017

jorisvandenbossche Mar 2, 2017

jorisvandenbossche Mar 2, 2017

jreback Mar 2, 2017

jreback Mar 2, 2017

jorisvandenbossche Mar 2, 2017

jreback Mar 2, 2017

jorisvandenbossche Mar 2, 2017

jreback commented Mar 2, 2017

jorisvandenbossche commented Mar 2, 2017

jreback commented Mar 2, 2017

DEPR: deprecate some top-level non-used functions #15538

DEPR: deprecate some top-level non-used functions #15538

Conversation

jreback commented Mar 1, 2017 • edited Loading

jorisvandenbossche commented Mar 2, 2017

jreback commented Mar 2, 2017

codecov-io commented Mar 2, 2017 • edited Loading

Codecov Report

jorisvandenbossche left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jreback commented Mar 2, 2017

jorisvandenbossche commented Mar 2, 2017

jreback commented Mar 2, 2017

jreback commented Mar 1, 2017 •

edited

Loading

codecov-io commented Mar 2, 2017 •

edited

Loading

jorisvandenbossche left a comment •

edited

Loading