-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: Add examples for MultiIndex.get_locs + cleanups #17675
DOC: Add examples for MultiIndex.get_locs + cleanups #17675
Conversation
pandas/core/indexes/multi.py
Outdated
boolean mask array, otherwise it is always a slice or int. | ||
Get location for a label or a tuple of labels as an integer, slice or | ||
boolean mask. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this Note in Notes.
Codecov Report
@@ Coverage Diff @@
## master #17675 +/- ##
==========================================
- Coverage 91.27% 91.25% -0.02%
==========================================
Files 163 163
Lines 49765 49765
==========================================
- Hits 45421 45412 -9
- Misses 4344 4353 +9
Continue to review full report at Codecov.
|
2a0d882
to
20ed680
Compare
I've made some changes and put them in a new commit as it contains some (very minor) corrections to previous commits also. #17655 had the reference to #17653 had too many back-ticks in the links in section "Examples". I've removed those. |
@@ -223,7 +223,7 @@ class Categorical(PandasObject): | |||
|
|||
See also | |||
-------- | |||
pandas.api.types.CategoricalDtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this property xref here? (CategoricalDtype is NOT in the main namespace)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've changed it to api.types.CategoricalDtype
. I had misunderstood the location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking more into it, I actually think CategoricalDtype
isn't shown at all in the generated API docs, so the reference will be not work in any case. Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the problem is not that 'pandas' is there (that is perfectly OK, and in this case I even prefer to show the full path for clarity, as api
is not a commonly used submodule of pandas), the issue is that this page does not exist, so it cannot link to it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll put pandas.
back in then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. You can also add it to the api.rst pages so the link will work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change uploaded.
There seems to be something in the API.rst (line 649), but it mustn't work. @TomAugspurger has started looking into it, I'll wait till he figures it out.
If I get to know what I should add to API.rst, I can add it to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In api.rst it should be pandas.api.type.CategoricalDtype
(missing the pandas).
But the autoclass fit the best with the rest of the autosummary's, so I may change it. For now, I'd just leave your reference as pandas.api.type.CategoricalDtype
, and I'll make sure they're valid before the release.
20ed680
to
5675c65
Compare
Seems like some issue with the API reference. I'm looking into it.
…On Tue, Sep 26, 2017 at 6:48 AM, topper-123 ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pandas/core/categorical.py
<#17675 (comment)>:
> @@ -223,7 +223,7 @@ class Categorical(PandasObject):
See also
--------
- pandas.api.types.CategoricalDtype
Looking more into it, I actually think CategoricalDtype isn't shown at
all in the generated API docs, so the reference will be not work in ant
case. Is this correct?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#17675 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHIj7WrfR-Cpu1g9U1bQrEdH6WLX7fks5smOShgaJpZM4Pjbs9>
.
|
5675c65
to
514b5ad
Compare
pandas/core/indexes/multi.py
Outdated
get_locs : Given a tuple of slices/lists/labels/boolean indexer to a | ||
level-wise spec, produce an indexer to extract those | ||
locations. | ||
get_locs : Get location for a label/slice/list/mask or a sequence of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_locs -> MultiIndex.get_locs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
pandas/core/indexes/multi.py
Outdated
|
||
Parameters | ||
---------- | ||
key : tuple of (slices/list/labels) | ||
seq : label/slice/list/mask or a sequence of such. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be good to explain how the 'sequence of such' should be interpreted
(I should check the code myself to actually know it, but are the different elements of the list mapped to the different levels?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add here "(one for each level)" as you did in get_loc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a new proposal. This is very difficult to explain in short sentences, I find.
pandas/core/indexes/multi.py
Outdated
|
||
Returns | ||
------- | ||
locs : integer list of locations or boolean indexer suitable | ||
locs : integer, slice or boolean mask suitable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct: you changed 'integer list of locations' to 'integer', but from the examples it seems to return an array of integers, so the 'list' was more correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, tup indicates that this method takes a tuple, but it also takes other sequences like lists. In an example I tried it actually choked on tuple, because it thought the tuple was the label of a level. My experience from the last days it therefore is that it's probaby best to use lists rather than tuples (but tuple may work, depending on context).
On the other hand get_loc (without s) demands a tuple and chokes on lists, so it's not so easy to remember what is accepted where.
I can say "list or such", to discourage using tuples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that my comment was about the return value, not the input (and there indeed tuples vs lists sometimes are slightly different but not always, which can be confusing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, I actually couldn't manage to get it to return a boolean mask.
Could you confirm that? I haven't changed that part, but if it's not true, it'd be better it didn't say that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looking at the code, I actually think it can only return an integer array. That would mean that the explanation is just wrong.
But I am not too familiar with this. (@jreback ?)
Is there anything I should do here wrt. the CircleCI tests failing? The test failure doesn't seem to related to my PR. |
No, you don't need to care, there are some problems in general: #17702 |
ddd9eab
to
9d3df60
Compare
pandas/core/indexes/multi.py
Outdated
>>> mi = pd.MultiIndex.from_arrays([list('abb'), list('def')]) | ||
|
||
>>> mi.get_locs('b') | ||
slice(1, 3, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I run this locally, I actually get an array as result:
In [109]: mi = pd.MultiIndex.from_arrays([list('abb'), list('def')])
In [110]: mi.get_locs('b')
Out[110]: array([1, 2])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, me too, actually. It seems the func then only return arrays?
9d3df60
to
b4248ef
Compare
cba6753
to
1b0ab88
Compare
I've looked some more at the return value. The return is always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some final minor comments!
pandas/core/indexes/multi.py
Outdated
Notes | ||
------ | ||
The key cannot be a slice, list of same-level labels, a boolean mask, | ||
or a sequence of such. If you want to use those, use :meth:`get_locs` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_locs -> MultiIndex.get_locs (I think, it would actually be interesting to know whether it works like this if it is a method of the same class. Did you build it locally and checked? But I think this will not work, because in the generated rst docstring pages, a .. currentmodule:: pandas
is included)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've Changed it.
It it possible to to build just the API docs or maybe a subgroup thereof? Working with doc is a but cumbersome, when building it takes so much time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you pass the arg —�single api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can indeed be very cumbersome. It's a very long time ago that I tried the --single
flag. It was also meant to eg run only a single .rst file. But I can remember it also didn't work smoothly.
If you think of other ways to make this easier, very welcome! (I once had a script to build only a single docstring, I could try to look that up, but not sure that such links then would work anyhow)
pandas/core/indexes/multi.py
Outdated
key : tuple of (slices/list/labels) | ||
seq : label/slice/list/mask or a sequence of such (one for each used | ||
level. If a level should not be used, set it to | ||
``slice(None)``). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multi-line type specifications don't really work (does not render correctly in sphinx, unless you use \
to indicate line continuation). But you can also spit this up:
seq : label/slice/list/mask or a sequence of such (one for each level)
More explanation ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
1b0ab88
to
070e8f3
Compare
@topper-123 Thanks a lot! |
* 'master' of github.com:pandas-dev/pandas: (188 commits) Separate out _convert_datetime_to_tsobject (pandas-dev#17715) DOC: remove whatsnew note for xref pandas-dev#17131 BUG: Regression in .loc accepting a boolean Index as an indexer (pandas-dev#17738) DEPR: Deprecate cdate_range and merge into bdate_range (pandas-dev#17691) CLN: replace %s syntax with .format in pandas.core: categorical, common, config, config_init (pandas-dev#17735) Fixed the memory usage explanation of categorical in gotchas from O(nm) to O(n+m) (pandas-dev#17736) TST: add backward compat for offset testing for pickles (pandas-dev#17733) remove unused time conversion funcs (pandas-dev#17711) DEPR: Deprecate convert parameter in take (pandas-dev#17352) BUG:Time Grouper bug fix when applied for list groupers (pandas-dev#17587) BUG: Fix some PeriodIndex resampling issues (pandas-dev#16153) BUG: Fix unexpected sort in groupby (pandas-dev#17621) DOC: Fixed typo in documentation for 'pandas.DataFrame.replace' (pandas-dev#17731) BUG: Fix series rename called with str altering name rather index (GH17407) (pandas-dev#17654) DOC: Add examples for MultiIndex.get_locs + cleanups (pandas-dev#17675) Doc improvements for IntervalIndex and Interval (pandas-dev#17714) BUG: DataFrame sort_values and multiple "by" columns fails to order NaT correctly Last of the timezones funcs (pandas-dev#17669) Add missing file to _pyxfiles, delete commented-out (pandas-dev#17712) update imports of DateParseError, remove unused imports from tslib (pandas-dev#17713) ...
git diff upstream/master -u -- "*.py" | flake8 --diff
Adds examples to
MultiIndex.get_locs
and also some changes to.get_loc
and.get_loc_values
.