Simplifying indexing (DataFrame.__getitem__) #22
Comments
@shoyer Thanks for starting this discussion. I will try to update my overview next week. I think we should certainly consider this (which does not mean that it will eventually turn out to be possible or desirable to change this so radically). I like the simplicity of the proposal and how it covers the most important use case (I think you can count 'slicing with integers' (eg
If we want to do this, the question is also (apart from the exact semantics): how can we facilitate the transition?
Here's another proposal, more similar to existing rules and without type dependent logic for indexer keys:
We need rule (2) because otherwise boolean indexing like If we make labels optional (#17), we would use integer indexing instead when there is no index for both A downside of this alternative is that it does break slicing with integers.
The rules for exactly what DataFrame.__getitem__/__setitem__ does (pandas-dev/pandas#9595) are sufficiently complex and inconsistent that they are impossible to understand without extensive experimentation. This makes for a rather embarrassing situation that we really should fix for pandas 2.0.
I made a proposal when this came up last year:
I still like my proposal, but more importantly, it satisfies two important criteria:
df['foo'], df[['foo', 'bar']], and df[df['foo'] == 'bar'] might cover 80% of use cases).
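To make the "type dependent logic" concrete, here is a minimal pure-Python sketch of how a single `__getitem__` can end up selecting columns for some key types and rows for others. `ToyFrame` is a hypothetical stand-in written for this illustration; it is not pandas' implementation and covers only the three common cases named above plus integer slicing, assuming the row-versus-column behavior those cases have in current pandas.

```python
class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not pandas' implementation."""

    def __init__(self, columns):
        # columns: dict mapping column name -> list of values (equal lengths)
        self.columns = dict(columns)

    def __len__(self):
        # Number of rows, taken from the first column (0 if there are none).
        return len(next(iter(self.columns.values()), []))

    def _take_rows(self, positions):
        # Build a new frame containing only the rows at the given positions.
        return ToyFrame({name: [values[i] for i in positions]
                         for name, values in self.columns.items()})

    def __getitem__(self, key):
        if isinstance(key, str):
            # df['foo']: a single label selects one column.
            return self.columns[key]
        if isinstance(key, slice):
            # df[0:2]: a slice selects rows, not columns.
            return self._take_rows(range(*key.indices(len(self))))
        if isinstance(key, list) and key and all(isinstance(k, bool) for k in key):
            # df[mask]: a boolean mask filters rows.
            if len(key) != len(self):
                raise ValueError("mask length must match number of rows")
            return self._take_rows([i for i, keep in enumerate(key) if keep])
        if isinstance(key, list):
            # df[['foo', 'bar']]: a list of labels selects columns.
            return ToyFrame({name: self.columns[name] for name in key})
        raise TypeError(f"unsupported key type: {type(key).__name__}")


df = ToyFrame({"foo": ["bar", "baz", "bar"], "n": [1, 2, 3]})
mask = [value == "bar" for value in df["foo"]]

print(df["foo"])            # column selection
print(df[["n"]].columns)    # column selection with a list of labels
print(df[mask].columns)     # row selection with a boolean mask
print(df[0:2].columns)      # row selection with a slice
```

Even in this toy version, the same bracket syntax switches between the column axis and the row axis depending only on the type of the key, which is the kind of rule that is hard to explain in a few sentences.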