BUG/API: DataFrame.iloc[:, foo] = bar inplaceness? #44353
Comments
cc @jreback @jorisvandenbossche @TomAugspurger: discussed briefly on today's call, could use your input.
Gentle ping @jreback: a lot of indexing work hinges on whether we want to change this.
Re "API: df['col'] = value and df.loc[:,'col'] = value are now completely equivalent": I think these should always be completely equivalent.
Two problems here:
xref #15631
Marking as blocker for 1.4rc; we should make a decision about deprecation before then.
Personally I think I would like the consistency of Also, if we change this, I think it should go with some deprecation cycle, since this was explicitly designed this way before (it's not that we are fixing a bug).
Yah, that's why I'm pushing to get this done for 1.4 so we can enforce in 2.0.
Agreed. Do you have any preferences on what that API would look like? I think something like
Turns out the
I don't think this is particularly useful for integer indices, so -1 on adding yet another method to do the same thing.
The trouble is that if you have non-unique columns then there is no public way of doing this.
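A minimal sketch of why non-unique columns are awkward here, assuming a hypothetical frame with two columns both named "A" (the frame and variable names are illustrative, not from the thread). Selecting by label returns every matching column, so a label-based assignment cannot single one of them out, while positional selection can.

```python
import pandas as pd

# Hypothetical frame with duplicate column labels.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "A"], dtype="int64")

# Label-based selection matches both "A" columns and returns a DataFrame.
by_label = df["A"]

# Positional selection picks exactly one column and returns a Series.
by_position = df.iloc[:, 0]

print(type(by_label).__name__, by_label.shape)
print(type(by_position).__name__, by_position.tolist())
```

This only shows the selection side; how a positional assignment to that single column behaves is the question the issue is asking.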
I get it.
I mean another option here is to go back to doing real masking instead of setitem with but I think that creates more issues.
I'm not clear on what you're suggesting. In the status quo Under the status quo, there is no way to do a try-to-operate-in-place setting of an entire column. Under the proposed change, there is no way to do a don't-try-to-operate-in-place positional setting of an entire column.
We used to do exactly this and I changed it a long time ago because it was very confusing.
How is ".iloc[foo, bar] = baz always tries inplace" more confusing than "that, except for .iloc[:, bar] = baz which is almost never inplace, the exception being when baz has the same dtype as the existing column (but uh, not for ArrayManager or (some) ExtensionDtypes!)"?
In 1.3.0 we made it so that df["A"] = foo never operates in-place (xref whatsnew, #33457) and mostly made it so that df.loc[foo, bar] = baz tries to operate in-place before falling back to casting (xref whatsnew, #33457 (comment)).
There are some remaining cases where .iloc and .loc do not try to set inplace. This issue is to make sure we have consensus about changing/deprecating that behavior. Most of the impacted test cases involve doing an astype, e.g.
ATM this changes df[0] to float64, i.e. is not inplace.
In this case, the user could/should do df[0] = df[0].astype(np.float64). But that approach runs into problems with either non-unique columns or setting a slice of columns (df.iloc[:, :2] = foo).
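A minimal sketch of the two spellings contrasted above, assuming a small two-column int64 frame (the frame and the names `replaced` and `positional` are illustrative, not taken from the issue). No expected dtype is stated for the `.iloc` case, since that is the behavior under discussion and it may differ between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, dtype="int64")
as_float = df["a"].astype(np.float64)

# Column replacement via __setitem__: the new array is swapped in,
# so the column takes the dtype of the assigned values.
replaced = df.copy()
replaced["a"] = as_float

# Positional assignment of the whole column: whether this writes into the
# existing int64 data or replaces the column depends on the pandas version.
positional = df.copy()
positional.iloc[:, 0] = as_float

print("pandas", pd.__version__)
print("df['a'] = ...       ->", replaced["a"].dtype)
print("df.iloc[:, 0] = ... ->", positional["a"].dtype)
```

Running this under different pandas versions is a quick way to see which of the two behaviors a given release implements.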
My thoughts are that 1) we should be consistent about inplace-ness, not special-case the API and 2) the use case here is convenient.
Thoughts?
Update: Cataloguing issues/PRs where this behavior was implemented, apparently intentionally
#29393 (#25495)
#6149
#6159 BUG/API: df['col'] = value and df.loc[:,'col'] = value are now completely equivalent
Update 2: Cataloguing issues that appear to be caused by/related to this inconsistency
#20635 BUG: indexing with loc and iloc with list-likes and new dtypes do not change from object dtype
#24269 DataFrame.loc/iloc fails to update object with new dtype