document is_copy #18799
Comments
You should certainly not be using this. This was always supposed to be an internal attribute, and I am going to deprecate it. You can avoid it very easily by just doing a copy on filtered results, or by using assign rather than indexing, e.g.
is the pattern, or you can just turn the warning off completely in a context manager. Closing in favor of a deprecation issue, #18801. |
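A minimal sketch of the two approaches named in this comment, copying the filtered result or using assign. This is not the maintainer's original snippet; the frame contents and the variable names `filtered` and `assigned` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical example data; any frame with a numeric column "A" would do.
df = pd.DataFrame({"A": [-1, 2, 3]})

# Option 1: take an explicit copy of the filtered result, then modify the copy.
filtered = df[df["A"] >= 0].copy()
filtered["B"] = 2

# Option 2: use assign, which returns a new frame with the extra column
# instead of writing into the filtered result in place.
assigned = df[df["A"] >= 0].assign(B=2)

# Neither option adds a column to the original frame.
print(list(df.columns))
print(filtered.equals(assigned))
```

Both options leave `df` untouched. Which one reads better likely depends on whether the new column is added immediately or later in separate code.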
Thank you for your reply, but I don't follow.
The result is already a copy, right?
Why do I need to copy it again?
I can't turn it off in a context manager since I hand the result to the
user, and the warning is raised in the user code.
Sent from phone. Please excuse spelling and brevity.
…On Dec 15, 2017 18:24, "Jeff Reback" ***@***.***> wrote:
Closed #18799 <#18799>.
|
I also cannot assign, as this is again not in my control. I'm doing a
masking operation, and I want to return the masked df to the user.
|
Canonically, this is very easy to work with, simply
use
|
This was about But basically the pattern is:

    import pandas as pd

    # library code:
    def discard_less_than_zero(df):
        return df[df.A >= 0]

    # user code
    df = pd.DataFrame({'A': [1, 2, 3]})
    df2 = discard_less_than_zero(df)
    df2['B'] = 2

This is of course a contrived example, but I think the same applies whenever you have a library method that returns a sliced dataframe. If

    # library code:
    def discard_less_than_zero(df):
        return df[df.A >= 0].copy()

should do it. It just seems conceptually odd. If I understand the warning correctly, this means If df is on the order of magnitude of the free memory, doing an additional copy can mean not being able to work on certain datasets.

And regarding the deprecation, I'm not married to any method. I just want a canonical way to solve the issue I described above, ideally without making unnecessary copies. I phrased the issue the way I did because the only information I could find was #6025 (comment), in which @jreback suggests using |
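The library-side pattern discussed above can be sketched as follows, together with a check that the user code no longer triggers the warning. The helper `setting_with_copy_warnings` is hypothetical and exists only for this illustration. It matches the warning by class name so that it does not rely on where a given pandas version exposes the class.

```python
import warnings

import pandas as pd


def discard_less_than_zero(df):
    """Library code: return an independent frame holding the rows with A >= 0."""
    return df[df["A"] >= 0].copy()


def setting_with_copy_warnings(func):
    """Hypothetical helper: run func and collect any SettingWithCopyWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]


# User code
df = pd.DataFrame({"A": [1, 2, 3]})
df2 = discard_less_than_zero(df)


def user_code():
    # Adding a column to the frame returned by the library function.
    df2["B"] = 2


caught = setting_with_copy_warnings(user_code)
print(len(caught))
```

Note that the explicit `.copy()` here duplicates only the filtered result, not the original frame. Whether that extra allocation matters depends on how many rows pass the mask.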
Code Sample, a copy-pastable example if possible
Problem description
I find the behavior of SettingWithCopyWarning quite surprising, but I guess that's just what it is. It would be great if you could document is_copy and how to use it, though.

Whenever any function returns a dataframe, it seems like it should make sure that is_copy is set to False (or None?) so the user doesn't get a warning if they change it - if you're returning a dataframe, it's unlikely that the user expects this to be a view, and you're not doing chained assignments.

The is_copy attribute has an empty docstring in the docs and I couldn't find any explanation of it on the website (via google). The only thing that told me that overwriting this attribute is actually the right thing to do (again, which is pretty weird to me), was #6025 (comment)