BUG: DataFrame.where does not handle Series slice correctly (#10218) #10283
@@ -3439,6 +3439,8 @@ def where(self, cond, other=np.nan, inplace=False, axis=None, level=None,
              try_cast=False, raise_on_error=True):

        if isinstance(cond, NDFrame):
            if isinstance(cond, pd.Series) and isinstance(self, pd.DataFrame):
                cond = self._getitem_array(cond)
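For readers without the linked issue at hand, here is a minimal sketch of the kind of call this hunk is concerned with: passing a boolean Series, taken from the frame itself, as `cond`. The frame and its column names are made up for illustration and are not from the issue, and the behaviour described in the comments assumes a pandas release that already includes a fix.

```python
import pandas as pd

# Hypothetical data, not taken from the issue.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# A boolean Series that shares the frame's row index.
cond = df["a"] > 1

# Each row is kept or masked as a whole, depending on cond for that row.
result = df.where(cond)
print(result)
```

Rows where `cond` is False come back as NaN in every column; the remaining rows keep their values.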
I think it would be better to make these align properly, e.g.
cond, _ = cond.align(self)
(which currently will raise)
and it will do the .reindex
then inside of align
you can do something like that
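A rough sketch of the alignment step suggested above, assuming a pandas release in which aligning a Series against a DataFrame is supported (the comment notes that it raised at the time). The data, the `join="right"` choice and the explicit `axis=0` are illustrative assumptions, not the code that was merged.

```python
import pandas as pd

# Hypothetical frame and condition; cond's index only partly overlaps the frame's.
df = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
cond = pd.Series([True, False, True], index=[1, 2, 3])

# join="right" keeps the frame's row index; axis=0 aligns on rows.
# Labels missing from cond are filled with NaN, which is the reindex step.
aligned_cond, _ = cond.align(df, join="right", axis=0)
print(aligned_cond)
```

After alignment the condition carries exactly the frame's row labels, so it can be applied without positional guesswork.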
@mortada if you can have a look w.r.t. my comments
@jreback I dug into do you mean changing |
d406f7b to 0ee610f
@jreback added |
Woah, not sure this is a good idea. Pandas currently does broadcasting the other way (like numpy). I do think this other broadcasting is more useful, but it is an API change that should be discussed more holistically. |
My concern is that pandas (like NumPy) currently has a Series broadcast against the rows of a DataFrame, similar to the situation for 1D/2D numpy arrays. So I don't think this is a good argument name, because here I don't really have a good alternative parameter name here. But I think it would be better simply to add support for
|
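To make the broadcasting concern concrete, a small sketch of the existing behaviour described above: a 1D NumPy array and a pandas Series are both matched against the rows of a 2D object, and the other direction has to be requested explicitly. The data and labels are invented for illustration.

```python
import numpy as np
import pandas as pd

arr = np.arange(6).reshape(2, 3)
row = np.array([10, 20, 30])

# NumPy matches the 1D array against the last axis, so it is added to every row.
np_result = arr + row

df = pd.DataFrame(arr, columns=["x", "y", "z"])
s = pd.Series(row, index=["x", "y", "z"])

# pandas matches the Series index against the columns, so it is also applied to every row.
pd_result = df + s

# Matching a Series against the row index instead needs an explicit axis.
col = pd.Series([100, 200], index=df.index)
by_index = df.add(col, axis=0)
```

The `axis=0` call is the existing opt-in for the direction that the change under review would apply to `where`.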
@shoyer if you see the above issue, this is exactly what this is trying to address. But rather than a specific fix, I think a more general approach is warranted, maybe |
I agree -- |
@shoyer that's a good point, the name does seem to suggest an |
yes I meant it should be None by default and be an axis (int/string) |
@mortada can you update |
I think this also closes #9558 |
that looks right, @mortada can you add tests for that as well. |
@jreback sure will do |
@jreback one problem with using |
@mortada ok, then I think what we need to do is make the doc-string in |
@jreback sure, I've updated this to use |
Filling axis, method and limit
broadcast_axis : %(axes_single_arg)s, default None
can you add a versionadded
pls rebase, ping when green. |
@jreback rebased and added as for the docstring for |
right, |
It seems like broadcast_axis only makes sense for dataframes. Maybe it should be broadcast_axes instead? That would make sense for generalizing to panels.
On Fri, Aug 21, 2015 at 3:31 PM, Jeff Reback notifications@github.com
|
@shoyer indeed this is really only for |
Broadcast_axes would take a pair of tuples (or ints) indicating which axis from each argument to broadcast against the other.
On Sat, Aug 22, 2015 at 3:10 AM, Mortada Mehyar notifications@github.com
|
@mortada actually I think |
What does the API look like in the more general form? As much as I like this flexibility, it's not clear to me that broadcast_axis=1 will make sense in the broader context of panels. It's particularly ambiguous about which argument it refers to.
On Sat, Aug 22, 2015 at 5:40 PM, Jeff Reback notifications@github.com
|
@mortada I forgot why we are not using |
@jreback we can't use the docstring for it actually sounds weird and it perhaps meant to say "aligned" not "allowed":
as a quick example
if
otherwise if
if
|
And the new parameter, whether we call it Since the highest dimension
|
by the way just updated to raise |
@@ -628,6 +628,9 @@ def _needs_reindex_multi(self, axes, method, level):
        """ don't allow a multi reindex on Panel or above ndim """
        return False

    def align(self, other):
        raise NotImplementedError

have this take **kwargs
sure will update
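A sketch of what the requested signature change buys, using hypothetical stand-in classes (`NDFrameLike` and `PanelLike` are not pandas classes). With `**kwargs`, a caller that passes the usual keyword arguments gets the intended `NotImplementedError` instead of a `TypeError` about unexpected arguments.

```python
class NDFrameLike:
    """Hypothetical stand-in for a generic base class with a keyword-rich align."""

    def align(self, other, join="outer", axis=None, **kwargs):
        # Placeholder body; a real implementation would reindex both objects.
        return self, other


class PanelLike(NDFrameLike):
    """Hypothetical stand-in for a subclass that does not support align."""

    def align(self, other, **kwargs):
        # Accept any keyword arguments so the caller always sees this error.
        raise NotImplementedError


try:
    PanelLike().align(object(), join="right", axis=0)
    raised = False
except NotImplementedError:
    raised = True
```

Without `**kwargs`, the same call would fail earlier with a `TypeError` for the unexpected `join` keyword.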
BUG: DataFrame.where does not handle Series slice correctly (#10218)
thanks @mortada |
closes #10218