Groupby bins empty groups #1027

Merged 5 commits on Oct 3, 2016

rabernat (Contributor) commented Oct 2, 2016

This PR fixes a bug in groupby_bins in which empty bins were dropped from the grouped results. Now groupby_bins restores any empty bins automatically. To recover the old behavior, one could apply dropna after a groupby operation.

Fixes #1019
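
The behavior described above (empty bins kept in the result instead of being silently dropped) can be sketched in plain Python. This is a conceptual illustration only, not xarray's implementation: the helper `group_sum_by_bins`, the sample data, and the choice of right-closed bins `(lo, hi]` are assumptions made for the example.

```python
import math


def group_sum_by_bins(coords, values, bin_edges):
    """Sum values per right-closed bin (lo, hi]; hypothetical helper, not xarray API."""
    bins = list(zip(bin_edges[:-1], bin_edges[1:]))
    sums = {}
    for coord, value in zip(coords, values):
        for lo, hi in bins:
            if lo < coord <= hi:
                sums[(lo, hi)] = sums.get((lo, hi), 0) + value
                break
    # Restore every requested bin; bins that received no values become NaN.
    return {b: sums.get(b, math.nan) for b in bins}


# Assumed sample data: no coordinate falls in the bin (4, 5].
coords = [0, 1, 2, 3]
values = [10, 20, 30, 40]
result = group_sum_by_bins(coords, values, [0, 4, 5])
print(result)
```

With these inputs both bins appear in the result, and the empty one holds NaN rather than being missing, so the output always has one entry per requested bin.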

# one of these bins will be empty
bins = [0,4,5]
actual = array.groupby_bins('dim_0', bins, drop_empty_bins=True).sum()
print(actual)

Collaborator (review comment on the snippet above):

Prob don't want to keep this

Contributor Author (reply):

oops! yes forgot to take that out

shoyer (Member) commented Oct 3, 2016

Rather than adding a new keyword argument, I'm inclined to unilaterally set drop_empty_bins=True and treat the former behavior as a bug. I can't see many use cases for the previous behavior.

rabernat (Contributor Author) commented Oct 3, 2016

@shoyer So do you want the keyword argument dropped altogether? Or just the default changed (as in my last commit)?

I can imagine wanting to drop the empty bins in some cases, but not as default. So I would vote to keep the keyword argument.

shoyer (Member) commented Oct 3, 2016

Yeah my inclination would be not to add the option at all.

rabernat (Contributor Author) commented Oct 3, 2016

I guess it's pretty easy to just do dropna
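
As a rough illustration of why a separate keyword is not strictly needed: once empty bins are kept as NaN, the earlier "dropped" result can be recovered by filtering them out afterwards. The sketch below uses a plain dict as a stand-in for a binned result; `drop_empty_bins` here is a hypothetical helper for the example, not the keyword argument discussed in this thread and not xarray's `dropna`.

```python
import math


def drop_empty_bins(binned):
    """Return a copy of the mapping without NaN-valued bins (hypothetical helper)."""
    return {
        bin_label: value
        for bin_label, value in binned.items()
        if not (isinstance(value, float) and math.isnan(value))
    }


# Assumed binned result in which the bin (4, 5] received no values.
binned = {(0, 4): 90, (4, 5): math.nan}
trimmed = drop_empty_bins(binned)
print(trimmed)
```

One caveat of this approach: filtering on NaN cannot tell an empty bin apart from a bin whose aggregated data is itself NaN.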

fmaussion (Member) commented

Just for future readers like me: the point of this PR is to make drop_empty_bins = False by default, right? (not True as Stephan wrote). Maybe a new edit of @rabernat's original post would help.

rabernat (Contributor Author) commented Oct 3, 2016

@fmaussion it's a little more involved than that...see edited description.

shoyer merged commit 0e044ce into pydata:master on Oct 3, 2016

shoyer (Member) commented Oct 3, 2016

Thanks!

rabernat deleted the groupby_bins_empty_groups branch on October 3, 2016 15:28
fmaussion added a commit to fmaussion/xarray that referenced this pull request Oct 15, 2016
fmaussion added a commit to fmaussion/xarray that referenced this pull request Oct 15, 2016
shoyer pushed a commit that referenced this pull request Nov 2, 2016
* fixes #1027

* reviews + whats new

* wrong commit
Successfully merging this pull request may close these issues.

groupby_bins: exclude bin or assign bin with nan when bin has no values
5 participants