Implement DataFrameGroupBy.count() #289

Closed
sethmlarson opened this issue Oct 15, 2020 · 4 comments · Fixed by #292
Labels
enhancement New feature or request help wanted Solution is fleshed out and ready to be worked on topic:dataframe Issue or PR about eland.DataFrame

Comments

@sethmlarson
Contributor

No description provided.

@sethmlarson sethmlarson added enhancement New feature or request help wanted Solution is fleshed out and ready to be worked on topic:dataframe Issue or PR about eland.DataFrame labels Oct 15, 2020
@V1NAY8
Contributor

V1NAY8 commented Oct 17, 2020

So, I'll implement both of the following, since pandas supports both:

>>> ed_df.groupby("dayOfWeek").count()
>>> ed_df.groupby("dayOfWeek").agg(["count"])
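For reference, a minimal sketch of what the pandas side returns for these two calls. The toy frame only stands in for the Flights data; its values are made up for illustration. The two forms give the same numbers, but agg(["count"]) returns two-level (field, agg) columns.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the Flights index; the values are invented.
pd_df = pd.DataFrame(
    {
        "dayOfWeek": [0, 0, 1, 1, 1],
        "AvgTicketPrice": [100.0, np.nan, 250.0, 300.0, np.nan],
        "Cancelled": [True, False, False, True, False],
    }
)

# Flat columns: one count per non-grouping column.
counts = pd_df.groupby("dayOfWeek").count()

# Same numbers, but the columns become (field, "count") tuples.
agg_counts = pd_df.groupby("dayOfWeek").agg(["count"])

# count() skips NaN, so AvgTicketPrice is lower than the group size.
print(counts)
print(agg_counts.columns.tolist())
```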

@sethmlarson
Contributor Author

I wonder if you can use doc_count within each bucket? Gotta check how pandas counts NaNs.
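A rough sketch of why this matters: doc_count on a bucket counts documents, while a value_count sub-aggregation counts values of one field, so the two differ when a field is missing in some documents. The helper name groupby_count_body, the aggregation names, and the sample bucket below are hypothetical and hand-written for illustration; none of this is eland's actual request, and the bucket is not real Elasticsearch output.

```python
def groupby_count_body(by, fields):
    """Build a search body: terms buckets on `by`, value_count per field.

    Hypothetical helper for illustration only.
    """
    sub_aggs = {
        f"{field}_count": {"value_count": {"field": field}} for field in fields
    }
    return {
        "size": 0,  # only the aggregation results are wanted, not the hits
        "aggs": {"groupby": {"terms": {"field": by}, "aggs": sub_aggs}},
    }


body = groupby_count_body("dayOfWeek", ["AvgTicketPrice"])

# Hand-written bucket in the shape a terms aggregation returns:
# two documents fall in the bucket, but only one has AvgTicketPrice set.
sample_bucket = {"key": 0, "doc_count": 2, "AvgTicketPrice_count": {"value": 1}}

per_field_count = sample_bucket["AvgTicketPrice_count"]["value"]
print(sample_bucket["doc_count"], per_field_count)
```

If pandas skips NaN per column when counting, the per-field value is the one that lines up with it, not doc_count.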

@V1NAY8
Contributor

V1NAY8 commented Oct 17, 2020

Yeah, I thought of that.
While investigating, I found that this fails:

# Flights index
ed_df.agg(['count'])
# Raises an exception

In _map_pd_aggs_to_es_aggs:

  • We are mapping pd_agg 'count' to es_agg 'count'.
  • We actually have to map 'count' to 'value_count'.
  • This is a bug in aggregations.

Does value_count work as expected for all types of fields?
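Roughly, the fix I have in mind looks like the sketch below. This is a simplified stand-in, not the actual eland function: the table PD_TO_ES_AGG, the function signature, and the error message are assumptions made for illustration.

```python
# Hypothetical lookup from pandas aggregation names to Elasticsearch
# metric aggregation names. 'count' has no same-named ES aggregation,
# so it has to be translated to 'value_count'.
PD_TO_ES_AGG = {
    "count": "value_count",
    "mean": "avg",
    "min": "min",
    "max": "max",
    "sum": "sum",
}


def map_pd_aggs_to_es_aggs(pd_aggs):
    """Translate a list of pandas agg names, failing loudly on unknown ones."""
    es_aggs = []
    for pd_agg in pd_aggs:
        if pd_agg not in PD_TO_ES_AGG:
            raise NotImplementedError(f"unsupported aggregation: {pd_agg!r}")
        es_aggs.append(PD_TO_ES_AGG[pd_agg])
    return es_aggs


print(map_pd_aggs_to_es_aggs(["count", "mean"]))
```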

@sethmlarson
Contributor Author

Seems like a potential bug to me :) If you wanna fix it all in one go, that'd be good. I'm pretty sure value_count works for what we want.

2 participants