Freedman-Diaconis histogram #60312

Closed
GopherJ opened this issue Jul 28, 2020 · 3 comments
Labels
:Analytics/Aggregations, >enhancement, stalled, Team:Analytics

Comments


GopherJ commented Jul 28, 2020

Choosing a histogram interval can be hard. The Freedman-Diaconis rule is a good way to calculate the number of bins from the data itself, so the histogram reflects the distribution better. Sadly, it's not supported in Elasticsearch.

@GopherJ added the >enhancement and needs:triage labels Jul 28, 2020
@polyfractal added the :Analytics/Aggregations label and removed the needs:triage label Jul 28, 2020
elasticmachine (Collaborator) commented

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine added the Team:Analytics label Jul 28, 2020
polyfractal (Contributor) commented

Would be nice, I agree!

Similar to #50386 and #50120, I think this would require multi-pass agg support because we'd need a single pass over the data to determine the IQR first, then a second pass to bucket the documents.

Could expose a parameter if the user provides the IQR and count, but that's not very user-friendly.
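
For illustration only, here is a minimal Python sketch of the two-pass idea, outside Elasticsearch. It assumes the usual statement of the Freedman-Diaconis rule, bin width = 2 * IQR / n^(1/3). The function names `freedman_diaconis_width` and `two_pass_histogram` are hypothetical, and the first function is roughly what a "user provides the IQR and count" parameter would have to compute.

```python
import math
import statistics
from collections import Counter


def freedman_diaconis_width(iqr, count):
    """Bin width from the Freedman-Diaconis rule: 2 * IQR / n^(1/3)."""
    if iqr <= 0 or count <= 0:
        raise ValueError("iqr and count must both be positive")
    return 2.0 * iqr / count ** (1.0 / 3.0)


def two_pass_histogram(values):
    """Hypothetical helper: pass 1 finds the IQR, pass 2 buckets the values."""
    # Pass 1: quartiles and count over the whole data set.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    width = freedman_diaconis_width(q3 - q1, len(values))

    # Pass 2: assign each value to the bucket whose lower bound is
    # the largest multiple of `width` not exceeding the value.
    buckets = Counter(math.floor(v / width) * width for v in values)
    return width, dict(sorted(buckets.items()))


if __name__ == "__main__":
    width, buckets = two_pass_histogram(list(range(1, 9)))
    print(width, buckets)
```

The bucket width cannot be known until every value has been seen once, which is why this does not fit a single-pass aggregation.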

Side note: the new variable width histogram might interest you, and we're planning to do an "auto-histogram" (#31828) similar to auto-date-histo which merges together buckets using some kind of deterministic scheme (powers of 10 or something). Not as nice as something like Freedman-Diaconis, but possible to do in a single pass.
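
To make the single-pass alternative concrete, here is a rough Python sketch of one possible deterministic merging scheme. It is not the planned Elasticsearch implementation: the class name `AutoHistogram`, the `target_buckets` parameter, and the choice to multiply the interval by 10 whenever there are too many buckets are all assumptions made for illustration.

```python
import math
from collections import Counter


class AutoHistogram:
    """Hypothetical single-pass histogram that coarsens by powers of 10."""

    def __init__(self, target_buckets=10, initial_interval=1.0):
        if target_buckets < 2:
            # Values on both sides of zero always occupy two buckets.
            raise ValueError("target_buckets must be at least 2")
        self.target_buckets = target_buckets
        self.interval = initial_interval
        self.counts = Counter()  # bucket index -> document count

    def add(self, value):
        self.counts[math.floor(value / self.interval)] += 1
        # Too many buckets: merge them until we are back under the target.
        while len(self.counts) > self.target_buckets:
            self._coarsen()

    def _coarsen(self):
        # Ten adjacent old buckets collapse into one new bucket.
        self.interval *= 10
        merged = Counter()
        for index, count in self.counts.items():
            merged[index // 10] += count
        self.counts = merged

    def buckets(self):
        # Map each bucket's lower bound to its count.
        return {i * self.interval: c for i, c in sorted(self.counts.items())}


if __name__ == "__main__":
    hist = AutoHistogram(target_buckets=10)
    for v in range(100):
        hist.add(v)
    print(hist.interval, hist.buckets())
```

Because merging only needs the existing bucket counts, no second pass over the documents is required. The trade-off is that the interval can only take values on the fixed power-of-10 ladder rather than adapting to the spread of the data.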

Cheers for the request! Will keep this in mind as we consider multi-pass aggs!

wchaparro (Member) commented

This is not something we plan to implement in aggregations in the near future, and it has been superseded by our focus on ESQL. Closing as not planned. If you feel strongly about this one, please let us know.

@wchaparro closed this as not planned Jul 2, 2024