✨ Rewrite AggregationNode to use pyarrow.table group_by #39

Closed
joocer opened this issue Mar 16, 2022 · 3 comments
Labels: Next Release (planned for next release)

Comments

@joocer
Contributor

joocer commented Mar 16, 2022

Test performance first; in initial comparison testing, the existing code ran faster.

@joocer joocer assigned joocer and unassigned joocer Mar 16, 2022
@joocer joocer added this to the 0.2 milestone Apr 14, 2022
@joocer
Contributor Author

joocer commented Jun 18, 2022

Keep both implementations unless one is a clear winner on both performance and reliability, and use cost and hints to choose which one to run.
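
One way the choice between two implementations could be made is sketched below. Everything here is hypothetical: the names `Strategy`, `STRATEGIES` and `choose_strategy`, the strategy labels "legacy" and "arrow", and the cost formulas are placeholders for illustration, not measured values or anything taken from the project.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Strategy:
    """A named aggregation implementation with a relative cost estimate."""
    name: str
    estimate_cost: Callable[[int], float]  # row count -> relative cost


# Placeholder cost models (assumptions, not benchmarks): the "arrow" path is
# given a fixed start-up overhead but a lower per-row cost.
STRATEGIES = {
    "legacy": Strategy("legacy", lambda rows: rows * 1.0),
    "arrow": Strategy("arrow", lambda rows: 5_000 + rows * 0.5),
}


def choose_strategy(rows: int, hint: Optional[str] = None) -> Strategy:
    """Pick an implementation: an explicit hint wins, otherwise lowest cost."""
    if hint is not None:
        if hint not in STRATEGIES:
            raise ValueError(f"unknown aggregation hint: {hint!r}")
        return STRATEGIES[hint]
    return min(STRATEGIES.values(), key=lambda s: s.estimate_cost(rows))


small = choose_strategy(100)                # cost-based choice, small input
large = choose_strategy(1_000_000)          # cost-based choice, large input
forced = choose_strategy(100, hint="arrow")  # the hint overrides the cost
print(small.name, large.name, forced.name)
```

Letting a hint take precedence over the estimate gives a way to force either path when the cost model turns out to be wrong for a particular workload.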

@joocer joocer removed this from the 0.2 milestone Jul 2, 2022
@joocer joocer added the Next Release Planned for next release label Aug 8, 2022
@joocer
Contributor Author

joocer commented Aug 8, 2022

Part of #42

@joocer
Contributor Author

joocer commented Aug 19, 2022

Done, with three follow-up tasks discovered:

  • more regression tests need to be written (the existing ones pass, except one that covered non-standard behaviour)
  • performance needs to be checked against production volumes
  • confirm the approach doesn't exhaust memory on production volumes

@joocer joocer self-assigned this Aug 19, 2022
@joocer joocer changed the title [FEATURE] Rewrite AggregationNode to use pyarrow.table group_by ✨ Rewrite AggregationNode to use pyarrow.table group_by Aug 20, 2022
joocer added a commit that referenced this issue Aug 20, 2022
joocer added a commit that referenced this issue Aug 21, 2022
@joocer joocer closed this as completed Aug 21, 2022
joocer added a commit that referenced this issue Aug 21, 2022