Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor groupby to use classes #155

Merged
merged 1 commit into from
Aug 19, 2024

Conversation

mjpieters
Copy link
Contributor

The implementation is closely modelled on the cpython C implementation of the groupby iterator. This should be easier to reason about by humans and code scanners alike.

Closes #96

Copy link
Owner

@maxfischer2781 maxfischer2781 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work! Indeed this is easier to reason about.

There's one thing missing that's needed for the async world but not the regular groupby: an aclose method on GroupBy and _Grouper. The generator implementation got them for free (though I'm not entirely sure whether the group.aclose behaviour was actually correct...).

As far as I can tell, _Grouper.aclose should just set state.current_group to None. GroupBy.aclose should close the current group as well as the iterator.

@maxfischer2781 maxfischer2781 mentioned this pull request Aug 13, 2024
@mjpieters
Copy link
Contributor Author

Are you sure? The GroupBy class is not an async generator, it's an async iterator; aclose() is there for automatic finalizing on loop exit for async generators, where the interpreter automatically registers aclose() and a first-iteration hook whenever one is created.

What will call aclose() on these? Unless you are proposing we also create a first-iteration hook and call sys.set_asyncgen_hooks(...)?

@mjpieters
Copy link
Contributor Author

Okay, I just re-familiarised myself with the asyncstdlib iterator scoping document. I may not yet fully understand what code takes responsibility for calling aclose(), but I'm happy to implement these.

@mjpieters
Copy link
Contributor Author

Implementations added; GroupBy delegates this to the _GroupByState instance (as it holds the iterator and the current group).

The implementation is closely modelled on the cpython C implementation of the groupby iterator.
@maxfischer2781
Copy link
Owner

Thanks, this looks good to go!

What will call aclose() on these? Unless you are proposing we also create a first-iteration hook and call sys.set_asyncgen_hooks(...)?

The …_asyncgen_hooks and async generator aclose are the safety net if iterators are just dropped. The various explicit aclose methods are there so that people can force prompt cleanup when they need it.

@maxfischer2781 maxfischer2781 merged commit cb745e8 into maxfischer2781:master Aug 19, 2024
11 checks passed
@mjpieters mjpieters deleted the refacored_groupby branch August 20, 2024 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Switch itertools.groupby closures to class implementation
2 participants