Add cardinality control for groupby benchs with flat types #15134
Thanks for kicking this off! You might also consider adding nvbench axes for cardinality for the existing benchmarks. If we started with a default value of
data_profile profile =
  data_profile_builder()
    .cardinality(cardinality)
    .no_validity()
    .distribution(cudf::type_to_id<int32_t>(), distribution_id::UNIFORM, 0, size);
The code that creates the keys and values here seems to be very similar to (if not the same as) the code that creates them for groupby max. Can we extract it into a common function, like create_keys_values()?
The duplicated part is just the data profile construction. Although we repeat those 5 lines of code each time, they amount to a single API call. It's not worth creating a helper function that wraps one call, IMO.
/merge
Description
Contributes to #15114
This PR adds cardinality control to the group_max, group_nunique and group_rank benchmarks.
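For readers unfamiliar with the term, "cardinality" here refers to the number of distinct key values, which bounds the number of groups a groupby can produce. The sketch below is a plain C++ illustration of that idea only. It is not the cudf data generator and does not use `data_profile_builder`; the names `make_keys` and `count_groups` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not part of cudf): draw `num_rows` keys uniformly from
// the range [0, cardinality), so the result holds at most `cardinality`
// distinct values.
std::vector<std::int32_t> make_keys(std::size_t num_rows,
                                    std::int32_t cardinality,
                                    std::uint32_t seed = 42)
{
  assert(cardinality >= 1);  // at least one distinct key is required
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::int32_t> pick(0, cardinality - 1);
  std::vector<std::int32_t> keys(num_rows);
  for (auto& k : keys) { k = pick(rng); }
  return keys;
}

// Number of groups a groupby over `keys` would produce, i.e. the number of
// distinct key values.
std::size_t count_groups(std::vector<std::int32_t> const& keys)
{
  return std::unordered_set<std::int32_t>(keys.begin(), keys.end()).size();
}
```

With a fixed row count, sweeping the cardinality from small to large moves a groupby from a few very large groups to many small ones, which is the dimension these benchmarks can now vary.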