Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Refactor experimental/row_operators.cuh and make it default #12593

Open
1 of 4 tasks
GregoryKimball opened this issue Jan 23, 2023 · 1 comment
Open
1 of 4 tasks
Assignees
Labels
2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code.

Comments

@GregoryKimball
Copy link
Contributor

GregoryKimball commented Jan 23, 2023

Is your feature request related to a problem? Please describe.
libcudf contains two sets of row operators: legacy row operators for simple types and experimental row operators for complex types. When we have completed "Part 1" of #11844, then we can safely refactor the experimental row operators to be the default, and drop the table::experimental namespace

Describe the solution you'd like
Ultimately we will deprecate the legacy row operators and move the experimental row operators out of the experimental namespace. Please note that the new equality and lexicographic comparators will include "fast paths" for simple types (see #11330 and #11129), so the legacy row operators will continue to play a role.

Merge plan for completing the deprecation

Describe alternatives you've considered
n/a

Additional context
See #10186 and follow-on issue #11844 for more information about the nested type comparator project.

@vyasr
Copy link
Contributor

vyasr commented Jan 24, 2023

To be clear, I would expect the "fast paths" that you describe to largely be implemented as template specializations of the new comparators rather than leaving the old comparators in place. Much of the code should be reusable and that's the path we've been able to follow in other places e.g. #11129.

@GregoryKimball GregoryKimball added 2 - In Progress Currently a work in progress and removed 0 - Backlog In queue waiting for assignment labels Jan 26, 2023
rapids-bot bot pushed a commit that referenced this issue Feb 16, 2023
This PR adds a fast path for primitive types similar to `experimental::row::lexicographic`. The compilation impact for building on bare-metal from source with command `./build.sh libcudf tests benchmarks` for baseline `16m43.607s` vs this branch `17m13.987s`.

This PR is a part of #12593.

Algorithms and benchmarks (those that were available are linked) affected by this change:
`experimental::row::equality::self_comparator`
- [x] [`group_nunique`](#12676 (comment))
- [x] [`group_rank_scan`](#12676 (comment))
- [x] [`rank_scan`](#12676 (comment))
- [x] `contains_table`
- [x] [`distinct`](#12676 (comment))
- [x] [`unique`](#12676 (comment))
- [x] [`rank`](#12676 (comment)) 

`experimental::row::equality::two_table_comparator`
- [x] `struct_binary_ops`
- [x] `lists/contains`
- [x] [`contains_scalar`](#12676 (comment)) (This algorithm does not need a primitive type optimization because the enclosing struct already type-dispatches based on nested vs non-nested types)
- [x] `contains_table`
- [x] `one_hot_encode`

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Nghia Truong (https://github.com/ttnghia)
  - David Wendt (https://github.com/davidwendt)

URL: #12676
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

3 participants