support GpuSortArray #2616

Merged: 4 commits merged on Jun 9, 2021


sperlingxx
Collaborator

Closes #2557.

This PR adds GPU support for the sort_array function, using the cuDF method ColumnVector.listSortRows.

Signed-off-by: sperlingxx <lovedreamf@gmail.com>
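
For readers unfamiliar with the semantics being ported, here is a minimal pure-Python reference model of what Spark's `sort_array` returns for each row. It is only a sketch for reasoning about expected results, not the plugin's implementation (which delegates to cuDF). The helper names `sort_array_row` and `sort_array_column` are hypothetical, and NaN ordering for floating-point elements is deliberately left out.

```python
from typing import List, Optional


def sort_array_row(values: Optional[list], ascending: bool = True) -> Optional[list]:
    """Model sort_array for a single row (hypothetical helper, not a plugin API).

    Spark places null elements first when sorting ascending and last when
    sorting descending. A null array stays null.
    NaN ordering for floats is not modelled here.
    """
    if values is None:
        return None
    nulls = [v for v in values if v is None]
    non_nulls = sorted((v for v in values if v is not None), reverse=not ascending)
    return nulls + non_nulls if ascending else non_nulls + nulls


def sort_array_column(rows: List[Optional[list]], ascending: bool = True) -> List[Optional[list]]:
    """Apply the per-row model to every row of an array column."""
    return [sort_array_row(row, ascending) for row in rows]


if __name__ == "__main__":
    column = [[3, None, 1, 2], None, [], [5, 4]]
    print(sort_array_column(column, ascending=True))
    print(sort_array_column(column, ascending=False))
```

A model like this can help when deciding what a CPU-versus-GPU comparison test should expect for null elements and null rows.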
integration_tests/src/main/python/collection_ops_test.py (outdated, resolved)
TypeSig.ARRAY.nested(TypeSig.all)),
("ascendingOrder", TypeSig.BOOLEAN, TypeSig.BOOLEAN)),
(in, conf, p, r) => new BinaryExprMeta[SortArray](in, conf, p, r) {
override def convertToGpu(lhs: Expression, rhs: Expression): GpuExpression = {
Collaborator

If we don't support literal arrays, we have to tag that here so that we fall back to the CPU.

Collaborator Author

I think we can support literal arrays, and I added a corresponding test.

Collaborator

This is hard to test: if both arguments are literals, Spark will optimize the expression before we ever see it and replace the entire thing with a literal array that is already sorted. There is a somewhat hidden config to disable this, but it would take some digging to find it.

Collaborator

What you wrote is good enough for me. The test is likely not testing what we expect it to, but the code looks like it should work fine, so I am OK with it.
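
The thread above does not name the config that stops Spark from folding an all-literal `sort_array` call. One plausible candidate, stated here as an assumption rather than something the reviewers confirmed, is excluding Catalyst's `ConstantFolding` rule through `spark.sql.optimizer.excludedRules`. The sketch below only builds the configuration dictionary; `with_excluded_rule` is a hypothetical helper, and whether this setting is enough to keep the literal expression intact would need to be verified against the query plan.

```python
# Assumption: ConstantFolding is the optimizer rule that pre-sorts the literal array.
CONSTANT_FOLDING_RULE = "org.apache.spark.sql.catalyst.optimizer.ConstantFolding"
EXCLUDED_RULES_KEY = "spark.sql.optimizer.excludedRules"


def with_excluded_rule(conf: dict, rule: str = CONSTANT_FOLDING_RULE) -> dict:
    """Return a copy of conf with `rule` appended to the excluded optimizer rules.

    The config value is a comma-separated list of rule class names, so any
    rules already present are preserved and duplicates are avoided.
    """
    merged = dict(conf)
    existing = [r.strip() for r in merged.get(EXCLUDED_RULES_KEY, "").split(",") if r.strip()]
    if rule not in existing:
        existing.append(rule)
    merged[EXCLUDED_RULES_KEY] = ",".join(existing)
    return merged


if __name__ == "__main__":
    # The resulting dict could be passed to a Spark session or test harness as conf.
    print(with_excluded_rule({"spark.sql.shuffle.partitions": "4"}))
```

Keeping the helper free of any Spark dependency makes it easy to reuse in a test that then checks the physical plan still contains the sort_array expression.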

@sameerz added the "feature request" label on Jun 7, 2021
sperlingxx and others added 2 commits June 8, 2021 09:39
Co-authored-by: Gera Shegalov <gshegalov@nvidia.com>
Signed-off-by: sperlingxx <lovedreamf@gmail.com>
@sperlingxx
Collaborator Author

build

@sperlingxx merged commit a50e369 into NVIDIA:branch-21.08 on Jun 9, 2021
@sperlingxx deleted the gpu_sort_array branch on June 9, 2021 02:02
tgravescs pushed a commit that referenced this pull request Jun 9, 2021
Labels: feature request
Linked issue (may be closed by merging this pull request): [FEA] support sort_array on GPU
5 participants