-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] SortOrder supports arrays #2470
Labels
cudf_dependency
An issue or PR with this label depends on a new feature in cudf
feature request
New feature or request
Comments
wbo4958
added
feature request
New feature or request
? - Needs Triage
Need team to review and classify
labels
May 21, 2021
wbo4958
added a commit
to wbo4958/spark-rapids
that referenced
this issue
May 21, 2021
If some rows of order-by columns (it's `a,b,c` in the test) are equal, then it may fail because CPU and GPU can't guarantee the order for the same rows, while lead/lag is typically depending on row's order. The solution is we should add the aggregation column `d` and the default column `d_default` columns into the order-by to guarantee the order. But for now, sorting on array has not been supported yet, see NVIDIA#2470. So this PR just skip the test Signed-off-by: Bobby Wang <wbo4958@gmail.com>
pxLi
pushed a commit
that referenced
this issue
May 21, 2021
If some rows of order-by columns (it's `a,b,c` in the test) are equal, then it may fail because CPU and GPU can't guarantee the order for the same rows, while lead/lag is typically depending on row's order. The solution is we should add the aggregation column `d` and the default column `d_default` columns into the order-by to guarantee the order. But for now, sorting on array has not been supported yet, see #2470. So this PR just skip the test Signed-off-by: Bobby Wang <wbo4958@gmail.com>
sameerz
added
cudf_dependency
An issue or PR with this label depends on a new feature in cudf
and removed
? - Needs Triage
Need team to review and classify
labels
May 25, 2021
nartal1
pushed a commit
to nartal1/spark-rapids
that referenced
this issue
Jun 9, 2021
If some rows of order-by columns (it's `a,b,c` in the test) are equal, then it may fail because CPU and GPU can't guarantee the order for the same rows, while lead/lag is typically depending on row's order. The solution is we should add the aggregation column `d` and the default column `d_default` columns into the order-by to guarantee the order. But for now, sorting on array has not been supported yet, see NVIDIA#2470. So this PR just skip the test Signed-off-by: Bobby Wang <wbo4958@gmail.com>
nartal1
pushed a commit
to nartal1/spark-rapids
that referenced
this issue
Jun 9, 2021
If some rows of order-by columns (it's `a,b,c` in the test) are equal, then it may fail because CPU and GPU can't guarantee the order for the same rows, while lead/lag is typically depending on row's order. The solution is we should add the aggregation column `d` and the default column `d_default` columns into the order-by to guarantee the order. But for now, sorting on array has not been supported yet, see NVIDIA#2470. So this PR just skip the test Signed-off-by: Bobby Wang <wbo4958@gmail.com>
Sorting non-nested arrays is now supported in libcudf so I believe that this can be partially fixed if needed. Support for sorting nested arrays may need to wait further. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
cudf_dependency
An issue or PR with this label depends on a new feature in cudf
feature request
New feature or request
Is your feature request related to a problem? Please describe.
if the type of column
d
is Array, then any expressions and plans with SortOrder will not be accelerated by GPUThe text was updated successfully, but these errors were encountered: