Fix the issue of exporting Column RDD [databricks] #4335
build
Ideally the test should be written as an integration test using the public ColumnarRdd
API to replicate what user code would do, but I'm OK with this being a followup.
Please retarget to 22.02, as we are in code freeze for 21.12 and this is not a data corruption, crash, or customer critical fix (at this time).
The base branch was changed.
Signed-off-by: Bobby Wang <wbo4958@gmail.com>
1bbb6c4 to 6b6a0a1
Hmm, Seems the user can't get |
build
Thanks for moving to 22.02.
Premerge failed in what appears to be an unrelated Python env error. Rekicking.
build
build
This PR is trying to fix #4334.
Starting with 3.1.x (inclusive), ColumnarRDD(df) can no longer extract an RDD[Table] directly. Instead, it goes through a columnar-to-row conversion followed by a row-to-columnar conversion, which hurts performance. It turned out that exportColumnRdd is not passed to GpuColumnarToRowExecParent.
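The symptom described above can be illustrated with a toy model. This is only a sketch and not the plugin's actual Scala code: the class `ColumnarToRowParent`, the two factory functions, the step names, and the assumption that the flag defaults to false when it is not forwarded are all hypothetical, chosen just to show how a dropped flag sends the export down the slower round-trip path.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnarToRowParent:
    """Hypothetical stand-in for the parent exec node; it only models the flag."""
    child: str
    # Assumed default: the value the parent sees when the flag is not forwarded.
    export_column_rdd: bool = False


def make_node_without_flag(child: str, export_column_rdd: bool) -> ColumnarToRowParent:
    # Models the reported problem: the flag is accepted but never passed on.
    return ColumnarToRowParent(child)


def make_node_with_flag(child: str, export_column_rdd: bool) -> ColumnarToRowParent:
    # Models the intended behavior: the flag reaches the parent node.
    return ColumnarToRowParent(child, export_column_rdd)


def export_steps(node: ColumnarToRowParent) -> List[str]:
    """Return the steps taken to hand columnar data back to the caller."""
    if node.export_column_rdd:
        # Columnar data is exported directly from the child.
        return [node.child]
    # Otherwise the data takes a round trip through rows.
    return [node.child, "columnar_to_row", "row_to_columnar"]


if __name__ == "__main__":
    print(export_steps(make_node_without_flag("gpu_scan", True)))
    print(export_steps(make_node_with_flag("gpu_scan", True)))
```

In this model the caller asks for a columnar export in both cases; only the second factory forwards the request, so only it avoids the two extra conversion steps.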