Re-enable the struct support for the ORC reader. #3262
Also add tests for the nested predicate pushdown, and the support for nested column pruning. Relevant PRs: NVIDIA#3079 NVIDIA#2887 Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
There is still an issue in cuDF for a corner case when reading a struct column; the fix is rapidsai/cudf#9060.
Moving to draft because of the open blocking issue rapidsai/cudf#9060.
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
signed-off-by: Firestarman <firestarmanllc@gmail.com>
Looks good to me, just waiting on the cudf issue.
build
The fix rapidsai/cudf#9060 has been merged.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOrcScan.scala
All the code looks fine, just a few minor nits that I can live without.
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
Thanks for the review. I decided to address the nits in this PR because one of them concerns the code change.
LGTM, nit
fileSchema.getCategory == readSchema.getCategory && {
  if (readSchema.getChildren != null) {
    readSchema.getChildren.asScala.forall(rc =>
      fileSchema.getChildren.asScala.exists(fc => isSchemaCompatible(fc, rc)))
  } else {
    false
  }
nit:
- fileSchema.getCategory == readSchema.getCategory && {
-   if (readSchema.getChildren != null) {
-     readSchema.getChildren.asScala.forall(rc =>
-       fileSchema.getChildren.asScala.exists(fc => isSchemaCompatible(fc, rc)))
-   } else {
-     false
-   }
+ fileSchema.getCategory == readSchema.getCategory &&
+   readSchema.getChildren != null &&
+   readSchema.getChildren.asScala.forall(rc =>
+     fileSchema.getChildren.asScala.exists(fc => isSchemaCompatible(fc, rc)))
Thanks for the review. I will update this in another PR.
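For anyone following the thread, here is a minimal, self-contained sketch of the flattened check from the suggestion above. It is not the code in GpuOrcScan.scala: `TypeNode` and `SchemaCompat` are hypothetical stand-ins for ORC's `TypeDescription`, children are matched purely by structure (the model has no field names), and the leading equality short-circuit is an assumption about how the surrounding method handles identical types.

```scala
// Hypothetical stand-in for ORC's TypeDescription: a category plus children.
// `children` is null for primitive types, mirroring getChildren returning null.
final case class TypeNode(category: String, children: List[TypeNode] = null)

object SchemaCompat {
  // A read schema is treated as compatible when it equals the file schema, or
  // when both have the same category and every child requested by the read
  // schema has at least one compatible child in the file schema. This lets a
  // pruned struct (a subset of the file's fields) pass the check.
  def isSchemaCompatible(fileSchema: TypeNode, readSchema: TypeNode): Boolean =
    fileSchema == readSchema || // assumption: identical types are compatible
      (fileSchema.category == readSchema.category &&
        readSchema.children != null &&
        fileSchema.children != null && // extra guard needed only in this toy model
        readSchema.children.forall(rc =>
          fileSchema.children.exists(fc => isSchemaCompatible(fc, rc))))
}
```

With a file schema of `TypeNode("struct", List(TypeNode("int"), TypeNode("string")))`, a read schema that keeps only the string child passes the check, while one asking for a `double` child does not. Chaining the null test with `&&` gives the same result as the `if`/`else false` form, because `&&` short-circuits before `getChildren` is dereferenced.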
This PR adds struct support to the ORC reader, along with tests for nested predicate pushdown, and support for nested column pruning.
fixes #2879
fixes #1481
fixes #463
The old relevant PRs:
#3079
#2887
Signed-off-by: Firestarman firestarmanllc@gmail.com