Add support for decimal types in ORC writer #8198
…ea-orc-writer-decimal
pytest lgtm
I wanted to mark this as a partial review, but I know everything else is very ORC- and implementation-specific.
One minor comment; apart from that, looks good.
rerun tests
@gpucibot merge
Closes #8159, #7126
The current implementation uses an array to hold the exact size of each encoded element before the encode step. This simplifies the encoding (each element is encoded independently) and lets us allocate streams of the exact size instead of the worst case. The process differs from that of other types because decimal data streams do not use RLE encoding.
Will add benchmarks once the data generator can produce decimal data.
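For readers less familiar with the approach, here is a rough CPU-side sketch of the size-then-encode idea from the description. It is not the code in this PR, which runs on the GPU. It assumes the decimal data stream stores each unscaled value as a zigzag base-128 varint, which is my reading of the ORC format, and all function names (`varint_size`, `encode_at`, `encode_decimal_stream`) are hypothetical.

```python
from itertools import accumulate


def zigzag(value: int) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return value * 2 if value >= 0 else -value * 2 - 1


def varint_size(value: int) -> int:
    """Number of bytes the zigzag base-128 varint of `value` occupies."""
    u = zigzag(value)
    size = 1
    while u >= 0x80:
        u >>= 7
        size += 1
    return size


def encode_at(buf: bytearray, offset: int, value: int) -> None:
    """Write the varint of `value` into `buf` starting at `offset`."""
    u = zigzag(value)
    while u >= 0x80:
        buf[offset] = (u & 0x7F) | 0x80  # low 7 bits plus continuation bit
        u >>= 7
        offset += 1
    buf[offset] = u  # last byte has no continuation bit


def encode_decimal_stream(unscaled_values):
    # Pass 1: exact encoded size of every element.
    sizes = [varint_size(v) for v in unscaled_values]
    # Prefix sum of the sizes gives each element's start offset;
    # the final entry is the exact total stream size.
    offsets = [0] + list(accumulate(sizes))
    stream = bytearray(offsets[-1])  # exact size, not worst case
    # Pass 2: each element writes only to its own byte range, so the
    # encodes do not depend on each other.
    for value, offset in zip(unscaled_values, offsets):
        encode_at(stream, offset, value)
    return sizes, bytes(stream)


sizes, stream = encode_decimal_stream([0, -1, 1, 64, -65, 300])
```

Because every offset is known before any byte is written, the second pass could run in any order or in parallel, which is the property the description relies on.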