ARROW-104: [FORMAT] Add alignment and padding requirements + union clarification #67
…reflect using only 1 type buffer
@@ -10,6 +10,8 @@ concepts, here is a small glossary to help disambiguate.
* Contiguous memory region: a sequential virtual address space with a given
length. Any byte can be reached via a single pointer offset less than the
region's length.
* Contiguous memory buffer: A contiguous memory region that stores
a multi-value component of an Array. Sometimes referred to as just "buffer".
Would this read better as "a variable length component of an Array"?
I think either is okay. Could give as an example an array of integers of some type (e.g. signed int8 or signed int32) and length
I'll leave as is. Hopefully the rest of the document serves as an example.
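For readers who want the two glossary terms made concrete, here is a minimal sketch in Python; it is not part of the patch. It assumes a `bytes` object can stand in for a contiguous memory region, and that the values of a length-3 signed int32 array are stored in it as the buffer. The helper name `byte_at` is hypothetical.

```python
import struct


def byte_at(region: bytes, offset: int) -> int:
    """Return the byte at `offset`; valid offsets are less than the region's length."""
    if not 0 <= offset < len(region):
        raise IndexError("offset must be less than the region's length")
    return region[offset]


# A buffer holding the values of a length-3 signed int32 array (little-endian).
values = [1, -2, 3]
buffer = struct.pack("<3i", *values)

# Every byte is reachable through a single offset into the region.
first_byte = byte_at(buffer, 0)
last_byte = byte_at(buffer, len(buffer) - 1)

# Reading the buffer back as int32 recovers the array values.
decoded = list(struct.unpack("<3i", buffer))
```

The same layout idea applies to the signed int8 case mentioned above; only the format code passed to `struct.pack` would change.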
This all seems reasonable to me. @toddlipcon, @jacques-n, @Ippokratis7, @bigdata-memory, and others, would you give a look when you are able?
Any additional feedback on this?
Since this reflects the discussion on the mailing list, I'm signing off on this as the default alignment, and we can revisit it if there are lingering concerns. Should other minimum alignments be needed, we can most likely address that in the metadata. +1, thank you.
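The thread does not restate the agreed default alignment, so the sketch below takes it as a parameter. The 8-byte default and the names `padded_length` and `pad_buffer` are assumptions for illustration, not values quoted from this conversation. It shows one common way to round a buffer length up to an alignment multiple and zero-pad the tail.

```python
def padded_length(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment (a power of two)."""
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (nbytes + alignment - 1) & ~(alignment - 1)


def pad_buffer(data: bytes, alignment: int = 8) -> bytes:
    """Append zero bytes so the buffer length is a multiple of alignment."""
    return data + b"\x00" * (padded_length(len(data), alignment) - len(data))
```

If a different minimum alignment were carried in metadata, as suggested above, it could be passed as the `alignment` argument instead of relying on the assumed default.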
Support isnull, isnotnull, equal, and not_equal for date/time types
Support date/time types for less_than, less_than_or_equal_to, greater_than, greater_than_or_equal_to
Implement all extractXxx functions
I was also able to remove the `-Wno-unused-value` compiler flag. Removing `-Wno-unused-variable` will have to take place in another patch (more work required).
Author: Wes McKinney <wesm@apache.org>
Closes apache#67 from wesm/PARQUET-463 and squashes the following commits:
da3afb2 [Wes McKinney] Fix signed-unsigned comparisons inside dchecks
a1ca479 [Wes McKinney] Remove -Wno-unused-value
0b49cc6 [Wes McKinney] Adapt simple dcheck macros from Kudu, fix dcheck failures
* Support casting boolean to bigint (apache#60)
* remove log4j as it's not used (apache#61)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* Add stripe iteration support for batch_size reading in the ORC Scanner (apache#63)
* Install re2 headers (apache#66)
Co-authored-by: PHILO-HE <feilong.he@intel.com>
Co-authored-by: zhixingheyi-tian <xiangxiang.shen@intel.com>
I believe this change captures the discussion we had on the mailing list about alignment and padding for arrays. It also captures the update to UnionArrays. The rendered version should be viewable here: https://github.com/emkornfield/arrow/blob/emk_format_changes/format/Layout.md