ColumnMetaData should no longer be written inline with data #6115
Labels
enhancement
Any new improvement worthy of an entry in the changelog
parquet
Changes to the parquet crate
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The writing of the thrift ColumnMetaData outside of the Parquet file footer was recently deprecated (apache/parquet-format#440), as was the setting of the ColumnChunk::file_offset field. Also, the ColumnMetaData currently written has incorrect values for dictionary_page_offset and data_page_offset (they are relative to the start of the chunk rather than being offset to their location in the file).
Describe the solution you'd like
The current Parquet spec indicates the file_offset field should be set to 0, and ColumnMetaData should no longer be written inline with the data.
Describe alternatives you've considered
If not removed, the offsets mentioned above should be set to correct values.
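For illustration, correcting the offsets would amount to rebasing each chunk-relative value onto the chunk's starting position in the file. The sketch below is a minimal, self-contained example of that arithmetic. The `PageOffsets` struct and the `rebase_to_file` function are hypothetical names invented here, not part of the parquet crate, and `chunk_start` is assumed to be the absolute file position of the first byte of the column chunk.

```rust
/// Hypothetical holder for the two page offsets carried by ColumnMetaData.
/// Illustration only: this is not a type from the parquet crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PageOffsets {
    /// Offset of the dictionary page, if the chunk has one.
    pub dictionary_page_offset: Option<i64>,
    /// Offset of the first data page.
    pub data_page_offset: i64,
}

/// Rebase chunk-relative offsets to absolute file offsets.
///
/// `chunk_start` is assumed to be the file position of the chunk's first byte.
/// Returns `None` if any addition would overflow an `i64`.
pub fn rebase_to_file(chunk_start: i64, relative: PageOffsets) -> Option<PageOffsets> {
    // The dictionary page is optional, so only rebase it when present.
    let dictionary_page_offset = match relative.dictionary_page_offset {
        Some(off) => Some(chunk_start.checked_add(off)?),
        None => None,
    };
    Some(PageOffsets {
        dictionary_page_offset,
        data_page_offset: chunk_start.checked_add(relative.data_page_offset)?,
    })
}
```

Under these assumptions, a chunk starting at byte 4096 with a dictionary page at relative offset 0 and a first data page at relative offset 120 would end up with file offsets 4096 and 4216.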