Describe the bug
As @tustvold points out, there is a column_order API defined in parquet that is currently entirely ignored by DataFusion.
It is not entirely clear to me what the implications of ignoring this field are, or what other parquet writers populate it with, but we should probably not ignore it.
To Reproduce
No response
Expected behavior
No response
Additional context
To emphasise the point I made when this API was originally proposed, you need more than just the ParquetStatistics in order to correctly interpret the data. You need at least the FileMetaData, so that you can call its column_order method (https://docs.rs/parquet/latest/parquet/file/metadata/struct.FileMetaData.html#method.column_order), in order to be able to even interpret what the statistics mean for a given column.
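To make the concern concrete, here is a minimal, self-contained sketch of why the ordering matters. It does not call the parquet crate: `ColumnOrder` and `SortOrder` below are local stand-ins loosely named after the crate's types, and `stats_usable_for_pruning` is a hypothetical helper, not an existing DataFusion or parquet function. The policy it encodes (only trust min/max when the declared order matches the order the reader will compare with) is an assumption about what a conservative reader might do, not a statement of current behaviour.

```rust
/// Local stand-in for the sort order used to compute a column's min/max.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SortOrder {
    Signed,
    Unsigned,
    Undefined,
}

/// Local stand-in for a per-column `column_order` entry in the file metadata.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnOrder {
    TypeDefinedOrder(SortOrder),
    Undefined,
}

/// Min/max of the raw bytes when compared as unsigned values.
fn min_max_unsigned(values: &[u8]) -> Option<(u8, u8)> {
    let min = values.iter().copied().min()?;
    let max = values.iter().copied().max()?;
    Some((min, max))
}

/// Min/max of the same raw bytes when reinterpreted as signed values.
fn min_max_signed(values: &[u8]) -> Option<(i8, i8)> {
    let min = values.iter().map(|b| *b as i8).min()?;
    let max = values.iter().map(|b| *b as i8).max()?;
    Some((min, max))
}

/// Hypothetical, conservative policy: use the statistics for pruning only
/// when the file declares a defined order and it matches the order the
/// reader is going to compare with.
fn stats_usable_for_pruning(declared: ColumnOrder, reader_order: SortOrder) -> bool {
    match declared {
        ColumnOrder::TypeDefinedOrder(order) => {
            order != SortOrder::Undefined && order == reader_order
        }
        ColumnOrder::Undefined => false,
    }
}
```

For the two bytes `0x01` and `0xFF`, `min_max_unsigned` reports `(1, 255)` while `min_max_signed` reports `(-1, 1)`, so a reader that assumes the wrong order could prune row groups that actually contain matching values.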