Predicate push-down into parquet broken for Date32 columns #649

Closed
yordan-pavlov opened this issue Jun 30, 2021 · 3 comments · Fixed by #690

Describe the bug
Earlier this week I found that predicate push-down into parquet for Date32 columns was broken in PR #426

I found that this was caused by missing branches in impl TryFrom<&DataType> for ScalarValue (https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/scalar.rs#L924),
which is used in get_min_max_values (https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_plan/parquet.rs#L508).

I also found that adding the following lines into the try_from method resolves the issue:

DataType::Date32 => ScalarValue::Date32(None),
DataType::Date64 => ScalarValue::Date64(None),
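
For context, here is a minimal sketch of where those two lines would sit. The `DataType` and `ScalarValue` enums below are simplified stand-ins written for illustration (they are not the real Arrow / DataFusion definitions, which have many more variants), and the `String` error type and the `Binary` variant are assumptions made to keep the example self-contained.

```rust
use std::convert::TryFrom;

/// Simplified stand-in for Arrow's `DataType` (hypothetical, only a few variants).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Utf8,
    Date32,
    Date64,
    Binary, // stands in for any type that has no matching branch
}

/// Simplified stand-in for DataFusion's `ScalarValue` (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    Int32(Option<i32>),
    Utf8(Option<String>),
    Date32(Option<i32>), // days since the Unix epoch
    Date64(Option<i64>), // milliseconds since the Unix epoch
}

impl TryFrom<&DataType> for ScalarValue {
    // Assumption: a plain String error keeps the sketch dependency-free.
    type Error = String;

    /// Builds a typed null scalar for the given data type.
    fn try_from(datatype: &DataType) -> Result<Self, Self::Error> {
        Ok(match datatype {
            DataType::Int32 => ScalarValue::Int32(None),
            DataType::Utf8 => ScalarValue::Utf8(None),
            // The two branches proposed in this issue:
            DataType::Date32 => ScalarValue::Date32(None),
            DataType::Date64 => ScalarValue::Date64(None),
            // Without a branch, the conversion fails for that type.
            other => return Err(format!("no null ScalarValue for {:?}", other)),
        })
    }
}
```

With the two date branches in place, `ScalarValue::try_from(&DataType::Date32)` succeeds with a typed null instead of falling through to the error arm.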

To Reproduce

  • Filter on a Date32 column in a parquet data source.
  • The statistics column(s) generated for the filtered Date32 column(s) will be all null.

Expected behavior
Statistics column(s) generated for Date32 columns from a parquet data source should not be all null
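
To illustrate why all-null statistics matter, here is a rough sketch of min/max row-group pruning for an equality filter on a Date32 column. It is not DataFusion's actual pruning code; `Date32Stats`, `may_contain_eq` and `row_groups_to_read` are hypothetical names, and Date32 values are modelled as `i32` days since the Unix epoch.

```rust
/// Min/max statistics for one row group of a Date32 column (hypothetical type).
/// `None` models the "all null" statistics described in this issue.
#[derive(Debug, Clone, Copy)]
pub struct Date32Stats {
    pub min: Option<i32>,
    pub max: Option<i32>,
}

/// Returns true if the row group might contain rows where the column equals `target`.
pub fn may_contain_eq(stats: &Date32Stats, target: i32) -> bool {
    match (stats.min, stats.max) {
        // Known bounds: keep the row group only if `target` lies within them.
        (Some(min), Some(max)) => min <= target && target <= max,
        // Unknown bounds: the row group cannot be skipped safely.
        _ => true,
    }
}

/// Indices of the row groups that still have to be read for `col = target`.
pub fn row_groups_to_read(all: &[Date32Stats], target: i32) -> Vec<usize> {
    all.iter()
        .enumerate()
        .filter(|(_, stats)| may_contain_eq(stats, target))
        .map(|(index, _)| index)
        .collect()
}
```

In this sketch, a row group whose statistics are null is always kept, so when every statistic is null no row group is skipped and the push-down has no effect.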

Additional context
n/a

@yordan-pavlov yordan-pavlov added the bug Something isn't working label Jun 30, 2021

alamb commented Jul 1, 2021

I can fix this @yordan-pavlov if you are not already doing so

@yordan-pavlov

Yes please @alamb, it would be great if you could; I have been working on more improvements for the parquet arrow reader so my local repo isn't very clean.

@alamb alamb self-assigned this Jul 1, 2021

alamb commented Jul 1, 2021

I will do so @yordan-pavlov
