Describe the bug
If you run a query in DataFusion against parquet files, it will create several unnecessary temporary files.
IOx also hits the same thing (with the same root cause): https://github.com/influxdata/influxdb_iox/issues/3507#issuecomment-1023679575
There are several places that (non-obviously) create a DiskManager instance today. The one that hits the parquet use case above is in the creation of the pruning predicate, which requires an ExecutionContext: https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_optimizer/pruning.rs#L132
This has two problems:
I propose a two-pronged solution (will propose two PRs):
I think the second will be a slightly larger project, as it gets passed to create_physical_expr.
Though I think the main source of the problem is related to create_physical_expr, which only uses the context to look up vars, if necessary.
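To make the root cause concrete, here is a minimal sketch of one way a disk manager could avoid touching the filesystem until something actually needs a temporary file. It is an illustration under assumptions, not DataFusion's actual DiskManager and not necessarily what the proposed PRs do: the `LazyDiskManager` type, its methods, and the `sketch-spill-<pid>` directory name are all hypothetical.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical stand-in for a disk manager (NOT DataFusion's `DiskManager`).
pub struct LazyDiskManager {
    /// Parent directory under which a scratch directory may be created.
    parent: PathBuf,
    /// Scratch directory; stays `None` until a temp file is first requested.
    scratch: Option<PathBuf>,
}

impl LazyDiskManager {
    /// Constructing the manager does not touch the filesystem.
    pub fn new(parent: impl Into<PathBuf>) -> Self {
        Self {
            parent: parent.into(),
            scratch: None,
        }
    }

    /// Returns true once the scratch directory has been created.
    pub fn has_created_dir(&self) -> bool {
        self.scratch.is_some()
    }

    /// Creates the scratch directory on first use, then creates an empty
    /// file named `name` inside it and returns its path.
    pub fn create_tmp_file(&mut self, name: &str) -> io::Result<PathBuf> {
        if self.scratch.is_none() {
            // Hypothetical naming scheme; the process id keeps runs apart.
            let dir = self
                .parent
                .join(format!("sketch-spill-{}", std::process::id()));
            fs::create_dir_all(&dir)?;
            self.scratch = Some(dir);
        }
        let path = self.scratch.as_ref().unwrap().join(name);
        fs::File::create(&path)?;
        Ok(path)
    }
}

impl Drop for LazyDiskManager {
    /// Removes the scratch directory, if one was ever created.
    fn drop(&mut self) {
        if let Some(dir) = self.scratch.take() {
            let _ = fs::remove_dir_all(dir);
        }
    }
}
```

With a shape like this, code that builds a context only to look up variables (as create_physical_expr does) would never call `create_tmp_file`, so constructing the manager as a side effect would leave nothing behind on disk.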
cc @yjshen I plan to work on these items today
alamb