Projection pushdown removes unqualified column names even when they are used #617
Comments
FYI @houqp -- I believe I have found the problem, but now I need to sort out one other failure before I can make a PR.
🤔 I think this may be more complicated than I thought. I am not sure whether I am creating the plans incorrectly or whether projection pushdown is doing the wrong thing. I need to study this some more.
So at least initially the answer appears to be "I should be creating fully qualified column references". Here is the change I needed to make in IOx:
  let exprs = input
      .schema()
      .fields()
      .iter()
-     .map(|field| logical_plan::col(field.name()))
+     .map(|field| Expr::Column(field.qualified_column()))
      .collect::<Vec<_>>();
This was a non-obvious failure mode, however, and I can probably figure out a better way to detect such errors. Anyhow, closing for now.
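To make the difference between the two `.map(...)` lines concrete, here is a minimal sketch using only the standard library. `ToyColumn` and its methods are hypothetical stand-ins invented for illustration, not DataFusion's types. It only shows why a reference with no relation and a reference with one are distinct values under plain equality.

```rust
// Hypothetical toy model of a column reference; NOT DataFusion's `Column`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ToyColumn {
    // `None` models an unqualified reference such as `a`;
    // `Some("test")` models a qualified reference such as `test.a`.
    relation: Option<String>,
    name: String,
}

impl ToyColumn {
    // Models building a reference from a bare field name.
    fn unqualified(name: &str) -> Self {
        ToyColumn { relation: None, name: name.to_string() }
    }

    // Models building a reference that carries its table qualifier.
    fn qualified(relation: &str, name: &str) -> Self {
        ToyColumn {
            relation: Some(relation.to_string()),
            name: name.to_string(),
        }
    }

    // Human-readable form: `relation.name` when qualified, else `name`.
    fn flat_name(&self) -> String {
        match &self.relation {
            Some(r) => format!("{}.{}", r, self.name),
            None => self.name.clone(),
        }
    }
}
```

In this toy model, `ToyColumn::unqualified("a")` and `ToyColumn::qualified("test", "a")` compare as unequal, so any lookup keyed on the full value treats them as different columns.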
Hmm... the projection pushdown should be able to handle unqualified columns without a problem, i.e. the current design doesn't require clients to add qualifiers to all the fields. Do you have a reproducible example I can play with? I definitely didn't test the change with extension nodes.
@houqp I will get one.
Here is a reproducer (in projection_push_down.rs) -- it can also be found on https://github.com/alamb/arrow-datafusion/tree/alamb/repro_projection_pruning
#[test]
fn table_scan_projected_schema_non_qualified_relation() -> Result<()> {
    let table_scan = test_table_scan()?;
    let input_schema = table_scan.schema();
    assert_eq!(3, input_schema.fields().len());
    assert_fields_eq(&table_scan, vec!["a", "b", "c"]);

    // Build the LogicalPlan directly (don't use PlanBuilder), so
    // that the Column references are unqualified (e.g. their
    // relation is `None`). PlanBuilder resolves the expressions
    let expr = vec![col("a"), col("b")];
    let projected_fields = exprlist_to_fields(&expr, input_schema).unwrap();
    let projected_schema = DFSchema::new(projected_fields).unwrap();
    let plan = LogicalPlan::Projection {
        expr,
        input: Arc::new(table_scan),
        schema: Arc::new(projected_schema),
    };

    assert_fields_eq(&plan, vec!["a", "b"]);

    let expected = "Projection: #a, #b\
    \n  TableScan: test projection=Some([0, 1])";

    assert_optimized_plan_eq(&plan, expected);

    Ok(())
}
It fails in the following way:
The problem is that the ... I will put up a PR with code to fix the problem.
Describe the bug
Problem:
I have a (user defined node) that looks like
Prior to #55 this worked
After that PR, the projection pushdown logic decides that because the col1, col2 and col3 references don't have the table qualifier t1 on them, they can be removed, and the optimized plan looks like
This then fails because the plan expects col2 and col3 to be present, but they have been "optimized" out.
Expected behavior
The table scan should include col2 and col3 (they should not be optimized out), in addition to col1.
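The behavior described in this report can be sketched with a small toy model. Everything here is an assumption made for illustration: `ColumnRef`, `col` and `kept_indices` are hypothetical names, and the lenient matching rule is just one plausible way to treat unqualified references, not necessarily what the eventual fix does. The sketch shows how an exact-match lookup against a qualified schema prunes columns that are only referenced without a qualifier.

```rust
use std::collections::HashSet;

// Hypothetical toy column reference; NOT DataFusion's `Column`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ColumnRef {
    relation: Option<String>,
    name: String,
}

// Helper to build a reference with an optional table qualifier.
fn col(relation: Option<&str>, name: &str) -> ColumnRef {
    ColumnRef {
        relation: relation.map(str::to_string),
        name: name.to_string(),
    }
}

// Returns the indices of the schema fields that survive pruning.
//
// With `match_unqualified == false`, a field is kept only if the exact
// (relation, name) pair is required. With `match_unqualified == true`,
// a required reference without a relation also keeps any field that
// has the same name.
fn kept_indices(
    schema: &[ColumnRef],
    required: &HashSet<ColumnRef>,
    match_unqualified: bool,
) -> Vec<usize> {
    schema
        .iter()
        .enumerate()
        .filter(|(_, field)| {
            required.contains(*field)
                || (match_unqualified
                    && required
                        .iter()
                        .any(|r| r.relation.is_none() && r.name == field.name))
        })
        .map(|(i, _)| i)
        .collect()
}
```

With a schema of t1.col1 through t1.col4 and required references col1, col2 and col3 that carry no qualifier, the strict mode keeps nothing, while the lenient mode keeps indices 0, 1 and 2, which corresponds to the expected behavior stated above.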