[FEATURE] Projection Pushdown (Parquet) #196
Comments
Only implement this for the external readers (blob, NoSQL and SQL readers).
This is probably going to be harder than just collecting all the tokens labelled as identifiers, especially when there are joins, subqueries, or aliases.
Implement hint
Do this on the first page: gather all the fields it contains, intersect them with the selected fields, and use the result on subsequent page reads. This gives no benefit on small (single-page) datasets, but those are not the datasets that would benefit from this anyway. A '*' in the field list should disable the optimization. This should be reflected in EXPLAIN. Note that NATURAL JOIN, when implemented, should add a '*' to the field list. This should NOT be the same approach taken for other data types.
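A minimal sketch of the first-page approach described above, not the project's actual implementation. The names `plan_projection` and `read_pages` are hypothetical, pages are modelled as lists of dict rows, and filtering dict keys stands in for a real column-pruned read.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set


def plan_projection(selected: Iterable[str], first_page_fields: Iterable[str]) -> Optional[Set[str]]:
    """Return the columns to request on later page reads, or None to read everything."""
    selected_set = set(selected)
    # A '*' in the field list disables the optimization.
    if "*" in selected_set:
        return None
    # Only request columns that the first page showed actually exist.
    return selected_set & set(first_page_fields)


def read_pages(pages: Iterable[List[Dict]], selected: Iterable[str]) -> Iterator[Dict]:
    """Yield rows, pruning columns from the second page onwards."""
    selected = list(selected)
    projection: Optional[Set[str]] = None
    for index, page in enumerate(pages):
        if index == 0:
            # The first page is read in full to discover the available fields.
            fields = {key for row in page for key in row}
            projection = plan_projection(selected, fields)
            yield from page
        elif projection is None:
            # Optimization disabled: pass rows through untouched.
            yield from page
        else:
            # Stand-in for asking the reader for only these columns.
            for row in page:
                yield {k: v for k, v in row.items() if k in projection}
```

With two pages and `selected=["a"]`, the first page comes back whole and the second carries only `a`, which matches the note that single-page datasets see no benefit.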
This may conflict with the schema evolution feature: what happens if we select a column that doesn't exist? Maybe we need to wrap the read in a try and do more expensive work if it fails.
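One way the try-then-fallback idea could look, as a sketch under assumptions: `make_reader` is a hypothetical stand-in for a page reader, and `KeyError` is an assumed failure signal (a real reader may raise a different exception). Filling absent columns with `None` is also an assumption about the desired behaviour.

```python
from typing import Callable, Dict, List, Optional


def make_reader(rows: List[Dict]) -> Callable[[Optional[List[str]]], List[Dict]]:
    """Hypothetical reader: returns all columns, or only the requested ones."""
    def read_page(columns: Optional[List[str]] = None) -> List[Dict]:
        if columns is None:
            return [dict(row) for row in rows]
        # Raises KeyError when a requested column is not present.
        return [{name: row[name] for name in columns} for row in rows]
    return read_page


def read_with_fallback(read_page, columns: List[str]) -> List[Dict]:
    """Try the pruned read first; fall back to a full read if a column is missing."""
    try:
        return read_page(columns)
    except KeyError:
        # More expensive path: read everything, then fill absent columns with None.
        rows = read_page(None)
        return [{name: row.get(name) for name in columns} for row in rows]
```

The cheap path is taken whenever every requested column exists, so the extra work is only paid for files whose schema has drifted.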
Can the field list be converted to a set earlier, and only once? Can we use the schema to update the field list set?
FEATURE/#196 - Initial Projection Pushdown (Parquet only)
Push down the projection to the read step.
This should improve performance by handling less data in the processing steps.