Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve docs for TableProvider::supports_filters_pushdown and remove deprecated function #9923

Merged
merged 5 commits into from
Apr 5, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Apr 3, 2024

Which issue does this PR close?

Closes #9922
part of #7013

Rationale for this change

While debugging something with @pauldix yesterday, we were somewhat confused with the old API (as the default impl calls the deprecated function) and the docs weren't clear

We figured it out, but it was a paper cut that caused more confusion that it should have

What changes are included in this PR?

  1. Improve docs for TableProvider::supports_filters_pushdown
  2. Improve docs for TableProviderFilterPushDown
  3. Remove similarly named, but deprecated TableProvider::supports_filter_pushdown

Are these changes tested?

Yes by CI

Are there any user-facing changes?

Docs, and the deprecated API is now removed

@github-actions github-actions bot added logical-expr Logical plan and expressions core Core DataFusion crate labels Apr 3, 2024
@@ -1430,12 +1430,12 @@ impl TableProvider for DataFrameTableProvider {
Some(&self.plan)
}

fn supports_filter_pushdown(
fn supports_filters_pushdown(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switches to using the non deprecated but very similarly named API

/// to optimise data retrieval.
/// Note: the returned vector much have the same size as the filters argument.
#[allow(deprecated)]
/// Specify if DataFusion should provide filter expressions to the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to improve the docs about what this was doing

@@ -154,28 +158,31 @@ pub trait TableProvider: Sync + Send {
limit: Option<usize>,
) -> Result<Arc<dyn ExecutionPlan>>;

/// Tests whether the table provider can make use of a filter expression
/// to optimise data retrieval.
#[deprecated(since = "20.0.0", note = "use supports_filters_pushdown instead")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been deprecated for quite a while

@alamb alamb added the documentation Improvements or additions to documentation label Apr 3, 2024
@phillipleblanc
Copy link
Contributor

I was similarly confused when I first ran across this before figuring it out. Thanks for fixing.

datafusion/core/src/datasource/provider.rs Outdated Show resolved Hide resolved
datafusion/expr/src/table_source.rs Outdated Show resolved Hide resolved
Co-authored-by: Phillip LeBlanc <phillip@leblanc.tech>
@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Apr 3, 2024
Copy link
Contributor Author

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review @phillipleblanc

cc @avantgardnerio as I think you maybe added the original API way back

@avantgardnerio avantgardnerio self-requested a review April 3, 2024 16:25
Copy link
Contributor

@avantgardnerio avantgardnerio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@alamb alamb merged commit 308ebc5 into apache:main Apr 5, 2024
24 checks passed
@alamb
Copy link
Contributor Author

alamb commented Apr 5, 2024

Thanks again @avantgardnerio and @phillipleblanc

@alamb alamb deleted the alamb/filter_pushdown branch April 5, 2024 11:14
Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this pull request May 13, 2024
andygrove pushed a commit to apache/datafusion-python that referenced this pull request May 14, 2024
* chore: upgrade datafusion Deps

Ref #690

* update concat and concat_ws to use datafusion_functions

Moved in apache/datafusion#10089

* feat: upgrade functions.rs

Upstream is continuing it's migration to UDFs.

Ref apache/datafusion#10098
Ref apache/datafusion#10372

* fix ScalarUDF import

* feat: remove deprecated suppors_filter_pushdown and impl supports_filters_pushdown

Deprecated function removed in apache/datafusion#9923

* use `unnest_columns_with_options` instead of deprecated `unnest_column_with_option`

* remove ScalarFunction wrappers

These relied on upstream BuiltinScalarFunction, which are now removed.

Ref apache/datafusion#10098

* update dataframe `test_describe`

`null_count` was fixed upstream.

Ref apache/datafusion#10260

* remove PyDFField and related methods

DFField was removed upstream.

Ref: apache/datafusion#9595

* bump `datafusion-python` package version to 38.0.0

* re-implement `PyExpr::column_name`

The previous implementation relied on `DFField` which was removed upstream.

Ref: apache/datafusion#9595
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve documentation about FiltersPushdown
3 participants