feat: function name hints for UDFs #9407

Merged
merged 6 commits into from
Mar 10, 2024

Conversation

SteveLauC
Contributor

Which issue does this PR close?

Closes #9392.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the sql (SQL Planner), logical-expr (Logical plan and expressions), optimizer (Optimizer rules), and core (Core DataFusion crate) labels Mar 1, 2024
Contributor

@alamb alamb left a comment

Thanks @SteveLauC -- this is looking very cool

@@ -85,6 +85,10 @@ pub trait ContextProvider {

/// Get configuration options
fn options(&self) -> &ConfigOptions;

fn udfs(&self) -> HashMap<String, Arc<ScalarUDF>>;
Contributor

This sort of exposes details of the implementation (e.g. a HashMap)

What about potentially just returning the names following the model of CatalogProvider or SchemaProvider: https://docs.rs/datafusion/latest/datafusion/catalog/schema/trait.SchemaProvider.html#tymethod.table_names

Something like

Suggested change
- fn udfs(&self) -> HashMap<String, Arc<ScalarUDF>>;
+ /// returns all udf names
+ fn udf_names(&self) -> Vec<&str>;

(or maybe Strings)
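
To illustrate the suggestion above, here is a minimal sketch of a provider that keeps its `HashMap` private and exposes only names. `ScalarUdf`, `FunctionCatalog`, and `InMemoryCatalog` are hypothetical stand-ins invented for this example, not DataFusion types, and returning owned `String`s is an assumption that mirrors `table_names`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a scalar UDF; not DataFusion's `ScalarUDF`.
pub struct ScalarUdf {
    pub name: String,
}

/// Hypothetical, simplified provider trait: callers see names only,
/// never the container that stores the functions.
pub trait FunctionCatalog {
    /// Returns the names of all registered scalar UDFs.
    fn udf_names(&self) -> Vec<String>;
}

/// Example implementation backed by a private `HashMap`.
#[derive(Default)]
pub struct InMemoryCatalog {
    udfs: HashMap<String, Arc<ScalarUdf>>,
}

impl InMemoryCatalog {
    /// Registers a UDF under its own name, replacing any previous entry.
    pub fn register(&mut self, udf: ScalarUdf) {
        self.udfs.insert(udf.name.clone(), Arc::new(udf));
    }
}

impl FunctionCatalog for InMemoryCatalog {
    fn udf_names(&self) -> Vec<String> {
        let mut names: Vec<String> = self.udfs.keys().cloned().collect();
        // HashMap iteration order is unspecified, so sort for stable output.
        names.sort();
        names
    }
}
```

Because the trait only promises a list of names, an implementation could later switch to a different container without breaking callers.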

Contributor Author

Nice! I love consistency! Will do it.

Contributor Author

Do you want this function to return a Vec<&str> or a Vec<String>? The suggested change uses Vec<&str>, but the table_names() function in the referenced link returns Vec<String>.

@SteveLauC SteveLauC force-pushed the feat/fn_name_hint_for_scalar_udf branch from 0675113 to 40cd987 Compare March 3, 2024 05:31
Contributor

@alamb alamb left a comment

I think this PR also needs tests -- however, I think when you merge this branch up from main, you should have to update at least one test in a sqllogictest file, so you probably don't have to write anything new

// All aggregate functions and builtin window functions
AggregateFunction::iter()
.map(|func| func.to_string())
.chain(BuiltInWindowFunction::iter().map(|func| func.to_string()))
Contributor

Would we want to add the ctx.udafs() and ctx.udwfs() here as well?

Contributor Author

done
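
For readers following along, the request above amounts to chaining user-defined names onto the built-in ones. The sketch below assumes the names are already available as strings; `window_candidates` and its parameters are hypothetical and only show the iterator pattern, not the code in this PR.

```rust
/// Collects every name that could be suggested for a window-function call:
/// built-in aggregates, built-in window functions, then user-defined
/// aggregate (UDAF) and window (UDWF) function names.
///
/// Hypothetical helper for illustration; the inputs are plain strings.
pub fn window_candidates(
    builtin_aggregates: &[&str],
    builtin_windows: &[&str],
    udaf_names: Vec<String>,
    udwf_names: Vec<String>,
) -> Vec<String> {
    builtin_aggregates
        .iter()
        .map(|name| name.to_string())
        .chain(builtin_windows.iter().map(|name| name.to_string()))
        // `Vec<String>` implements `IntoIterator`, so it can be chained directly.
        .chain(udaf_names)
        .chain(udwf_names)
        .collect()
}
```

The order of the chain only matters if the caller breaks ties by position; otherwise any order gives the same candidate set.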

Contributor

alamb commented Mar 10, 2024

Hi @SteveLauC -- how is this PR going? Anything we can help with?

@SteveLauC SteveLauC force-pushed the feat/fn_name_hint_for_scalar_udf branch from 40cd987 to 3a3c173 Compare March 10, 2024 09:45
@SteveLauC SteveLauC marked this pull request as ready for review March 10, 2024 10:07
@SteveLauC SteveLauC requested a review from alamb March 10, 2024 10:08
@github-actions github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) label Mar 10, 2024
Contributor

@alamb alamb left a comment

Thank you @SteveLauC -- I tried this out and it worked great

❯ select abs2(1);
Error during planning: Invalid function 'abs2'.
Did you mean 'abs'?
❯
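
A hint like the one in this session can be produced by ranking the known function names by edit distance to the unknown name. The sketch below is one way to do that, assuming a plain Levenshtein distance and a case-insensitive comparison; `levenshtein` and `suggest` are hypothetical helpers written for this example and are not claimed to match the planner's actual code.

```rust
/// Levenshtein (edit) distance between two strings, computed over chars
/// with the classic two-row dynamic-programming table.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // Distance from the empty prefix of `a` to each prefix of `b`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut curr = Vec::with_capacity(b_chars.len() + 1);
        curr.push(i + 1);
        for (j, cb) in b_chars.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != *cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b_chars.len()]
}

/// Returns the candidate closest to `input`, or `None` if there are no
/// candidates. Hypothetical helper; comparison ignores case.
pub fn suggest<'a>(input: &str, candidates: &'a [String]) -> Option<&'a str> {
    let input = input.to_lowercase();
    candidates
        .iter()
        .min_by_key(|candidate| levenshtein(&candidate.to_lowercase(), &input))
        .map(|candidate| candidate.as_str())
}
```

With candidates `abs`, `acos`, and `to_timestamp_seconds`, calling `suggest("abs2", &candidates)` picks `abs`, since it is a single insertion away from the input.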

I also merged up from main and added a test for the functionality (undid the change from #9388)

@@ -483,7 +483,7 @@ statement error Did you mean 'arrow_typeof'?
SELECT arrowtypeof(v1) from test;

# Scalar function
- statement error Invalid function 'to_timestamps_second'
+ statement error Did you mean 'to_timestamp_seconds'?
Contributor

Updated test here

@alamb alamb merged commit f1f0965 into apache:main Mar 10, 2024
24 checks passed
Contributor

alamb commented Mar 10, 2024

Thanks again @SteveLauC

@SteveLauC SteveLauC deleted the feat/fn_name_hint_for_scalar_udf branch March 10, 2024 12:39
Labels
core (Core DataFusion crate), logical-expr (Logical plan and expressions), optimizer (Optimizer rules), sql (SQL Planner), sqllogictest (SQL Logic Tests (.slt))
Development

Successfully merging this pull request may close these issues.

Add ScalarUDFs in missing function hints / suggested errors
3 participants