Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace array_contains with SQL array functions: array_has, array_has_any, array_has_all #6990

Merged
merged 17 commits into from
Jul 22, 2023

Conversation

jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Jul 17, 2023

Which issue does this PR close?

Ref #6980

Close #6973
Close #6976
Close #6977

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jul 17, 2023
Copy link
Contributor Author

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why I got these three files after regen.sh

proto_descriptor.bin
src/datafusion.rs serde.rs
@izveigor Did you have these three files when you run ./datafusion/proto/regen.sh before?

@izveigor
Copy link
Contributor

izveigor commented Jul 18, 2023

@jayzhan211 Afrer all the calls the function regen.sh I have never got the files. I think it would be better to add them to .gitignore to avoid such a situation in the future.

concat_inner_lists(array_concat(&[as_list_array(&arg)?
.values()
.clone()])?)
/// Array_has_any SQL function
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not find a way that combines has_any and has_all without losing readability.

@jayzhan211 jayzhan211 marked this pull request as ready for review July 20, 2023 08:21
@jayzhan211
Copy link
Contributor Author

@izveigor @alamb Ready for review, thanks

@izveigor
Copy link
Contributor

@jayzhan211 Nice features! Could you add documentation (files: docs/source/user-guide/expressions.md and docs/source/user-guide/sql/scalar_functions.md) and aliases (such as array_contains and prefix list_) for the cool new functions?

@alamb alamb changed the title Introduce SQL array function: array has Introduce SQL array functions: array_has, array_has_any, array_has_all Jul 20, 2023
@alamb alamb changed the title Introduce SQL array functions: array_has, array_has_any, array_has_all Introduce SQL array functions: array_has, array_has_any, array_has_all, remove array_contains Jul 20, 2023
@alamb alamb changed the title Introduce SQL array functions: array_has, array_has_any, array_has_all, remove array_contains Replace array_contains with SQL array functions: array_has, array_has_any, array_has_all Jul 20, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add list_has, list_has_any, list_has_all aliases to be consistent with #7008 from @izveigor ?

For anyone else following along array_contains was introduced by @izveigor in #6618 so I don't think it has been released yet (and thus this isn't an API change)

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jayzhan211

I think this PR looks good to me other than the panic's instead of errors. Once those are fixed I think this one is good to go.

Also, thank you to both you and @izveigor for pushing these functions ahead -- it is great to see DataFusion grow so nicely

non_list_contains!(array, args[1], BooleanArray)
}
_ => {
todo!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please return DataFusionError::NotYetImplemented here rather than todo!(), which will panic?

.iter()
.dedup()
.flatten()
.any(|x| x == elem.expect("null type not supported"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, can we please try and avoid panic's on not supported errors -- it makes for a much easier to understand error message for users

@alamb
Copy link
Contributor

alamb commented Jul 20, 2023

I think we should also update the user guide https://github.com/apache/arrow-datafusion/blob/main/docs/source/user-guide/sql/scalar_functions.md#array-functions as well to include these functions

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Copy link
Contributor

@izveigor izveigor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks, @jayzhan211!

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jayzhan211 and @izveigor

@alamb alamb merged commit 49c91b5 into apache:main Jul 22, 2023
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement array_has function Implement array_has_any function Implement array_has_all function
4 participants