feat: array functions treat an array as an element #6986

Merged
merged 2 commits into from
Jul 18, 2023
Conversation

izveigor
Contributor

Which issue does this PR close?

Closes #6985
Follow on to #6879
Follow on to #6805

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

Yes

@github-actions github-actions bot added physical-expr Physical Expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jul 16, 2023
@@ -554,28 +595,36 @@ select array_concat(column1, column2) from arrays_values_v2;
[11, 12]
NULL

# TODO: Concat columns with different dimensions fails
Contributor Author

I was able to solve this problem.
cc @jayzhan211
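
To make the mixed-dimension case concrete, here is a minimal sketch of one way to align dimensions before concatenating: wrap the lower-dimensional argument in single-element lists until every argument has the same depth, then join the top-level elements. This is a simplified model in plain Rust and is not DataFusion's implementation; the `Value` enum and the `ndims`, `align`, and `concat_aligned` functions are hypothetical names introduced only for this illustration.

```rust
/// Hypothetical nested value used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<Value>),
}

/// Number of list levels, following the first element (0 for a scalar).
/// An empty list counts as one level.
fn ndims(v: &Value) -> usize {
    match v {
        Value::Int(_) => 0,
        Value::List(items) => 1 + items.first().map_or(0, ndims),
    }
}

/// Wrap `v` in single-element lists until it has `target` dimensions.
fn align(mut v: Value, target: usize) -> Value {
    while ndims(&v) < target {
        v = Value::List(vec![v]);
    }
    v
}

/// Align every argument to the deepest one, then join their top-level elements.
fn concat_aligned(args: Vec<Value>) -> Value {
    let target = args.iter().map(ndims).max().unwrap_or(0);
    let mut out = Vec::new();
    for arg in args {
        match align(arg, target) {
            Value::List(items) => out.extend(items),
            scalar => out.push(scalar),
        }
    }
    Value::List(out)
}
```

Under this model, concatenating the 2D value `[[1, 2], [3, 4]]` with the 1D value `[5, 6]` first turns the second argument into `[[5, 6]]`, so the 1D array ends up as one element of the 2D result. Because alignment targets the deepest argument, the order of the arguments does not matter in this sketch.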

.collect();
let field = Arc::new(Field::new("item", data_type, true));

aligned_array = Arc::new(ListArray::try_new(
Contributor Author

Is there a way to pass NULLs into ListArray::try_new while keeping the values? 🤔
@tustvold, could you give me some advice if you have free time?

Contributor

The last argument to try_new is the Option<NullBuffer>, right? It defines which elements are null.

Perhaps I don't understand your question
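
As an illustration of the point about the null argument, the sketch below models a list array in plain Rust: a flat values buffer, offsets that delimit each slot, and a separate optional validity mask. It is a simplified stand-in and not the arrow-rs `ListArray`; the `ListModel` type and its methods are hypothetical. It shows that the values stay untouched while nullness is supplied alongside them.

```rust
/// Simplified stand-in for a list array (hypothetical; not the arrow-rs type).
struct ListModel {
    values: Vec<i64>,            // flat child values, kept as-is
    offsets: Vec<usize>,         // length = number of slots + 1
    validity: Option<Vec<bool>>, // None means every slot is valid
}

impl ListModel {
    /// Validates the parts and builds the model; nullness is passed separately
    /// from the values, in the spirit of a trailing null-mask argument.
    fn try_new(
        values: Vec<i64>,
        offsets: Vec<usize>,
        validity: Option<Vec<bool>>,
    ) -> Result<Self, String> {
        if offsets.is_empty() {
            return Err("offsets must contain at least one entry".to_string());
        }
        if offsets.windows(2).any(|w| w[0] > w[1]) {
            return Err("offsets must be non-decreasing".to_string());
        }
        if *offsets.last().unwrap() > values.len() {
            return Err("last offset exceeds the number of values".to_string());
        }
        if let Some(mask) = &validity {
            if mask.len() != offsets.len() - 1 {
                return Err("validity mask length must match slot count".to_string());
            }
        }
        Ok(Self { values, offsets, validity })
    }

    /// Number of list slots.
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Returns `None` for a null slot, otherwise that slot's slice of values.
    fn get(&self, i: usize) -> Option<&[i64]> {
        let valid = self.validity.as_ref().map_or(true, |m| m[i]);
        if valid {
            Some(&self.values[self.offsets[i]..self.offsets[i + 1]])
        } else {
            None
        }
    }
}
```

For example, building the model with values `[1, 2, 3, 4, 5]`, offsets `[0, 2, 2, 5]`, and a mask of `[true, false, true]` gives three slots in which the middle one reads as null, while the values buffer itself is unchanged.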


# array_concat column-wise #11 (1D + Integers)
Contributor Author

This looks like a regression, but PostgreSQL does not support concatenation between scalars (integers) and arrays.
Should we keep this feature or is it better to follow the PostgreSQL standard? 🤔
What do you think about it, @jayzhan211?

Contributor

I think we don't need to support this.

[[]]
[[]]

# array_concat column-wise #10 (3D + 2D + 1D)
Contributor Author

When I was writing these examples, I suddenly found a bug: #6992

# select array_concat(make_array(column3), column4) from arrays_values_v2;
# array_concat column-wise #9 (2D + 1D)
query ?
select array_concat(column4, make_array(column3)) from arrays_values_v2;
Contributor Author

When I was trying to use make_array with lists, I accidentally found a new bug: #6993

Contributor

@jayzhan211 jayzhan211 Jul 18, 2023

Does 1D + 2D pass? Do only descending-order dimensions work?

@izveigor
Contributor Author

Ready for review, @alamb and @jayzhan211!
When I was writing the code, I found new bugs related to the topic of arrays: #6993 and #6992.

@alamb
Contributor

alamb commented Jul 17, 2023

Thank you for this PR @izveigor -- I ran out of time today but I have this PR on my list to review tomorrow

Contributor

@jayzhan211 jayzhan211 left a comment

LGTM, thanks!

Contributor

@alamb alamb left a comment

Thank you @izveigor -- the tests are especially nice.

I think adding some coverage of nulls would improve things, but this PR is an improvement over master, so 👍

statement ok
CREATE TABLE nested_arrays
AS VALUES
(make_array(make_array(1, 2, 3), make_array(2, 9, 1), make_array(7, 8, 9), make_array(1, 2, 3), make_array(1, 7, 4), make_array(4, 5, 6)), make_array(7, 8, 9), 2, make_array([[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]])),
Contributor

I recommend adding NULL values, both as elements and as whole arrays, since that is a corner case that often gets overlooked

Perhaps something like

Suggested change
(make_array(make_array(1, 2, 3), make_array(2, 9, 1), make_array(7, 8, 9), make_array(1, 2, 3), make_array(1, 7, 4), make_array(4, 5, 6)), make_array(7, 8, 9), 2, make_array([[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]])),
(make_array(make_array(1, 2, 3), NULL, make_array(7, 8, 9), make_array(1, 2, 3), make_array(1, 7, 4), make_array(4, 5, 6)), make_array(7, 8, 9), 2, make_array([[NULL, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]])),


@alamb alamb merged commit 0624378 into apache:main Jul 18, 2023
@alamb
Contributor

alamb commented Jul 18, 2023

Thanks @izveigor and @jayzhan211

Labels
core Core DataFusion crate physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Development

Successfully merging this pull request may close these issues.

Array functions should treat an array as an element
3 participants