Implement func array_pop_front #8142

Merged 5 commits on Nov 15, 2023
6 changes: 6 additions & 0 deletions datafusion/expr/src/built_in_function.rs
Expand Up @@ -138,6 +138,8 @@ pub enum BuiltinScalarFunction {
ArrayHasAll,
/// array_has_any
ArrayHasAny,
/// array_pop_front
ArrayPopFront,
/// array_pop_back
ArrayPopBack,
/// array_dims
Expand Down Expand Up @@ -392,6 +394,7 @@ impl BuiltinScalarFunction {
BuiltinScalarFunction::ArrayElement => Volatility::Immutable,
BuiltinScalarFunction::ArrayLength => Volatility::Immutable,
BuiltinScalarFunction::ArrayNdims => Volatility::Immutable,
BuiltinScalarFunction::ArrayPopFront => Volatility::Immutable,
BuiltinScalarFunction::ArrayPopBack => Volatility::Immutable,
BuiltinScalarFunction::ArrayPosition => Volatility::Immutable,
BuiltinScalarFunction::ArrayPositions => Volatility::Immutable,
Expand Down Expand Up @@ -570,6 +573,7 @@ impl BuiltinScalarFunction {
},
BuiltinScalarFunction::ArrayLength => Ok(UInt64),
BuiltinScalarFunction::ArrayNdims => Ok(UInt64),
BuiltinScalarFunction::ArrayPopFront => Ok(input_expr_types[0].clone()),
BuiltinScalarFunction::ArrayPopBack => Ok(input_expr_types[0].clone()),
BuiltinScalarFunction::ArrayPosition => Ok(UInt64),
BuiltinScalarFunction::ArrayPositions => {
Expand Down Expand Up @@ -868,6 +872,7 @@ impl BuiltinScalarFunction {
// for now, the list is small, as we do not have many built-in functions.
match self {
BuiltinScalarFunction::ArrayAppend => Signature::any(2, self.volatility()),
BuiltinScalarFunction::ArrayPopFront => Signature::any(1, self.volatility()),
BuiltinScalarFunction::ArrayPopBack => Signature::any(1, self.volatility()),
BuiltinScalarFunction::ArrayConcat => {
Signature::variadic_any(self.volatility())
Expand Down Expand Up @@ -1512,6 +1517,7 @@ fn aliases(func: &BuiltinScalarFunction) -> &'static [&'static str] {
}
BuiltinScalarFunction::ArrayLength => &["array_length", "list_length"],
BuiltinScalarFunction::ArrayNdims => &["array_ndims", "list_ndims"],
BuiltinScalarFunction::ArrayPopFront => &["array_pop_front", "list_pop_front"],
BuiltinScalarFunction::ArrayPopBack => &["array_pop_back", "list_pop_back"],
BuiltinScalarFunction::ArrayPosition => &[
"array_position",
Expand Down
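The `aliases` hunk above is what makes `list_pop_front` an accepted spelling of `array_pop_front`. A minimal sketch of that idea, assuming a reduced two-variant enum (`ArrayFn`) and a hypothetical `from_name` lookup, neither of which is DataFusion's actual type or lookup code:

```rust
/// Simplified stand-in for two variants of `BuiltinScalarFunction`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrayFn {
    PopFront,
    PopBack,
}

impl ArrayFn {
    /// All variants, used by the linear alias search below.
    pub const ALL: [ArrayFn; 2] = [ArrayFn::PopFront, ArrayFn::PopBack];

    /// Names under which each function can be called (strings taken from the diff).
    pub fn aliases(self) -> &'static [&'static str] {
        match self {
            ArrayFn::PopFront => &["array_pop_front", "list_pop_front"],
            ArrayFn::PopBack => &["array_pop_back", "list_pop_back"],
        }
    }

    /// Hypothetical lookup: return the variant that lists `name` as an alias.
    pub fn from_name(name: &str) -> Option<ArrayFn> {
        Self::ALL
            .iter()
            .copied()
            .find(|f| f.aliases().iter().any(|alias| *alias == name))
    }
}
```

With this sketch, `ArrayFn::from_name("list_pop_front")` and `ArrayFn::from_name("array_pop_front")` resolve to the same variant, while an unregistered name yields `None`.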
8 changes: 8 additions & 0 deletions datafusion/expr/src/expr_fn.rs
Expand Up @@ -590,6 +590,13 @@ scalar_expr!(
"returns the array without the last element."
);

scalar_expr!(
ArrayPopFront,
array_pop_front,
array,
"returns the array without the first element."
);

nary_scalar_expr!(ArrayConcat, array_concat, "concatenates arrays.");
scalar_expr!(
ArrayHas,
Expand Down Expand Up @@ -1166,6 +1173,7 @@ mod test {
test_scalar_expr!(FromUnixtime, from_unixtime, unixtime);

test_scalar_expr!(ArrayAppend, array_append, array, element);
test_scalar_expr!(ArrayPopFront, array_pop_front, array);
test_scalar_expr!(ArrayPopBack, array_pop_back, array);
test_unary_scalar_expr!(ArrayDims, array_dims);
test_scalar_expr!(ArrayLength, array_length, array, dimension);
Expand Down
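The `scalar_expr!` invocation above generates a documented builder function named `array_pop_front`. A rough sketch of how such a macro could work, assuming a heavily reduced `Expr` type and a hypothetical macro name `scalar_expr_sketch!` (the real macro and `Expr` in DataFusion differ):

```rust
/// Simplified expression node; the real `Expr` type is far richer.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    ScalarFunction { fun: &'static str, args: Vec<Expr> },
}

/// Hypothetical, reduced `scalar_expr!`-style macro: it emits one documented
/// builder function that wraps its arguments in a function-call node.
macro_rules! scalar_expr_sketch {
    ($ENUM:ident, $FUNC:ident, $($arg:ident)*, $DOC:expr) => {
        #[doc = $DOC]
        pub fn $FUNC($($arg: Expr),*) -> Expr {
            Expr::ScalarFunction {
                // The variant name is stored as a string to keep the sketch small.
                fun: stringify!($ENUM),
                args: vec![$($arg),*],
            }
        }
    };
}

scalar_expr_sketch!(
    ArrayPopFront,
    array_pop_front,
    array,
    "returns the array without the first element."
);
```

Calling `array_pop_front(Expr::Column("a".to_string()))` then builds a `ScalarFunction` node tagged `"ArrayPopFront"` with a single argument.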
42 changes: 37 additions & 5 deletions datafusion/physical-expr/src/array_expressions.rs
Expand Up @@ -564,13 +564,33 @@ pub fn array_slice(args: &[ArrayRef]) -> Result<ArrayRef> {
define_array_slice(list_array, key, extra_key, false)
}

fn general_array_pop(
list_array: &GenericListArray<i32>,
from_back: bool,
) -> Result<(Vec<i64>, Vec<i64>)> {
if from_back {
let key = vec![0; list_array.len()];
// Attention: `arr.len() - 1` in the extra key defines the last element position (position = index + 1, not inclusive) we want in the new array.
let extra_key: Vec<_> = list_array
.iter()
.map(|x| x.map_or(0, |arr| arr.len() as i64 - 1))
.collect();
Ok((key, extra_key))
} else {
// Attention: 2 in the `key` defines the first element position (position = index + 1) we want in the new array.
// We only handle two cases for the first element position: if the row holds an array, the slice starts from 2 (index + 1); otherwise (NULL row) it starts from the initial position, 0.
let key: Vec<_> = list_array.iter().map(|x| x.map_or(0, |_| 2)).collect();
let extra_key: Vec<_> = list_array
.iter()
.map(|x| x.map_or(0, |arr| arr.len() as i64))
.collect();
Ok((key, extra_key))
}
}
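To make the `key` / `extra_key` arithmetic in `general_array_pop` concrete, here is a plain-Rust sketch over `Vec`s instead of Arrow arrays. The `Row` alias and the `pop_keys` name are assumptions introduced for illustration; only the per-row arithmetic follows the diff.

```rust
/// One row of a list column; `None` models a NULL row.
pub type Row = Option<Vec<i64>>;

/// Mirrors the key computation of `general_array_pop` on plain vectors.
/// Returns `(key, extra_key)`: the 1-based start and end position per row.
pub fn pop_keys(rows: &[Row], from_back: bool) -> (Vec<i64>, Vec<i64>) {
    if from_back {
        // Start at the initial position (0), end one position before the last.
        let key = vec![0; rows.len()];
        let extra_key = rows
            .iter()
            .map(|r| r.as_ref().map_or(0, |v| v.len() as i64 - 1))
            .collect();
        (key, extra_key)
    } else {
        // Start at position 2 (the second element), end at the last position.
        let key = rows.iter().map(|r| r.as_ref().map_or(0, |_| 2)).collect();
        let extra_key = rows
            .iter()
            .map(|r| r.as_ref().map_or(0, |v| v.len() as i64))
            .collect();
        (key, extra_key)
    }
}
```

For the rows `[1, 2, 3]`, NULL and `[]`, popping from the front gives keys `[2, 0, 2]` and extra keys `[3, 0, 0]`; popping from the back gives keys `[0, 0, 0]` and extra keys `[2, 0, -1]`.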

pub fn array_pop_back(args: &[ArrayRef]) -> Result<ArrayRef> {
let list_array = as_list_array(&args[0])?;
- let key = vec![0; list_array.len()];
- let extra_key: Vec<_> = list_array
- .iter()
- .map(|x| x.map_or(0, |arr| arr.len() as i64 - 1))
- .collect();
+ let (key, extra_key) = general_array_pop(list_array, true)?;

define_array_slice(
list_array,
Expand Down Expand Up @@ -695,6 +715,18 @@ pub fn gen_range(args: &[ArrayRef]) -> Result<ArrayRef> {
Ok(arr)
}

pub fn array_pop_front(args: &[ArrayRef]) -> Result<ArrayRef> {
let list_array = as_list_array(&args[0])?;
let (key, extra_key) = general_array_pop(list_array, false)?;

define_array_slice(
list_array,
&Int64Array::from(key),
&Int64Array::from(extra_key),
false,
)
}
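Both pop functions reduce to a slice over 1-based positions. The sketch below shows that reduction on plain slices; `slice_1_based` is a hypothetical helper whose semantics (end-inclusive, a start of 0 treated as position 1, out-of-range bounds clamped) are assumptions made for illustration and not a description of DataFusion's internal `define_array_slice`.

```rust
/// Hypothetical 1-based, end-inclusive slice. A start of 0 is treated as
/// position 1, and an empty or inverted range yields an empty vector.
pub fn slice_1_based<T: Clone>(items: &[T], from: i64, to: i64) -> Vec<T> {
    let start = from.max(1);
    let end = to.min(items.len() as i64);
    if start > end {
        return Vec::new();
    }
    items[(start - 1) as usize..end as usize].to_vec()
}

/// Drop the first element: keep positions 2..=len.
pub fn pop_front<T: Clone>(items: &[T]) -> Vec<T> {
    slice_1_based(items, 2, items.len() as i64)
}

/// Drop the last element: keep positions 1..=len-1.
pub fn pop_back<T: Clone>(items: &[T]) -> Vec<T> {
    slice_1_based(items, 0, items.len() as i64 - 1)
}
```

Because the helper is generic over `T`, the same code covers nested arrays: popping the front of `[[1, 2, 3], [2, 9, 1]]` leaves `[[2, 9, 1]]`, and popping from a one-element or empty array leaves an empty array.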

/// Array_append SQL function
pub fn array_append(args: &[ArrayRef]) -> Result<ArrayRef> {
let list_array = as_list_array(&args[0])?;
Expand Down
3 changes: 3 additions & 0 deletions datafusion/physical-expr/src/functions.rs
Expand Up @@ -359,6 +359,9 @@ pub fn create_physical_fun(
BuiltinScalarFunction::ArrayNdims => {
Arc::new(|args| make_scalar_function(array_expressions::array_ndims)(args))
}
BuiltinScalarFunction::ArrayPopFront => Arc::new(|args| {
make_scalar_function(array_expressions::array_pop_front)(args)
}),
BuiltinScalarFunction::ArrayPopBack => {
Arc::new(|args| make_scalar_function(array_expressions::array_pop_back)(args))
}
Expand Down
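The `create_physical_fun` hunk maps each enum variant to a shared closure. A reduced sketch of that dispatch pattern follows; the `ArrayFn`, `Kernel`, and `create_kernel` names and the `&[i64]` kernel signature are assumptions, and `make_scalar_function` is deliberately left out.

```rust
use std::sync::Arc;

/// Simplified stand-in for the function enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrayFn {
    PopFront,
    PopBack,
}

/// Shared, thread-safe kernel over one list of integers (hypothetical signature).
pub type Kernel = Arc<dyn Fn(&[i64]) -> Vec<i64> + Send + Sync>;

fn pop_front(items: &[i64]) -> Vec<i64> {
    items.iter().skip(1).copied().collect()
}

fn pop_back(items: &[i64]) -> Vec<i64> {
    // `saturating_sub` keeps the empty input from underflowing.
    items[..items.len().saturating_sub(1)].to_vec()
}

/// Pick the kernel for a function tag, one match arm per variant.
pub fn create_kernel(fun: ArrayFn) -> Kernel {
    match fun {
        ArrayFn::PopFront => Arc::new(|args: &[i64]| pop_front(args)),
        ArrayFn::PopBack => Arc::new(|args: &[i64]| pop_back(args)),
    }
}
```

The match is exhaustive, so adding a variant without a matching arm fails to compile, which is why the PR has to touch this file when it introduces `ArrayPopFront`.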
1 change: 1 addition & 0 deletions datafusion/proto/proto/datafusion.proto
Expand Up @@ -638,6 +638,7 @@ enum ScalarFunction {
ArrayUnion = 120;
OverLay = 121;
Range = 122;
ArrayPopFront = 123;
}

message ScalarFunctionNode {
Expand Down
3 changes: 3 additions & 0 deletions datafusion/proto/src/generated/pbjson.rs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions datafusion/proto/src/generated/prost.rs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion datafusion/proto/src/logical_plan/from_proto.rs
Expand Up @@ -66,7 +66,7 @@ use datafusion_expr::{
WindowFrameUnits,
};
use datafusion_expr::{
- array_empty, array_pop_back,
+ array_empty, array_pop_back, array_pop_front,
expr::{Alias, Placeholder},
};
use std::sync::Arc;
Expand Down Expand Up @@ -473,6 +473,7 @@ impl From<&protobuf::ScalarFunction> for BuiltinScalarFunction {
ScalarFunction::Flatten => Self::Flatten,
ScalarFunction::ArrayLength => Self::ArrayLength,
ScalarFunction::ArrayNdims => Self::ArrayNdims,
ScalarFunction::ArrayPopFront => Self::ArrayPopFront,
ScalarFunction::ArrayPopBack => Self::ArrayPopBack,
ScalarFunction::ArrayPosition => Self::ArrayPosition,
ScalarFunction::ArrayPositions => Self::ArrayPositions,
Expand Down Expand Up @@ -1330,6 +1331,9 @@ pub fn parse_expr(
parse_expr(&args[0], registry)?,
parse_expr(&args[1], registry)?,
)),
ScalarFunction::ArrayPopFront => {
Ok(array_pop_front(parse_expr(&args[0], registry)?))
}
ScalarFunction::ArrayPopBack => {
Ok(array_pop_back(parse_expr(&args[0], registry)?))
}
Expand Down
1 change: 1 addition & 0 deletions datafusion/proto/src/logical_plan/to_proto.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1480,6 +1480,7 @@ impl TryFrom<&BuiltinScalarFunction> for protobuf::ScalarFunction {
BuiltinScalarFunction::Flatten => Self::Flatten,
BuiltinScalarFunction::ArrayLength => Self::ArrayLength,
BuiltinScalarFunction::ArrayNdims => Self::ArrayNdims,
BuiltinScalarFunction::ArrayPopFront => Self::ArrayPopFront,
BuiltinScalarFunction::ArrayPopBack => Self::ArrayPopBack,
BuiltinScalarFunction::ArrayPosition => Self::ArrayPosition,
BuiltinScalarFunction::ArrayPositions => Self::ArrayPositions,
Expand Down
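The `from_proto.rs` and `to_proto.rs` hunks keep the protobuf enum and the Rust enum in sync in both directions. A small round-trip sketch, assuming reduced stand-in enums (`BuiltinFn`, `ProtoFn`) and hypothetical `tag` / `from_tag` helpers; the tags 120 and 123 are the ones visible in the `datafusion.proto` hunk:

```rust
/// Simplified stand-in for the logical function enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuiltinFn {
    ArrayUnion,
    ArrayPopFront,
}

/// Simplified stand-in for the protobuf enum, using tags shown in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum ProtoFn {
    ArrayUnion = 120,
    ArrayPopFront = 123,
}

impl ProtoFn {
    /// Wire tag of this variant.
    pub fn tag(self) -> i32 {
        self as i32
    }

    /// Hypothetical decoder: unknown tags yield `None`.
    pub fn from_tag(tag: i32) -> Option<ProtoFn> {
        match tag {
            120 => Some(ProtoFn::ArrayUnion),
            123 => Some(ProtoFn::ArrayPopFront),
            _ => None,
        }
    }
}

impl From<BuiltinFn> for ProtoFn {
    fn from(f: BuiltinFn) -> Self {
        match f {
            BuiltinFn::ArrayUnion => ProtoFn::ArrayUnion,
            BuiltinFn::ArrayPopFront => ProtoFn::ArrayPopFront,
        }
    }
}

impl From<ProtoFn> for BuiltinFn {
    fn from(f: ProtoFn) -> Self {
        match f {
            ProtoFn::ArrayUnion => BuiltinFn::ArrayUnion,
            ProtoFn::ArrayPopFront => BuiltinFn::ArrayPopFront,
        }
    }
}
```

Encoding `BuiltinFn::ArrayPopFront` gives tag 123, and decoding 123 gives the same variant back, which is the property the two mapping hunks preserve for the new function.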
38 changes: 38 additions & 0 deletions datafusion/sqllogictest/test_files/array.slt
Expand Up @@ -826,6 +826,44 @@ select array_pop_back(column1) from arrayspop;
[]
[, 10, 11]

## array_pop_front (aliases: `list_pop_front`)

# array_pop_front scalar function #1
query ??
select array_pop_front(make_array(1, 2, 3, 4, 5)), array_pop_front(make_array('h', 'e', 'l', 'l', 'o'));
----
[2, 3, 4, 5] [e, l, l, o]

# array_pop_front scalar function #2 (after array_pop_front, array is empty)
query ?
select array_pop_front(make_array(1));
----
[]

# array_pop_front scalar function #3 (array_pop_front the empty array)
query ?
select array_pop_front(array_pop_front(make_array(1)));
----
[]

# array_pop_front scalar function #5 (array_pop_front the nested arrays)
query ?
select array_pop_front(make_array(make_array(1, 2, 3), make_array(2, 9, 1), make_array(7, 8, 9), make_array(1, 2, 3), make_array(1, 7, 4), make_array(4, 5, 6)));
----
[[2, 9, 1], [7, 8, 9], [1, 2, 3], [1, 7, 4], [4, 5, 6]]

# array_pop_front scalar function #6 (array_pop_front the nested arrays with NULL)
query ?
select array_pop_front(make_array(NULL, make_array(1, 2, 3), make_array(2, 9, 1), make_array(7, 8, 9), make_array(1, 2, 3), make_array(1, 7, 4)));
----
[[1, 2, 3], [2, 9, 1], [7, 8, 9], [1, 2, 3], [1, 7, 4]]

# array_pop_front scalar function #8 (after array_pop_front, nested array is empty)
query ?
select array_pop_front(make_array(make_array(1, 2, 3)));
----
[]

## array_slice (aliases: list_slice)

# array_slice scalar function #1 (with positive indexes)
Expand Down
1 change: 1 addition & 0 deletions docs/source/user-guide/expressions.md
Expand Up @@ -219,6 +219,7 @@ Unlike to some databases the math functions in Datafusion works the same way as
| flatten(array) | Converts an array of arrays to a flat array `flatten([[1], [2, 3], [4, 5, 6]]) -> [1, 2, 3, 4, 5, 6]` |
| array_length(array, dimension) | Returns the length of the array dimension. `array_length([1, 2, 3, 4, 5]) -> 5` |
| array_ndims(array) | Returns the number of dimensions of the array. `array_ndims([[1, 2, 3], [4, 5, 6]]) -> 2` |
| array_pop_front(array) | Returns the array without the first element. `array_pop_front([1, 2, 3]) -> [2, 3]` |
| array_pop_back(array) | Returns the array without the last element. `array_pop_back([1, 2, 3]) -> [1, 2]` |
| array_position(array, element) | Searches for an element in the array, returns first occurrence. `array_position([1, 2, 2, 3, 4], 2) -> 2` |
| array_positions(array, element) | Searches for an element in the array, returns all occurrences. `array_positions([1, 2, 2, 3, 4], 2) -> [2, 3]` |
Expand Down
25 changes: 25 additions & 0 deletions docs/source/user-guide/sql/scalar_functions.md
Expand Up @@ -1515,6 +1515,7 @@ from_unixtime(expression)
- [array_length](#array_length)
- [array_ndims](#array_ndims)
- [array_prepend](#array_prepend)
- [array_pop_front](#array_pop_front)
- [array_pop_back](#array_pop_back)
- [array_position](#array_position)
- [array_positions](#array_positions)
Expand Down Expand Up @@ -1868,6 +1869,30 @@ array_prepend(element, array)
- list_prepend
- list_push_front

### `array_pop_front`

Returns the array without the first element.

```
array_pop_front(array)
```

#### Arguments

- **array**: Array expression.
Can be a constant, column, or function, and any combination of array operators.

#### Example

```
❯ select array_pop_front([1, 2, 3]);
+--------------------------------+
| array_pop_front(List([1,2,3])) |
+--------------------------------+
| [2, 3]                         |
+--------------------------------+
```

### `array_pop_back`

Returns the array without the last element.
Expand Down