different array behavior with PG's multidimensional array #3811
I guess it was supposed to tolerate JSON values.
BTW, PG is not strict on some other aspects either:
We may not follow PG's array design exactly, but we lack a clear definition of what we support and what users should expect.
Here are more cases:
-- constructing
select array[array[1,2], array[3]];
-- concatenating
select array[array[1,2]] || array[array[3]];
-- batch aggregating
select array_agg(a) from (values (array[array[1,2]]), (array[array[3]])) t(a);
-- streaming aggregating
create table t (a int[][]);
insert into t values (array[array[1,2]]), (array[array[3]]);
create materialized view mv as select array_agg(a) as agg from t;
select * from mv;
Each of these cases fails on PG but succeeds on RW.
Notes about PG's array: https://www.postgresql.org/docs/current/arrays.html
So basically, each array value is an N-dimensional array, but there are no dimension restrictions among the array values in the same column.
They are essentially 2 different types: nested list vs tensor, and can be different in several ways:
Based on usages we have seen so far, our expected behavior would be:
Other PG-compatible DBs' behavior:
Materialize has a separate list type in addition to array.
https://materialize.com/docs/sql/types/array/
CockroachDB supports only one-dimensional arrays.
https://www.cockroachlabs.com/docs/stable/array.html
On Mon, 22 May 2023 at 09:07, xiangjinwu wrote:
They are essentially 2 different types: nested list vs tensor, and can be different in several ways:
- whether array[array[1, 2], array[1]] is an invalid value
- whether array[array[1]]'s type is int[] (rather than int[][])
- whether unnest(array[array[1]]) returns 1 (rather than array[1])
- whether the starting index of a dimension can be 7 or -2 (rather than always 1 or 0)
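To make the nested list vs tensor distinction above concrete, here is a minimal Python sketch that models array values as plain Python lists. It is only an illustration, not RisingWave or PostgreSQL code: the function names `tensor_shape`, `unnest_flatten`, and `unnest_one_level` are hypothetical, and custom lower bounds are not modelled.

```python
def tensor_shape(value):
    """Return the shape of a rectangular nested list, or None if it is jagged."""
    if not isinstance(value, list):
        return ()  # a scalar leaf has an empty shape
    child_shapes = [tensor_shape(v) for v in value]
    if any(shape is None for shape in child_shapes):
        return None  # some child is already jagged
    if len(set(child_shapes)) > 1:
        return None  # siblings disagree on shape, so the value is jagged
    inner = child_shapes[0] if child_shapes else ()
    return (len(value),) + inner


def unnest_flatten(value):
    """Tensor-style unnest: yield every scalar, regardless of nesting depth."""
    out = []
    for v in value:
        if isinstance(v, list):
            out.extend(unnest_flatten(v))
        else:
            out.append(v)
    return out


def unnest_one_level(value):
    """Nested-list-style unnest: peel off only the outermost level."""
    return list(value)
```

Under this model, `[[1, 2], [1]]` has no tensor shape (it is jagged), while `[[1]]` unnests to `[1]` in the tensor view and to `[[1]]` in the nested-list view.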
Based on usages we have seen so far, our expected behavior would be:
- Allow jagged value.
  - Pros: jagged array is also found in Avro, Arrow, Flink, DuckDB
  - Cons: the following PostgreSQL functions would run slower when inspecting inner dimensions: array_dims, array_length, array_upper, generate_subscripts. Or we can reject multi-dimensional input first.
- Use type int[][]
  - Pros: intuitive, and allows array_append(array[array[1]], array[2])
  - Cons: disallows PostgreSQL expression '{{1,2,3},{4,5,6},{7,8,9}}'::int[], and there would be infinite types in the type system (int[], int[][], int[][][], ...)
- Keep unnest(array[array[1]]) to return 1, and cardinality(array[array[1, 2]]) to return 2
  - Pros: PostgreSQL compatible. A working query ported from PostgreSQL won't return a different result silently.
  - Cons: Needs recursion. unnest(array[array[1]]) = array[1] in DuckDB, and cardinality(array[array[1, 2]]) = 1 in Flink.
- Do not support customizing dimension lower bound
  - Pros: simple
  - Cons: disallows PostgreSQL expression '[2:4]={1,2,3}'::int[]
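The trade-offs in this list can be sketched in Python, again modelling array values as plain lists. This is an assumption-laden illustration rather than any engine's implementation: `cardinality_total`, `cardinality_outer`, and `inner_lengths` are hypothetical names chosen for this example.

```python
def cardinality_total(value):
    """PostgreSQL-compatible choice: count scalars across all dimensions.

    This is the variant that needs recursion.
    """
    if not isinstance(value, list):
        return 1
    return sum(cardinality_total(v) for v in value)


def cardinality_outer(value):
    """Nested-list choice: count only the elements of the outermost list."""
    return len(value)


def inner_lengths(value):
    """Collect the distinct lengths of the second-level lists.

    With jagged values allowed, answering a question about an inner
    dimension means visiting every inner list, and the answer may not
    be a single number.
    """
    return sorted({len(v) for v in value if isinstance(v, list)})
```

For `[[1, 2]]`, `cardinality_total` gives 2 and `cardinality_outer` gives 1, which mirrors the PostgreSQL vs Flink difference noted in the list. For the jagged value `[[1, 2], [3]]`, `inner_lengths` returns two distinct lengths, which is why functions that inspect inner dimensions become slower or need a policy for jagged input.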
Describe the bug
In RisingWave
In PG, this isn't allowed
Expected behavior
Same as PG?
cc @neverchanje