Confusing schema errors when using window partition #5229

Closed
bellwether-softworks opened this issue Feb 9, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@bellwether-softworks

Describe the bug
I'm attempting to use the ROW_NUMBER() function, and encountering the following error:

Arrow error: External error: Arrow error: Invalid argument error: batches[2] schema is different with argument schema.

The query in question:

SELECT ROW_NUMBER() OVER (PARTITION BY part_id ORDER BY range_begin) AS rownum
FROM __temp_inverted_ranges_phase_1

The error no longer occurs if the PARTITION BY clause is omitted.

Expected behavior
The ROW_NUMBER function executes successfully, or the error gives a meaningful explanation of the schema mismatch.

Additional context

The table in question is the result of a preceding query; a similar ROW_NUMBER invocation against that data runs without errors.

The schema for the data backing the failing query:

+-------------+-----------+-------------+---------------+--------------------------------+--------------+
| column_name | data_type | is_nullable | table_catalog | table_name                     | table_schema |
+-------------+-----------+-------------+---------------+--------------------------------+--------------+
| part_id     | Int32     | NO          | datafusion    | __temp_inverted_ranges_phase_1 | public       |
+-------------+-----------+-------------+---------------+--------------------------------+--------------+
| range_begin | Float32   | YES         | datafusion    | __temp_inverted_ranges_phase_1 | public       |
+-------------+-----------+-------------+---------------+--------------------------------+--------------+
| range_end   | Float32   | YES         | datafusion    | __temp_inverted_ranges_phase_1 | public       |
+-------------+-----------+-------------+---------------+--------------------------------+--------------+

An illustrative sample of the data:

+---------+-------------+-----------+
| part_id | range_begin | range_end |
+---------+-------------+-----------+
| 221770  | 30.03167    | 128.24088 |
+---------+-------------+-----------+
| 221842  | 67.875      | 64.375    |
+---------+-------------+-----------+
| 221883  | 107.25      | 107.25    |
+---------+-------------+-----------+
| 221883  | 133.875     | 112.96875 |
+---------+-------------+-----------+
| 221969  | 22.21875    | 0.0       |
+---------+-------------+-----------+
| 221979  | 75.46887    | 75.46887  |
+---------+-------------+-----------+
| 221988  | 15.96875    | 25.84375  |
+---------+-------------+-----------+
| 221988  | 47.96875    | 50.96875  |
+---------+-------------+-----------+
| 222006  | 68.71875    | 68.71875  |
+---------+-------------+-----------+
| 222013  | 32.71875    | 38.625004 |
+---------+-------------+-----------+
@bellwether-softworks bellwether-softworks added the bug Something isn't working label Feb 9, 2023
@bellwether-softworks
Author

Closing for now, as I realize I haven't accounted for other points of failure with the app that might be responsible for this error.

@alamb
Contributor

alamb commented Feb 11, 2023

It also might be similar to the recently fixed #5090
