Regression in 43.0.0: coalesce no longer works between Utf8 and Utf8View columns #13568

Closed
Tracked by #13504
ttencate opened this issue Nov 26, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@ttencate

Describe the bug

coalesce() no longer accepts a mix of Utf8 and Utf8View columns: the arguments are no longer coerced to a common string type, and planning fails with a coercion error.

To Reproduce

use datafusion::common::arrow::array::{ArrayRef, StringArray, StringViewArray};
use datafusion::common::arrow::record_batch::RecordBatch;
use datafusion::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let ctx = SessionContext::new();
    let df = ctx
        .read_batch(
            RecordBatch::try_from_iter([
                (
                    "utf8",
                    Arc::new(StringArray::from(vec!["a", "b"])) as ArrayRef,
                ),
                (
                    "utf8view",
                    Arc::new(StringViewArray::from(vec!["a", "b"])) as ArrayRef,
                ),
            ])
            .unwrap(),
        )
        .unwrap();
    df.select(vec![coalesce(vec![col("utf8"), col("utf8view")])])
        .unwrap()
        .collect()
        .await
        .unwrap();
}

Result:

thread 'main' panicked at src/main.rs:25:10:
called `Result::unwrap()` on an `Err` value: Plan("Execution error: User-defined coercion failed with Execution(\"Fail to find the coerced type, errors: Some(Execution(\\\"Expect to get struct but got Utf8\\\"))\") No function matches the given name and argument types 'coalesce(Utf8, Utf8View)'. You might need to add explicit type casts.\n\tCandidate functions:\n\tcoalesce(UserDefined)")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Expected behavior

No error.

Additional context

It worked fine in version 42.2.0.

@alamb
Contributor

alamb commented Nov 27, 2024

I vaguely remember that @jayzhan211 worked on coalesce recently; maybe that was related.

@jayzhan211
Contributor

jayzhan211 commented Nov 27, 2024

It seems the issue is already fixed: I couldn't reproduce the error on the latest commit.

I tried both your test and the following sqllogictest case:

statement count 0
create table t(a varchar, b varchar) as values ('a', 'b'), ('c', 'd');

statement ok
create table t2 as
select
    a as c1,
    arrow_cast(b, 'Utf8View') as c2
from t;

query T
select coalesce(c2, c1) from t2;
----
b
d

@alamb
Contributor

alamb commented Dec 22, 2024

I verified that the reproducer query now runs without error on main (the fix will be released in DataFusion 44).

Thank you again for the report @ttencate

@alamb closed this as completed Dec 22, 2024