-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48364][SQL] Add AbstractMapType type casting and fix RaiseError parameter map to work with collated strings #46661
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please fill the PR description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated PR description, good to go
@@ -85,6 +85,11 @@ object CollationTypeCasts extends TypeCoercionRule { | |||
private def extractStringType(dt: DataType): StringType = dt match { | |||
case st: StringType => st | |||
case ArrayType(et, _) => extractStringType(et) | |||
case MapType(kt, vt, _) => if (hasStringType(kt)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this assumes the key value should be the same string type. I think it's fine, but we can revisit this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this might be a bit confusing - we'll keep it in mind for possible future revisit
thanks, merging to master! |
…d strings ### What changes were proposed in this pull request? Following queries return unexpected results: ``` select collation(map('a', 'b' collate utf8_binary_lcase)['a']); select collation(element_at(map('a', 'b' collate utf8_binary_lcase), 'a')); ``` Both return `UTF8_BINARY` instead of `UTF8_BINARY_LCASE`. The error was introduced by changes in #46661 that tried to solve other problem with `RaiseError` expression (please refer to PR for details). Partially revert changes of mentioned PR and fix the `RaiseError` issue by explicitly setting `UTF8_BINARY` collation. ### Why are the changes needed? To fix wrong results of mentioned queries. ### Does this PR introduce _any_ user-facing change? Yes, it fixed explained error. ### How was this patch tested? Added test to `CollationSQLExpressionsSuite`. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46758 from nikolamand-db/SPARK-48430. Authored-by: Nikola Mandic <nikola.mandic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…d strings ### What changes were proposed in this pull request? Following queries return unexpected results: ``` select collation(map('a', 'b' collate utf8_binary_lcase)['a']); select collation(element_at(map('a', 'b' collate utf8_binary_lcase), 'a')); ``` Both return `UTF8_BINARY` instead of `UTF8_BINARY_LCASE`. The error was introduced by changes in apache#46661 that tried to solve other problem with `RaiseError` expression (please refer to PR for details). Partially revert changes of mentioned PR and fix the `RaiseError` issue by explicitly setting `UTF8_BINARY` collation. ### Why are the changes needed? To fix wrong results of mentioned queries. ### Does this PR introduce _any_ user-facing change? Yes, it fixed explained error. ### How was this patch tested? Added test to `CollationSQLExpressionsSuite`. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#46758 from nikolamand-db/SPARK-48430. Authored-by: Nikola Mandic <nikola.mandic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Following up on the introduction of AbstractMapType (#46458) and changes that introduce collation awareness for RaiseError expression (#46461), this PR should add the appropriate type casting rules for AbstractMapType.
Why are the changes needed?
Fix the CI failure for the
Support RaiseError misc expression with collation
test when ANSI is off.Does this PR introduce any user-facing change?
Yes, type casting is now allowed for map types with collated strings.
How was this patch tested?
Extended suite
CollationSQLExpressionsANSIOffSuite
with ANSI disabled.Was this patch authored or co-authored using generative AI tooling?
No.