Change Cast.toString to "cast" instead of "ansi_cast" under ANSI mode #5047
Conversation
```
@@ -1544,11 +1544,7 @@ case class GpuCast(
    }
  }

  override def toString: String = if (ansiMode) {
```
Can we add a test for this? I think this is something that needs to go into the shim layer since it varies by Spark version.
This is the Spark 3.1.x implementation:
```scala
override def toString: String = {
  val ansi = if (ansiEnabled) "ansi_" else ""
  s"${ansi}cast($child as ${dataType.simpleString})"
}
```
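For comparison, here is a minimal sketch (not the exact code in this PR) of what the simplified `GpuCast` override could look like if it always reported "cast", regardless of ANSI mode:

```scala
// Sketch only: drop the "ansi_" prefix and always report "cast",
// matching the Spark 3.3.0+ behavior discussed below.
override def toString: String = s"cast($child as ${dataType.simpleString})"
```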
It looks like the name change only impacts Spark 3.3.0 and later, so we need to introduce shim code to determine the name for this expression.
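For illustration, a rough sketch of what that shim code could look like; the trait and object names here are hypothetical and not taken from this PR:

```scala
// Hypothetical shim interface; each Spark-version shim layer would supply its own
// implementation so GpuCast.toString reports the right name for that version.
trait GpuCastShims {
  def castName(ansiMode: Boolean): String
}

// What a Spark 3.0.x - 3.2.x shim would return: the prefix depends on ANSI mode.
object PreSpark330CastShims extends GpuCastShims {
  override def castName(ansiMode: Boolean): String =
    if (ansiMode) "ansi_cast" else "cast"
}

// What a Spark 3.3.0+ shim would return: the prefix was removed upstream.
object Spark330PlusCastShims extends GpuCastShims {
  override def castName(ansiMode: Boolean): String = "cast"
}
```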
The change to make it ansi vs not in the text was introduced in 3.0.0 and reverted in 3.3.0. For me it is such a minor change that I just don't think it is worth shimming or testing. But I could be convinced if someone is actually parsing the text returned by this.
I also doubt that users would be relying on the name of the expression. At some point, it would be nice if our tests compared the schema of results from GPU and CPU and not just the results themselves. That would automate some of our audit work. I'll remove the change request.
That is a great point and is something that should be super simple to add into asserts.py because we are passing in the dataframe. For the most part there is a clear mapping between the SQL type and a corresponding Python type that is returned, but in a few cases it is ambiguous and we really should fix those cases. If you want to file an issue for that please do it. If not, I will. You cannot see it, but I am facepalming after I looked and realized that I forgot to add that into the integration tests when I first wrote them.
Thanks. I filed #5072 for improving the tests.
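As a side note, here is a minimal sketch of the schema-comparison idea from the discussion above, written in Scala terms even though the real integration-test helpers live in asserts.py (Python); the function name is made up for illustration:

```scala
import org.apache.spark.sql.DataFrame

// Sketch only: compare the schema of the CPU and GPU results in addition to the
// row data, so differences in result types or auto-generated column names would
// be caught automatically.
def assertCpuAndGpuMatch(cpu: DataFrame, gpu: DataFrame): Unit = {
  assert(cpu.schema == gpu.schema, s"schema mismatch: ${cpu.schema} vs ${gpu.schema}")
  assert(cpu.collect().sameElements(gpu.collect()), "row data mismatch")
}
```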
Signed-off-by: remzi <13716567376yh@gmail.com>
close #4870