```
Cause: java.lang.AssertionError: value at 6 is null
    at ai.rapids.cudf.HostColumnVectorCore.assertsForGet(HostColumnVectorCore.java:228)
    at ai.rapids.cudf.HostColumnVectorCore.getInt(HostColumnVectorCore.java:254)
    at com.nvidia.spark.rapids.RapidsHostColumnVectorCore.getInt(RapidsHostColumnVectorCore.java:126)
    at org.apache.spark.sql.vectorized.ColumnarArray.getInt(ColumnarArray.java:128)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.MapObjects_0$(generated.java:57)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(generated.java:28)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:181)
    at com.nvidia.spark.rapids.shims.v2.GpuRowBasedScalaUDF.$anonfun$scalaConverter$2(GpuRowBasedScalaUDF.scala:70)
    at org.apache.spark.sql.rapids.GpuRowBasedScalaUDFBase.$anonfun$childAccessors$2(GpuScalaUDF.scala:141)
```
This issue is for #3942
Exception
Root Cause
Spark appears to have a bug in the code generation for MapObjects; see https://github.com/apache/spark/blob/branch-3.2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala#L1095-L1097
In the generated code, we can see that it first calls getInt and only afterwards calls isNullAt, instead of checking isNullAt before calling getInt.
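A minimal sketch of the problematic pattern may help. This is not the actual Catalyst-generated code; the interface and method names below are illustrative stand-ins for the `ColumnarArray` accessors involved:

```java
// Illustrative stand-in for the columnar accessor interface used by
// the generated SpecificSafeProjection (not the real Spark API).
interface ArrayLike {
    boolean isNullAt(int i);
    int getInt(int i);
}

public class CodegenPatternSketch {
    // Problematic order (what the generated code effectively does):
    // getInt is evaluated before the null check, so a backend whose
    // getInt asserts on null rows throws before isNullAt is consulted.
    static Integer problematic(ArrayLike arr, int i) {
        int value = arr.getInt(i);  // may throw on a null slot
        return arr.isNullAt(i) ? null : value;
    }

    // Expected order: check for null first, then read the value.
    static Integer expected(ArrayLike arr, int i) {
        return arr.isNullAt(i) ? null : arr.getInt(i);
    }
}
```

With a backend that returns a default value for null rows (as the CPU column vectors do), both orders happen to work; with one that asserts (as cudf's HostColumnVectorCore does), only the second order is safe.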
I checked RapidsHostColumnVectorCore and found that it does not check isNullAt before calling cudf's getInt, which throws an exception when the row is null.
I also checked getInt in OnHeapColumnVector: for a null row it returns the default value. OffHeapColumnVector behaves the same.
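A sketch of that null-tolerant behavior, under the assumption that the fix mirrors the CPU column vectors (the class and field names below are hypothetical, not the actual RapidsHostColumnVectorCore internals):

```java
// Hypothetical null-tolerant int column, mirroring the behavior of
// Spark's OnHeapColumnVector: getInt on a null row returns the type's
// default (0) instead of asserting like cudf's HostColumnVectorCore.
public class NullTolerantIntColumn {
    private final int[] values;
    private final boolean[] nulls;  // true marks a null row

    public NullTolerantIntColumn(int[] values, boolean[] nulls) {
        this.values = values;
        this.nulls = nulls;
    }

    public boolean isNullAt(int row) {
        return nulls[row];
    }

    // Return the default value for null rows rather than throwing, so
    // generated code that reads before null-checking does not fail.
    public int getInt(int row) {
        return nulls[row] ? 0 : values[row];
    }
}
```

The caller is still expected to consult isNullAt to distinguish a genuine 0 from a null; the default value only keeps the out-of-order read from crashing.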
This is clearly caused by Spark's generated code, but we should not throw an exception that makes the Spark job fail.