
[BUG] parquet_test.py pytests FAILED on Databricks-9.1-ML-spark-3.1.2 #4069

Closed
NvTimLiu opened this issue Nov 10, 2021 · 4 comments · Fixed by #3982 or #4080
Assignees
Labels
bug (Something isn't working), P0 (Must have for release)

Comments

@NvTimLiu
Collaborator

Describe the bug

[2021-11-10T06:53:08.173Z] 
[2021-11-10T06:53:08.173Z] =================================== FAILURES ===================================
[2021-11-10T06:53:08.173Z]  test_nested_pruning_and_case_insensitive[true--reader_confs0-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['STRUCT', Struct(['case_INSENsitive', Long])]]]
[2021-11-10T06:53:08.173Z] [gw0] linux -- Python 3.8.12 /databricks/conda/envs/cudf-udf/bin/python
[2021-11-10T06:53:08.173Z] 
[2021-11-10T06:53:08.173Z] spark_tmp_path = '/tmp/pyspark_tests//754491/'
[2021-11-10T06:53:08.173Z] data_gen = [['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]
[2021-11-10T06:53:08.173Z] read_schema = [['STRUCT', Struct(['case_INSENsitive', Long])]]
[2021-11-10T06:53:08.173Z] reader_confs = {'spark.rapids.sql.format.parquet.reader.type': 'PERFILE'}
[2021-11-10T06:53:08.173Z] v1_enabled_list = '', nested_enabled = 'true'
[2021-11-10T06:53:08.173Z] 
[2021-11-10T06:53:08.173Z]     @pytest.mark.parametrize('data_gen,read_schema', _nested_pruning_schemas, ids=idfn)
[2021-11-10T06:53:08.173Z]     @pytest.mark.parametrize('reader_confs', reader_opt_confs)
[2021-11-10T06:53:08.173Z]     @pytest.mark.parametrize('v1_enabled_list', ["", "parquet"])
[2021-11-10T06:53:08.173Z]     @pytest.mark.parametrize('nested_enabled', ["true", "false"])
[2021-11-10T06:53:08.173Z]     def test_nested_pruning_and_case_insensitive(spark_tmp_path, data_gen, read_schema, reader_confs, v1_enabled_list, nested_enabled):
[2021-11-10T06:53:08.173Z]         data_path = spark_tmp_path + '/PARQUET_DATA'
[2021-11-10T06:53:08.173Z]         with_cpu_session(
[2021-11-10T06:53:08.173Z]                 lambda spark : gen_df(spark, data_gen).write.parquet(data_path),
[2021-11-10T06:53:08.173Z]                 conf=rebase_write_corrected_conf)
[2021-11-10T06:53:08.173Z]         all_confs = copy_and_update(reader_confs, {
[2021-11-10T06:53:08.173Z]             'spark.sql.sources.useV1SourceList': v1_enabled_list,
[2021-11-10T06:53:08.173Z]             'spark.sql.optimizer.nestedSchemaPruning.enabled': nested_enabled,
[2021-11-10T06:53:08.173Z]             'spark.sql.legacy.parquet.datetimeRebaseModeInRead': 'CORRECTED'})
[2021-11-10T06:53:08.173Z]         # This is a hack to get the type in a slightly less verbose way
[2021-11-10T06:53:08.173Z]         rs = StructGen(read_schema, nullable=False).data_type
[2021-11-10T06:53:08.173Z] >       assert_gpu_and_cpu_are_equal_collect(lambda spark : spark.read.schema(rs).parquet(data_path),
[2021-11-10T06:53:08.173Z]                 conf=all_confs)
[2021-11-10T06:53:08.174Z] 
[2021-11-10T06:53:08.174Z] ../../src/main/python/parquet_test.py:504: 
[2021-11-10T06:53:08.174Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2021-11-10T06:53:08.174Z] ../../src/main/python/asserts.py:505: in assert_gpu_and_cpu_are_equal_collect
[2021-11-10T06:53:08.174Z]     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
[2021-11-10T06:53:08.174Z] ../../src/main/python/asserts.py:425: in _assert_gpu_and_cpu_are_equal
[2021-11-10T06:53:08.174Z]     run_on_gpu()
[2021-11-10T06:53:08.174Z] ../../src/main/python/asserts.py:419: in run_on_gpu
[2021-11-10T06:53:08.174Z]     from_gpu = with_gpu_session(bring_back, conf=conf)
[2021-11-10T06:53:08.174Z] ../../src/main/python/spark_session.py:105: in with_gpu_session
[2021-11-10T06:53:08.174Z]     return with_spark_session(func, conf=copy)
[2021-11-10T06:53:08.174Z] ../../src/main/python/spark_session.py:70: in with_spark_session
[2021-11-10T06:53:08.174Z]     ret = func(_spark)
[2021-11-10T06:53:08.174Z] ../../src/main/python/asserts.py:198: in <lambda>
[2021-11-10T06:53:08.174Z]     bring_back = lambda spark: limit_func(spark).collect()
[2021-11-10T06:53:08.174Z] /databricks/spark/python/pyspark/sql/dataframe.py:697: in collect
[2021-11-10T06:53:08.174Z]     sock_info = self._jdf.collectToPython()
[2021-11-10T06:53:08.174Z] /databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py:1304: in __call__
[2021-11-10T06:53:08.174Z]     return_value = get_return_value(
[2021-11-10T06:53:08.174Z] /databricks/spark/python/pyspark/sql/utils.py:117: in deco
[2021-11-10T06:53:08.174Z]     return f(*a, **kw)
[2021-11-10T06:53:08.174Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2021-11-10T06:53:08.174Z] 
[2021-11-10T06:53:08.174Z] answer = 'xro499862'
[2021-11-10T06:53:08.174Z] gateway_client = <py4j.java_gateway.GatewayClient object at 0x7f7ea9f70a00>
[2021-11-10T06:53:08.174Z] target_id = 'o499859', name = 'collectToPython'
[2021-11-10T06:53:08.174Z] 
[2021-11-10T06:53:08.174Z]     def get_return_value(answer, gateway_client, target_id=None, name=None):
[2021-11-10T06:53:08.174Z]         """Converts an answer received from the Java gateway into a Python object.
[2021-11-10T06:53:08.174Z]     
[2021-11-10T06:53:08.174Z]         For example, string representation of integers are converted to Python
[2021-11-10T06:53:08.174Z]         integer, string representation of objects are converted to JavaObject
[2021-11-10T06:53:08.174Z]         instances, etc.
[2021-11-10T06:53:08.174Z]     
[2021-11-10T06:53:08.174Z]         :param answer: the string returned by the Java gateway
[2021-11-10T06:53:08.174Z]         :param gateway_client: the gateway client used to communicate with the Java
[2021-11-10T06:53:08.174Z]             Gateway. Only necessary if the answer is a reference (e.g., object,
[2021-11-10T06:53:08.174Z]             list, map)
[2021-11-10T06:53:08.174Z]         :param target_id: the name of the object from which the answer comes from
[2021-11-10T06:53:08.174Z]             (e.g., *object1* in `object1.hello()`). Optional.
[2021-11-10T06:53:08.174Z]         :param name: the name of the member from which the answer comes from
[2021-11-10T06:53:08.174Z]             (e.g., *hello* in `object1.hello()`). Optional.
[2021-11-10T06:53:08.174Z]         """
[2021-11-10T06:53:08.174Z]         if is_error(answer)[0]:
[2021-11-10T06:53:08.174Z]             if len(answer) > 1:
[2021-11-10T06:53:08.174Z]                 type = answer[1]
[2021-11-10T06:53:08.174Z]                 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
[2021-11-10T06:53:08.174Z]                 if answer[1] == REFERENCE_TYPE:
[2021-11-10T06:53:08.174Z] >                   raise Py4JJavaError(
[2021-11-10T06:53:08.174Z]                         "An error occurred while calling {0}{1}{2}.\n".
[2021-11-10T06:53:08.174Z]                         format(target_id, ".", name), value)
[2021-11-10T06:53:08.174Z]                    py4j.protocol.Py4JJavaError: An error occurred while calling o499859.collectToPython.
[2021-11-10T06:53:08.174Z]                    : org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 14363.0 failed 1 times, most recent failure: Lost task 1.0 in stage 14363.0 (TID 56503) (ip-10-59-180-78.us-west-2.compute.internal executor driver): ai.rapids.cudf.CudfException: cuDF failure at: /home/jenkins/agent/workspace/jenkins-cudf_nightly-dev-github-518-cuda11/cpp/src/io/parquet/reader_impl.cu:386: Found no metadata for schema index
[2021-11-10T06:53:08.174Z]                    	at ai.rapids.cudf.Table.readParquet(Native Method)
[2021-11-10T06:53:08.174Z]                    	at ai.rapids.cudf.Table.readParquet(Table.java:862)
[2021-11-10T06:53:08.174Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.$anonfun$readToTable$1(GpuParquetScanBase.scala:1491)
[2021-11-10T06:53:08.174Z]                    	at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2021-11-10T06:53:08.174Z]                    	at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2021-11-10T06:53:08.174Z]                    	at com.nvidia.spark.rapids.FilePartitionReaderBase.withResource(GpuMultiFileReader.scala:236)
[2021-11-10T06:53:08.174Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.readToTable(GpuParquetScanBase.scala:1490)
[2021-11-10T06:53:08.174Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.$anonfun$readBatch$1(GpuParquetScanBase.scala:1451)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.FilePartitionReaderBase.withResource(GpuMultiFileReader.scala:236)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.readBatch(GpuParquetScanBase.scala:1439)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.next(GpuParquetScanBase.scala:1424)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.PartitionReaderWithBytesRead.next(GpuDataSourceRDD.scala:94)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ColumnarPartitionReaderWithPartitionValues.next(ColumnarPartitionReaderWithPartitionValues.scala:36)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.sql.execution.datasources.v2.PartitionedFileReader.next(FilePartitionReaderFactory.scala:54)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.sql.execution.datasources.v2.FilePartitionReader.next(FilePartitionReader.scala:67)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.PartitionIterator.hasNext(GpuDataSourceRDD.scala:61)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.MetricsBatchIterator.hasNext(GpuDataSourceRDD.scala:78)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
[2021-11-10T06:53:08.175Z]                    	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
[2021-11-10T06:53:08.175Z]                    	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.$anonfun$fetchNextBatch$2(GpuColumnarToRowExec.scala:223)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:178)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.fetchNextBatch(GpuColumnarToRowExec.scala:222)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.loadNextBatch(GpuColumnarToRowExec.scala:199)
[2021-11-10T06:53:08.175Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.hasNext(GpuColumnarToRowExec.scala:239)
[2021-11-10T06:53:08.175Z]                    	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.sql.execution.collect.UnsafeRowBatchUtils$.encodeUnsafeRows(UnsafeRowBatchUtils.scala:80)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.sql.execution.collect.Collector.$anonfun$processFunc$1(Collector.scala:178)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$3(ResultTask.scala:75)
[2021-11-10T06:53:08.175Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$1(ResultTask.scala:75)
[2021-11-10T06:53:08.175Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:55)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.scheduler.Task.doRunTask(Task.scala:150)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:119)
[2021-11-10T06:53:08.175Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.scheduler.Task.run(Task.scala:91)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:813)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1605)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:816)
[2021-11-10T06:53:08.175Z]                    	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
[2021-11-10T06:53:08.175Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.175Z]                    	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:672)
[2021-11-10T06:53:08.175Z]                    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2021-11-10T06:53:08.175Z]                    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2021-11-10T06:53:08.176Z]                    	at java.lang.Thread.run(Thread.java:748)
[2021-11-10T06:53:08.176Z]                    
[2021-11-10T06:53:08.176Z]                    Driver stacktrace:
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2828)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2775)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2769)
[2021-11-10T06:53:08.176Z]                    	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
[2021-11-10T06:53:08.176Z]                    	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
[2021-11-10T06:53:08.176Z]                    	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2769)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1305)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1305)
[2021-11-10T06:53:08.176Z]                    	at scala.Option.foreach(Option.scala:407)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1305)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3036)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2977)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2965)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:1067)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.SparkContext.runJobInternal(SparkContext.scala:2476)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.collect.Collector.runSparkJobs(Collector.scala:264)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.collect.Collector.collect(Collector.scala:299)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:82)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:88)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.collect.InternalRowFormat$.collect(cachedSparkResults.scala:75)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.collect.InternalRowFormat$.collect(cachedSparkResults.scala:62)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.ResultCacheManager.$anonfun$getOrComputeResultInternal$1(ResultCacheManager.scala:512)
[2021-11-10T06:53:08.176Z]                    	at scala.Option.getOrElse(Option.scala:189)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.ResultCacheManager.getOrComputeResultInternal(ResultCacheManager.scala:511)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.ResultCacheManager.getOrComputeResult(ResultCacheManager.scala:399)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.ResultCacheManager.getOrComputeResult(ResultCacheManager.scala:374)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.SparkPlan.executeCollectResult(SparkPlan.scala:406)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.Dataset.$anonfun$collectToPython$1(Dataset.scala:3613)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3825)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:130)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:273)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854)
[2021-11-10T06:53:08.176Z]                    	at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)
[2021-11-10T06:53:08.177Z]                    	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:223)
[2021-11-10T06:53:08.177Z]                    	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3823)
[2021-11-10T06:53:08.177Z]                    	at org.apache.spark.sql.Dataset.collectToPython(Dataset.scala:3611)
[2021-11-10T06:53:08.177Z]                    	at sun.reflect.GeneratedMethodAccessor116.invoke(Unknown Source)
[2021-11-10T06:53:08.177Z]                    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2021-11-10T06:53:08.177Z]                    	at java.lang.reflect.Method.invoke(Method.java:498)
[2021-11-10T06:53:08.177Z]                    	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
[2021-11-10T06:53:08.177Z]                    	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
[2021-11-10T06:53:08.177Z]                    	at py4j.Gateway.invoke(Gateway.java:295)
[2021-11-10T06:53:08.177Z]                    	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
[2021-11-10T06:53:08.177Z]                    	at py4j.commands.CallCommand.execute(CallCommand.java:79)
[2021-11-10T06:53:08.177Z]                    	at py4j.GatewayConnection.run(GatewayConnection.java:251)
[2021-11-10T06:53:08.177Z]                    	at java.lang.Thread.run(Thread.java:748)
[2021-11-10T06:53:08.177Z]                    Caused by: ai.rapids.cudf.CudfException: cuDF failure at: /home/jenkins/agent/workspace/jenkins-cudf_nightly-dev-github-518-cuda11/cpp/src/io/parquet/reader_impl.cu:386: Found no metadata for schema index
[2021-11-10T06:53:08.177Z]                    	at ai.rapids.cudf.Table.readParquet(Native Method)
[2021-11-10T06:53:08.177Z]                    	at ai.rapids.cudf.Table.readParquet(Table.java:862)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.$anonfun$readToTable$1(GpuParquetScanBase.scala:1491)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.FilePartitionReaderBase.withResource(GpuMultiFileReader.scala:236)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.readToTable(GpuParquetScanBase.scala:1490)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.$anonfun$readBatch$1(GpuParquetScanBase.scala:1451)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.FilePartitionReaderBase.withResource(GpuMultiFileReader.scala:236)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.readBatch(GpuParquetScanBase.scala:1439)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ParquetPartitionReader.next(GpuParquetScanBase.scala:1424)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.PartitionReaderWithBytesRead.next(GpuDataSourceRDD.scala:94)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ColumnarPartitionReaderWithPartitionValues.next(ColumnarPartitionReaderWithPartitionValues.scala:36)
[2021-11-10T06:53:08.177Z]                    	at org.apache.spark.sql.execution.datasources.v2.PartitionedFileReader.next(FilePartitionReaderFactory.scala:54)
[2021-11-10T06:53:08.177Z]                    	at org.apache.spark.sql.execution.datasources.v2.FilePartitionReader.next(FilePartitionReader.scala:67)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.PartitionIterator.hasNext(GpuDataSourceRDD.scala:61)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.MetricsBatchIterator.hasNext(GpuDataSourceRDD.scala:78)
[2021-11-10T06:53:08.177Z]                    	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
[2021-11-10T06:53:08.177Z]                    	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
[2021-11-10T06:53:08.177Z]                    	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.$anonfun$fetchNextBatch$2(GpuColumnarToRowExec.scala:223)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2021-11-10T06:53:08.177Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.withResource(GpuColumnarToRowExec.scala:178)
[2021-11-10T06:53:08.178Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.fetchNextBatch(GpuColumnarToRowExec.scala:222)
[2021-11-10T06:53:08.178Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.loadNextBatch(GpuColumnarToRowExec.scala:199)
[2021-11-10T06:53:08.178Z]                    	at com.nvidia.spark.rapids.ColumnarToRowIterator.hasNext(GpuColumnarToRowExec.scala:239)
[2021-11-10T06:53:08.178Z]                    	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.sql.execution.collect.UnsafeRowBatchUtils$.encodeUnsafeRows(UnsafeRowBatchUtils.scala:80)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.sql.execution.collect.Collector.$anonfun$processFunc$1(Collector.scala:178)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$3(ResultTask.scala:75)
[2021-11-10T06:53:08.178Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$1(ResultTask.scala:75)
[2021-11-10T06:53:08.178Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:55)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.scheduler.Task.doRunTask(Task.scala:150)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:119)
[2021-11-10T06:53:08.178Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.scheduler.Task.run(Task.scala:91)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:813)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1605)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:816)
[2021-11-10T06:53:08.178Z]                    	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
[2021-11-10T06:53:08.178Z]                    	at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
[2021-11-10T06:53:08.178Z]                    	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:672)
[2021-11-10T06:53:08.178Z]                    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2021-11-10T06:53:08.178Z]                    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2021-11-10T06:53:08.178Z]                    	... 1 more
[2021-11-10T06:53:08.178Z] 
[2021-11-10T06:53:08.178Z] /databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py:326: Py4JJavaError
[2021-11-10T06:53:08.178Z] ----------------------------- Captured stdout call -----------------------------
..... 
[2021-11-10T06:53:08.875Z] =========================== short test summary info ============================
[2021-11-10T06:53:08.875Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs0-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['STRUCT', Struct(['case_INSENsitive', Long])]]]
[2021-11-10T06:53:08.875Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs0-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['struct', Struct(['CASE_INSENSITIVE', Long])]]]
[2021-11-10T06:53:08.875Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs0-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['stRUct', Struct(['CASE_INSENSITIVE', Long])]]]
[2021-11-10T06:53:08.875Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs1-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['STRUCT', Struct(['case_INSENsitive', Long])]]]
[2021-11-10T06:53:08.875Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs1-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['struct', Struct(['CASE_INSENSITIVE', Long])]]]
[2021-11-10T06:53:08.875Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs1-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['stRUct', Struct(['CASE_INSENSITIVE', Long])]]]
[2021-11-10T06:53:08.876Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs2-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', Short])]]-[['STRUCT', Struct(['case_INSENsitive', Long])]]]
[2021-11-10T06:53:08.876Z] FAILED ../../src/main/python/parquet_test.py::test_nested_pruning_and_case_insensitive[true--reader_confs2-[['struct', Struct(['c_1', String],['case_insensitive', Long],['c_3', 
14:53:09  = 36 failed, 10781 passed, 136 skipped, 404 xfailed, 156 xpassed, 76 warnings in 6261.85s (1:44:21) =

Steps/Code to reproduce bug
Build rapids-4-spark and run the integration tests on Databricks 9.1 ML (spark-3.1.2).

Environment details (please complete the following information)

  • Environment location: Databricks 9.1 ML, spark-3.1.2
@NvTimLiu NvTimLiu added bug Something isn't working ? - Needs Triage Need team to review and classify labels Nov 10, 2021
@pxLi pxLi changed the title [BUG] parquet_test.py pytests FAILED on Databricks-9.1-ML-spark-3.0.2 [BUG] parquet_test.py pytests FAILED on Databricks-9.1-ML-spark-3.1.2 Nov 10, 2021
@NvTimLiu
Collaborator Author

@wbo4958 Is #3982 related to this issue?

@NvTimLiu
Collaborator Author

@wbo4958 The parquet tests fail only on DB 9.1;

they pass on DB 7.3, DB 8.2, and other non-Databricks environments.

@wbo4958
Collaborator

wbo4958 commented Nov 10, 2021

It looks like the DB 9.1 runtime has changed the "ParquetReadSupport.clipParquetSchema" API, which produces a different clipped schema:

  • For the DB 9.1 (Spark 3.1.2) runtime:

clippedSchemaTmp: message spark_schema {
  optional group STRUCT {
    optional int64 case_INSENsitive;
  }
}

  • For upstream Spark 3.1.2:

clippedSchemaTmp: message spark_schema {
  optional group struct {
    optional int64 case_insensitive;
  }
}

Will continue checking tomorrow.
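The schema difference above can be simulated with a minimal sketch. The helper names and dict-based "schemas" below are purely illustrative (this is not the actual Spark code): both clipping variants match fields case-insensitively, but upstream Spark 3.1.2 keeps the Parquet file schema's casing while the DB 9.1 runtime keeps the read schema's casing.

```python
# Illustrative simulation of the two clipParquetSchema behaviors.
# Schemas are modeled as {group_name: {field_name: type}} dicts.

file_schema = {"struct": {"case_insensitive": "int64"}}
read_schema = {"STRUCT": {"case_INSENsitive": "int64"}}

def clip_keep_file_casing(file_schema, read_schema):
    """Upstream Spark 3.1.2: match case-insensitively, keep the file schema's names."""
    out = {}
    for req_name, req_fields in read_schema.items():
        for file_name, file_fields in file_schema.items():
            if file_name.lower() == req_name.lower():
                out[file_name] = {
                    fn: ft for fn, ft in file_fields.items()
                    if any(fn.lower() == rf.lower() for rf in req_fields)
                }
    return out

def clip_keep_read_casing(file_schema, read_schema):
    """DB 9.1 runtime: the clipped schema carries the read schema's names."""
    out = {}
    for req_name, req_fields in read_schema.items():
        for file_name, file_fields in file_schema.items():
            if file_name.lower() == req_name.lower():
                out[req_name] = {
                    rf: file_fields[fn]
                    for rf in req_fields
                    for fn in file_fields
                    if fn.lower() == rf.lower()
                }
    return out

# Matches the "optional group struct { optional int64 case_insensitive }" dump above.
assert clip_keep_file_casing(file_schema, read_schema) == \
    {"struct": {"case_insensitive": "int64"}}

# Matches the "optional group STRUCT { optional int64 case_INSENsitive }" dump above.
assert clip_keep_read_casing(file_schema, read_schema) == \
    {"STRUCT": {"case_INSENsitive": "int64"}}
```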

@wbo4958
Collaborator

wbo4958 commented Nov 11, 2021

"ParquetReadSupport.clipParquetSchema" in DB 9.1 returns a schema whose names come from readDataSchema instead of the Parquet file schema. As a result, clipBlocks returns empty ColumnChunkMetaData, which is what triggers this issue.
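The failure mode described above can be sketched as follows. This is a hypothetical simplification (the real clipBlocks operates on Parquet ColumnChunkMetaData, not strings): when the clipped schema carries the read schema's casing, a strictly case-sensitive path lookup finds no matching column chunks, so cuDF later fails with "Found no metadata for schema index".

```python
# Column paths as written in the Parquet file vs. as produced by the
# DB 9.1 clipped schema (illustrative strings, not real metadata objects).
file_column_paths = ["struct.case_insensitive"]
clipped_paths_db91 = ["STRUCT.case_INSENsitive"]

def clip_blocks(wanted_paths, file_paths, case_sensitive=True):
    """Keep only the file column chunks whose path matches a wanted path."""
    if case_sensitive:
        wanted = set(wanted_paths)
        return [p for p in file_paths if p in wanted]
    wanted = {p.lower() for p in wanted_paths}
    return [p for p in file_paths if p.lower() in wanted]

# Case-sensitive matching (the pre-fix behavior) keeps no chunks, leaving
# cuDF with no metadata for the requested schema.
assert clip_blocks(clipped_paths_db91, file_column_paths) == []

# Case-insensitive matching recovers the chunk.
assert clip_blocks(clipped_paths_db91, file_column_paths,
                   case_sensitive=False) == ["struct.case_insensitive"]
```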

NvTimLiu added a commit to NvTimLiu/spark-rapids that referenced this issue Nov 11, 2021
…turns

the readSchema-same-name schema when case-insensitive, which will cause
clipBlocks to return incorrect results, since clipBlocks only handles
case-sensitive matching.

Signed-off-by: Bobby Wang wbo4958@gmail.com

To fix NVIDIA#4069
@pxLi pxLi removed the ? - Needs Triage Need team to review and classify label Nov 12, 2021