
[BUG] Failed to find data source: com.nvidia.spark.rapids.tests.datasourcev2.parquet.ArrowColumnarDataSourceV2 #3592

Closed
pxLi opened this issue Sep 22, 2021 · 3 comments · Fixed by #3594
Assignees
Labels
bug Something isn't working P0 Must have for release test Only impacts tests


pxLi commented Sep 22, 2021

Describe the bug
All integration tests fail with `java.lang.ClassNotFoundException: com.nvidia.spark.rapids.tests.datasourcev2.parquet.ArrowColumnarDataSourceV2.DefaultSource`:

```
[2021-09-22T00:32:59.458Z]         if is_error(answer)[0]:
[2021-09-22T00:32:59.458Z]             if len(answer) > 1:
[2021-09-22T00:32:59.458Z]                 type = answer[1]
[2021-09-22T00:32:59.458Z]                 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
[2021-09-22T00:32:59.458Z]                 if answer[1] == REFERENCE_TYPE:
[2021-09-22T00:32:59.458Z] >                   raise Py4JJavaError(
[2021-09-22T00:32:59.458Z]                         "An error occurred while calling {0}{1}{2}.\n".
[2021-09-22T00:32:59.458Z]                         format(target_id, ".", name), value)
[2021-09-22T00:32:59.458Z] E                   py4j.protocol.Py4JJavaError: An error occurred while calling o361.load.
[2021-09-22T00:32:59.458Z] E                   : java.lang.ClassNotFoundException: Failed to find data source: com.nvidia.spark.rapids.tests.datasourcev2.parquet.ArrowColumnarDataSourceV2. Please find packages at http://spark.apache.org/third-party-projects.html
[2021-09-22T00:32:59.458Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:679)
[2021-09-22T00:32:59.459Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:733)
[2021-09-22T00:32:59.459Z] E                   	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:248)
[2021-09-22T00:32:59.459Z] E                   	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:221)
[2021-09-22T00:32:59.459Z] E                   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2021-09-22T00:32:59.459Z] E                   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2021-09-22T00:32:59.459Z] E                   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2021-09-22T00:32:59.459Z] E                   	at java.lang.reflect.Method.invoke(Method.java:498)
[2021-09-22T00:32:59.459Z] E                   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
[2021-09-22T00:32:59.459Z] E                   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
[2021-09-22T00:32:59.459Z] E                   	at py4j.Gateway.invoke(Gateway.java:282)
[2021-09-22T00:32:59.459Z] E                   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
[2021-09-22T00:32:59.459Z] E                   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
[2021-09-22T00:32:59.459Z] E                   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
[2021-09-22T00:32:59.459Z] E                   	at java.lang.Thread.run(Thread.java:748)
[2021-09-22T00:32:59.459Z] E                   Caused by: java.lang.ClassNotFoundException: com.nvidia.spark.rapids.tests.datasourcev2.parquet.ArrowColumnarDataSourceV2.DefaultSource
[2021-09-22T00:32:59.459Z] E                   	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
[2021-09-22T00:32:59.459Z] E                   	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
[2021-09-22T00:32:59.459Z] E                   	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
[2021-09-22T00:32:59.459Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:653)
[2021-09-22T00:32:59.459Z] E                   	at scala.util.Try$.apply(Try.scala:213)
[2021-09-22T00:32:59.459Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:653)
[2021-09-22T00:32:59.459Z] E                   	at scala.util.Failure.orElse(Try.scala:224)
[2021-09-22T00:32:59.459Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:653)
[2021-09-22T00:32:59.459Z] E                   	... 14 more
```
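For context on why the `Caused by` names `ArrowColumnarDataSourceV2.DefaultSource` rather than the class we asked for: Spark's `DataSource.lookupDataSource` tries the provider name as a class first, and on failure retries with `.DefaultSource` appended, so the innermost exception reports the last candidate tried. A minimal Python sketch of that fallback (simplified illustration, not the actual Spark Scala code):

```python
def lookup_data_source(provider: str, loadable: set) -> str:
    """Mimic Spark's Try(loadClass(provider)).orElse(
    Try(loadClass(provider + ".DefaultSource"))) fallback.
    `loadable` stands in for the classes visible on the classpath."""
    for candidate in (provider, provider + ".DefaultSource"):
        if candidate in loadable:
            return candidate
    # Neither candidate resolved: the test jar is not on the classpath.
    raise RuntimeError(f"Failed to find data source: {provider}")

provider = ("com.nvidia.spark.rapids.tests.datasourcev2.parquet."
            "ArrowColumnarDataSourceV2")
try:
    lookup_data_source(provider, loadable=set())
except RuntimeError as e:
    print(e)
```

With an empty classpath both candidates fail, which matches the error above: the outer message names the provider, while the `Caused by` carries the final `.DefaultSource` attempt.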

Failure list:

```
[2021-09-22T00:32:59.463Z] FAILED ../../src/main/python/datasourcev2_read_test.py::test_read_int - py4j....
[2021-09-22T00:32:59.463Z] FAILED ../../src/main/python/datasourcev2_read_test.py::test_read_strings - p...
[2021-09-22T00:32:59.463Z] FAILED ../../src/main/python/datasourcev2_read_test.py::test_read_all_types
[2021-09-22T00:32:59.463Z] FAILED ../../src/main/python/datasourcev2_read_test.py::test_read_all_types_count
[2021-09-22T00:32:59.463Z] FAILED ../../src/main/python/datasourcev2_read_test.py::test_read_arrow_off
```
@pxLi pxLi added bug Something isn't working ? - Needs Triage Need team to review and classify labels Sep 22, 2021
@tgravescs
I bet these are not picking up the shim-specific version of the integration tests jar. I seem to remember thinking the scripts would still work, but we put in a change for the run script to look specifically for the jar with the classifier version.

pxLi commented Sep 22, 2021

Looks like this is related to #3533, since we hardcoded spark301 in the spark-nightly script.

tgravescs commented Sep 22, 2021

Yes, we need to pull either all of the jars or the shim-specific jar at: https://github.com/NVIDIA/spark-rapids/blob/branch-21.10/jenkins/spark-tests.sh#L37
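A minimal sketch of the naming involved (the artifact and version strings below are hypothetical examples, not the actual values from jenkins/spark-tests.sh): the point is that the integration-tests jar name must include the shim classifier for the Spark version under test, instead of a hardcoded `spark301`.

```python
def integration_tests_jar(artifact: str, version: str, shim: str) -> str:
    """Build a classifier-qualified jar file name, e.g.
    <artifact>-<version>-<shim>.jar, where `shim` is the Spark shim
    classifier (spark301, spark312, ...)."""
    return f"{artifact}-{version}-{shim}.jar"

# Hypothetical artifact/version; the shim should come from the tested
# Spark version rather than being pinned to spark301.
print(integration_tests_jar(
    "rapids-4-spark-integration-tests_2.12", "21.10.0", "spark312"))
```

Pulling the jar matching the active shim (or all shim jars) would make `com.nvidia.spark.rapids.tests.datasourcev2.parquet.ArrowColumnarDataSourceV2` resolvable again on non-spark301 runs.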

@pxLi pxLi self-assigned this Sep 22, 2021
@pxLi pxLi added test Only impacts tests and removed ? - Needs Triage Need team to review and classify labels Sep 22, 2021
@tgravescs tgravescs added the P0 Must have for release label Sep 22, 2021