[FEA] Enable integration tests in CI for Spark 3.2.0 without failing our regular builds #3483

Closed
gerashegalov opened this issue Sep 14, 2021 · 14 comments
Labels: P0 (Must have for release), test (Only impacts tests)

gerashegalov (Collaborator) commented Sep 14, 2021

Is your feature request related to a problem? Please describe.
Run all integration tests against Spark 3.2.0 using a multi-shim jar, at least with minimumFeatureVersionMix if not with all shims.

Note that failures may overlap with the unit test issue #3376.

Describe the solution you'd like
A part of CI that reports all failures against Spark 3.2.0 so we can track progress, but that does not fail the regular test builds. A rough sketch of the idea follows.
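As a rough illustration of the "track but do not fail" idea (not the actual Jenkins/pipeline configuration), a wrapper along these lines could run the Spark 3.2.0 integration tests and publish their results without affecting the build status; the report path below is an assumption:

```python
#!/usr/bin/env python3
# Hypothetical sketch of a "report but don't fail" wrapper for a Spark 3.2.0
# integration-test stage: run pytest over the integration tests, write a JUnit
# XML report that CI can publish for tracking, and always exit 0 so the
# regular build stays green. The report path is a made-up example.
import subprocess
import sys

result = subprocess.run([
    sys.executable, "-m", "pytest",
    "src/main/python",                            # integration tests, as in the paths below
    "--junitxml=target/spark320-it-results.xml",  # hypothetical report location
])

print(f"Spark 3.2.0 integration tests exited with {result.returncode}; "
      "failures are recorded in the report but do not fail this build")
sys.exit(0)
```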

Describe alternatives you've considered
Running locally with NUM_LOCAL_EXECS >= 2 probably captures most of the same failures, but CI is preferable.

Additional context
shim rework

gerashegalov added the 'feature request' and '? - Needs Triage' labels Sep 14, 2021
Salonijain27 added the 'P0' label and removed the '? - Needs Triage' label Sep 14, 2021
tgravescs (Collaborator) commented Sep 15, 2021

First, just a manual run: 232 tests failed.

• ANSI:

 
 
error_message = 'java.lang.ArithmeticException: divide by zero':
 
FAILED ../../src/main/python/arithmetic_ops_test.py::test_div_by_zero_ansi[1/0]
FAILED ../../src/main/python/arithmetic_ops_test.py::test_div_by_zero_ansi[a/0]
FAILED ../../src/main/python/arithmetic_ops_test.py::test_div_by_zero_ansi[a/b]
 

• csv_test.py:

 
 
pyspark.sql.utils.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.CommandResultExec
 
FAILED ../../src/main/python/csv_test.py::test_basic_read (and a lot more of these)
...
...

• date_time_test.py:

java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.spark.unsafe.types.CalendarInterval

FAILED ../../src/main/python/date_time_test.py::test_timesub[(-584, 1563)] - ...
 

• hash_aggregate_test.py:

 
 
pyspark.sql.utils.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.aggregate.HashAggregateExec
 
FAILED ../../src/main/python/hash_aggregate_test.py::test_hash_grpby_avg_nulls[true-{'spark.rapids.sql.variableFloatAgg.enabled': 'true', 'spark.rapids.sql.hasNans': 'false', 'spark.rapids.sql.castStringToFloat.enabled': 'true'}-[('a', RepeatSeq(String)), ('b', Integer), ('c', Null)]][IGNORE_ORDER]
 
These have a bunch of CPU/GPU data mismatches:
 
FAILED ../../src/main/python/hash_aggregate_test.py::test_hash_multiple_mode_query[{'spark.rapids.sql.variableFloatAgg.enabled': 'true', 'spark.rapids.sql.hasNans': 'false', 'spark.rapids.sql.castStringToFloat.enabled': 'true'}-[('a', RepeatSeq(Float)), ('b', Integer), ('c', Long)]][IGNORE_ORDER, INCOMPAT, APPROXIMATE_FLOAT]
 

 
 
• repart_test.py:

Values differ:

FAILED ../../src/main/python/repart_test.py::test_hash_repartition_exact[([('a', Float)], ['a'])-1][IGNORE_ORDER({'local': True})]
 
 

• udf_test.py (from the latest CI run):

 [2021-09-16T02:08:23.048Z] FAILED ../../src/main/python/udf_test.py::test_group_aggregate_udf_more_types[Float][IGNORE_ORDER({'local': True})]
[2021-09-16T02:08:23.048Z] FAILED ../../src/main/python/udf_test.py::test_group_aggregate_udf_more_types[Double][IGNORE_ORDER({'local': True})]
[2021-09-16T02:08:23.048Z] FAILED ../../src/main/python/udf_test.py::test_group_apply_udf_more_types[Float][IGNORE_ORDER]
[2021-09-16T02:08:23.048Z] FAILED ../../src/main/python/udf_test.py::test_group_apply_udf_more_types[Double][IGNORE_ORDER]
 
 

• qa_nightly_select_test.py:

[2021-09-16T02:08:23.048Z] FAILED ../../src/main/python/qa_nightly_select_test.py::test_needs_sort_select[SUM(byteF) OVER (PARTITION BY byteF ORDER BY CAST(dateF AS TIMESTAMP) RANGE BETWEEN INTERVAL 1 DAYS PRECEDING AND INTERVAL 1 DAYS FOLLOWING ) as sum_total][IGNORE_ORDER, INCOMPAT, APPROXIMATE_FLOAT] -> should be covered by the DayTimeInterval changes; the window exec falls back to the CPU

 
 
• window_function_test.py:

Results differ and some operations are not on the GPU:
 
FAILED ../../src/main/python/window_function_test.py::test_window_running_no_part[data:Byte-1000]
FAILED ../../src/main/python/window_function_test.py::test_window_running_no_part[data:Byte-1g]
FAILED ../../src/main/python/window_function_test.py::test_window_running_no_part[data:Short-1000]
FAILED ../../src/main/python/window_function_test.py::test_window_running_no_part[data:Short-1g]
FAILED ../../src/main/python/window_function_test.py::test_window_running_no_part[data:Integer-1000]
FAILED ../../src/main/python/window_function_test.py::test_window_running_no_part[data:Integer-1g]
FAILED ../../src/main/python/window_function_test.py::test_running_float_sum_no_part[1000][APPROXIMATE_FLOAT]
FAILED ../../src/main/python/window_function_test.py::test_running_float_sum_no_part[1g][APPROXIMATE_FLOAT]
FAILED ../../src/main/python/window_function_test.py::test_window_aggs_for_ranges_timestamps[[('a', RepeatSeq(Long)), ('b', Timestamp(not_null)), ('c', Integer)]][IGNORE_ORDER({'local': True})]
FAILED ../../src/main/python/window_function_test.py::test_window_aggs_for_ranges_timestamps[[('a', RepeatSeq(not_null)(Long(not_null))), ('b', Timestamp), ('c', Integer)]][IGNORE_ORDER({'local': True})]
 
 

tgravescs (Collaborator) commented Sep 15, 2021

error_message = 'java.lang.ArithmeticException: divide by zero':
 
FAILED ../../src/main/python/arithmetic_ops_test.py::test_div_by_zero_ansi[1/0]
FAILED ../../src/main/python/arithmetic_ops_test.py::test_div_by_zero_ansi[a/0]
FAILED ../../src/main/python/arithmetic_ops_test.py::test_div_by_zero_ansi[a/b]

This is caused by Spark now throwing SparkArithmeticException instead of java.lang.ArithmeticException.

#3499
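A minimal sketch of the kind of test-side change this implies (not necessarily how #3499 fixes it): make the expected error text depend on the Spark version under test. The helper below is hypothetical:

```python
# Hypothetical helper: choose the expected ANSI divide-by-zero error text for
# the Spark version under test. Spark 3.2.0+ raises SparkArithmeticException,
# while earlier versions raise java.lang.ArithmeticException.
# Assumes a plain "major.minor.patch" version string.
def expected_div_by_zero_error(spark_version: str) -> str:
    parts = tuple(int(p) for p in spark_version.split(".")[:3])
    if parts >= (3, 2, 0):
        return 'SparkArithmeticException: divide by zero'
    return 'java.lang.ArithmeticException: divide by zero'

# Example: expected_div_by_zero_error("3.2.0") -> 'SparkArithmeticException: divide by zero'
```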

tgravescs (Collaborator) commented:

csv_test was a setup issue; it is passing now.

revans2 (Collaborator) commented Sep 15, 2021

A lot of the window function test failures are going to be related to #3415. Once that is in, we can look at any remaining failing tests.

tgravescs (Collaborator) commented Sep 15, 2021

For date_time_test, it looks like on 3.1.1 the TimeAdd literal type was org.apache.spark.unsafe.types.CalendarInterval, but in 3.2.0 it is a Long, which results in:

java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.spark.unsafe.types.CalendarInterval

This is all related to the new handling of DayTimeIntervalType.
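A hedged PySpark repro sketch of the type change (assuming a local SparkSession); comparing the analyzed plans on 3.1.x and 3.2.0 should show the interval literal switching from CalendarInterval to the new DayTimeIntervalType, whose internal value is a Long of microseconds:

```python
# Hypothetical repro sketch of the literal type change behind the
# ClassCastException in test_timesub. Run against 3.1.x and 3.2.0 and compare
# how the interval literal is typed in the analyzed plan.
import datetime
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
df = spark.createDataFrame([(datetime.datetime(2021, 1, 1),)], "ts timestamp")

# On 3.1.x the subtracted interval is a CalendarInterval literal; on 3.2.0 it
# is expected to come through as a DayTimeIntervalType backed by a Long.
df.selectExpr("ts - INTERVAL 1563 MICROSECONDS").explain(extended=True)
```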

tgravescs (Collaborator) commented:

Also note I'm working on getting the 3.2.0 Jenkins job running; we just need to get the 3.2.1-SNAPSHOT stuff all the way through a build and get a jar.

revans2 (Collaborator) commented Sep 15, 2021

It turns out that Average was extended to support more types, so now the query is falling back to the CPU when it does not need to. I am going to dig into when some of the types were updated so I have a better idea of what needs to be changed.

revans2 (Collaborator) commented Sep 15, 2021

So it turns out that Spark is inserting an Average for NullType even though the Spark code technically says it does not support it. I can put in a quick fix to get the tests working again; it is just a little odd.
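A hedged repro sketch of the query shape (assuming a local SparkSession); the failing test aggregates a column generated with NullType, roughly like this:

```python
# Hypothetical repro sketch: avg() over a NullType column, the shape of
# test_hash_grpby_avg_nulls with a ('c', Null) column. On Spark 3.2.0 the plan
# reportedly still contains an Average over the NullType column, which is what
# the plugin did not expect.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import IntegerType, NullType, StructField, StructType

spark = SparkSession.builder.master("local[1]").getOrCreate()
schema = StructType([StructField("a", IntegerType()), StructField("c", NullType())])
df = spark.createDataFrame([(1, None), (2, None)], schema)
df.groupBy("a").agg(F.avg("c")).explain(extended=True)
```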

tgravescs (Collaborator) commented:

I didn't get a chance to look into repart_test very far, so if someone else has time: I made the data smaller, and the diff has a few rows where the hashed value isn't the same. This is on test_hash_repartition_exact.

jlowe (Member) commented Sep 16, 2021

The repart_test failures are caused by Spark now normalizing -0.0 to 0.0 when it did not before. A bit surprising they decided to break consistent hashing to fix this, but apparently they did. See SPARK-35207.
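A small PySpark check of the change (assuming a local SparkSession):

```python
# Hypothetical check of the SPARK-35207 behavior change: on Spark 3.2.0 the
# hash() expression normalizes -0.0 to 0.0 first, so both rows below get the
# same hash value; on earlier versions they hash differently.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.master("local[1]").getOrCreate()
spark.createDataFrame([(0.0,), (-0.0,)], "f double") \
     .select("f", F.hash("f").alias("h")) \
     .show()
```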

revans2 (Collaborator) commented Sep 17, 2021

> The repart_test failures are caused by Spark now normalizing -0.0 to 0.0 when it did not before. A bit surprising they decided to break consistent hashing to fix this, but apparently they did. See SPARK-35207.

But it is just -0.0; NaNs are still different from each other.

revans2 (Collaborator) commented Sep 17, 2021

Apparently this also impacts md5, but the tests there are not as complete. We should extend our binary support to get better testing for it.

revans2 (Collaborator) commented Sep 17, 2021

The only QA test still failing is:

FAILED ../../src/main/python/qa_nightly_select_test.py::test_needs_sort_select[SUM(byteF) OVER (PARTITION BY byteF ORDER BY CAST(dateF AS TIMESTAMP) RANGE BETWEEN INTERVAL 1 DAYS PRECEDING AND INTERVAL 1 DAYS FOLLOWING ) as sum_total][IGNORE_ORDER, INCOMPAT, APPROXIMATE_FLOAT]

It is most likely related to the window update because the range frame is over a DAY interval.
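A hedged repro sketch of that query shape (the tiny dataset below is a stand-in for the nightly QA table):

```python
# Hypothetical repro sketch of the remaining QA failure: a RANGE frame bounded
# by DAY intervals over a timestamp ordering. On Spark 3.2.0 those frame
# boundaries come through as DayTimeIntervalType, and the GPU window exec
# falls back to the CPU until the DayTimeInterval changes land.
import datetime
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
spark.createDataFrame(
    [(1, datetime.date(2021, 1, 1)), (1, datetime.date(2021, 1, 2))],
    "byteF tinyint, dateF date",
).createOrReplaceTempView("qa_table")  # stand-in for the nightly QA dataset

spark.sql("""
    SELECT SUM(byteF) OVER (
        PARTITION BY byteF
        ORDER BY CAST(dateF AS TIMESTAMP)
        RANGE BETWEEN INTERVAL 1 DAYS PRECEDING AND INTERVAL 1 DAYS FOLLOWING
    ) AS sum_total
    FROM qa_table
""").explain()
```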

tgravescs (Collaborator) commented:

All of these are passing now; if new ones show up, I'll file a separate issue.

sameerz added the 'test' label and removed the 'feature request' label Sep 27, 2021