[BUG] join_test failed in integration tests #8061

Closed
NvTimLiu opened this issue Apr 9, 2023 · 6 comments
Labels: bug (Something isn't working), cudf_dependency (An issue or PR with this label depends on a new feature in cudf)

Comments

@NvTimLiu
Collaborator

NvTimLiu commented Apr 9, 2023

Describe the bug

These tests failed on Branch-23.06 only:

 FAILED ../../src/main/python/join_test.py::test_sortmerge_join_struct_as_key[Left-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])][INJECT_OOM, IGNORE_ORDER({'local': True})] - TypeError: object of type 'NoneType' has no len()

 FAILED ../../src/main/python/join_test.py::test_sortmerge_join_struct_as_key[Right-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])][INJECT_OOM, IGNORE_ORDER({'local': True})] - TypeError: object of type 'NoneType' has no len()

 FAILED ../../src/main/python/join_test.py::test_sortmerge_join_struct_mixed_key[Right-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])][INJECT_OOM, IGNORE_ORDER({'local': True})] - TypeError: object of type 'NoneType' has no len()

 FAILED ../../src/main/python/join_test.py::test_broadcast_join_right_struct_as_key[Left-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])][INJECT_OOM, IGNORE_ORDER({'local': True})] - TypeError: object of type 'NoneType' has no len()

 FAILED ../../src/main/python/join_test.py::test_broadcast_join_right_struct_as_key[Right-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])][IGNORE_ORDER({'local': True})] - TypeError: object of type 'NoneType' has no len()

 FAILED ../../src/main/python/join_test.py::test_broadcast_join_right_struct_mixed_key[Right-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])][INJECT_OOM, IGNORE_ORDER({'local': True})] - TypeError: object of type 'NoneType' has no len()

Detail log

test_sortmerge_join_struct_as_key[Right-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])]
 [gw1] linux -- Python 3.8.10 /usr/bin/python
 
 data_gen = Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])
 join_type = 'Right'
 
     @ignore_order(local=True)
     @pytest.mark.parametrize('data_gen', struct_gens, ids=idfn)
     @pytest.mark.parametrize('join_type', ['Inner', 'Left', 'Right', 'Cross', 'LeftSemi', 'LeftAnti'], ids=idfn)
     def test_sortmerge_join_struct_as_key(data_gen, join_type):
         def do_join(spark):
             left, right = create_df(spark, data_gen, 500, 250)
             return left.join(right, left.a == right.r_a, join_type)
 >       assert_gpu_and_cpu_are_equal_collect(do_join, conf=_sortmerge_join_conf)
 
../../src/main/python/join_test.py 672: 
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../src/main/python/asserts.py 562: in assert_gpu_and_cpu_are_equal_collect
     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
../../src/main/python/asserts.py 493: in _assert_gpu_and_cpu_are_equal
     assert_equal(from_cpu, from_gpu)
../../src/main/python/asserts.py 106: in assert_equal
     _assert_equal(cpu, gpu, float_check=get_float_check(), path=[])
../../src/main/python/asserts.py 42: in _assert_equal
     _assert_equal(cpu[index], gpu[index], float_check, path + [index])
../../src/main/python/asserts.py 35: in _assert_equal
     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
 cpu = Row(child0='\x02?i/\xad\x9d\x81è', child1=0, child2=-9690, child3=1021976800, child4=-7292110493237686856, child5=False, child6=datetime.date(4859, 10, 23), child7=datetime.datetime(5205, 4, 22, 0, 17, 58, 529789))
 gpu = None
 float_check = <function get_float_check.<locals>.<lambda> at 0x7fced8812160>
 path = [8, 'a']
 
     def _assert_equal(cpu, gpu, float_check, path):
         t = type(cpu)
         if (t is Row):
 >           assert len(cpu) == len(gpu), "CPU and GPU row have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
E           TypeError: object of type 'NoneType' has no len()
 
../../src/main/python/asserts.py 31: TypeError
 ----------------------------- Captured stdout call -----------------------------
 ### CPU RUN ###
 ### GPU RUN ###
 ### COLLECT: GPU TOOK 0.5862219333648682 CPU TOOK 0.5391867160797119 ###
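
Note that the TypeError above hides the actual mismatch: at path [8, 'a'] the CPU returned a struct Row while the GPU returned None, and the comparison crashed on len(gpu) before it could report that. A minimal sketch of a None-aware comparison follows. It is not the code in asserts.py; `assert_rows_equal` and the namedtuple `Row` are hypothetical stand-ins so the example runs without Spark.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row so the sketch runs without Spark.
Row = namedtuple("Row", ["child0", "child1"])


def assert_rows_equal(cpu, gpu, path):
    """Recursively compare CPU and GPU results and report where they diverge."""
    # Guard first: a null on only one side is a mismatch, not a crash.
    if cpu is None or gpu is None:
        assert cpu is None and gpu is None, \
            "CPU and GPU differ at {} CPU: {} GPU: {}".format(path, cpu, gpu)
        return
    assert type(cpu) is type(gpu), \
        "CPU and GPU types differ at {} CPU: {} GPU: {}".format(
            path, type(cpu).__name__, type(gpu).__name__)
    if isinstance(cpu, Row):
        # Struct value: compare field by field, extending the path with the name.
        for field in cpu._fields:
            assert_rows_equal(getattr(cpu, field), getattr(gpu, field), path + [field])
    elif isinstance(cpu, list):
        assert len(cpu) == len(gpu), \
            "CPU and GPU lengths differ at {} CPU: {} GPU: {}".format(
                path, len(cpu), len(gpu))
        for index in range(len(cpu)):
            assert_rows_equal(cpu[index], gpu[index], path + [index])
    else:
        assert cpu == gpu, \
            "CPU and GPU differ at {} CPU: {} GPU: {}".format(path, cpu, gpu)
```

With a guard like this, a struct key that comes back null on only one side fails with an AssertionError naming the path and both values, instead of a TypeError.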
NvTimLiu added the bug and "? - Needs Triage" labels Apr 9, 2023
jlowe self-assigned this Apr 10, 2023
@jlowe
Member

jlowe commented Apr 10, 2023

I suspect this is caused by a change in cudf. I rolled back the plugin source by a week and it still fails, implying it's not something in the plugin source that is triggering the failure.

@ttnghia
Collaborator

ttnghia commented Apr 10, 2023

Maybe related to rapidsai/cudf#12787? @jlowe Can you try reverting this in spark-rapids-jni and test it?

@jlowe
Member

jlowe commented Apr 10, 2023

> Maybe related to rapidsai/cudf#12787?

Yes that's my theory as well. Testing it now.

@jlowe
Member

jlowe commented Apr 10, 2023

Yep, tests pass with this reverted and fail when it is restored.

@ttnghia
Collaborator

ttnghia commented Apr 12, 2023

Tested rapidsai/cudf#13120 against the failed tests. All these tests pass with that fix.

@ttnghia
Collaborator

ttnghia commented Apr 13, 2023

Closing as resolved.

ttnghia closed this as completed Apr 13, 2023
sameerz added the cudf_dependency label Apr 16, 2023