test_sortmerge_join_struct_mixed_key_with_null_filter is failing on LeftSemi and LeftAnti:
LeftSemi:
14:53:32 _ test_sortmerge_join_struct_mixed_key_with_null_filter[LeftSemi-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])] _
14:53:32 [gw2] linux -- Python 3.8.11 /usr/bin/python
14:53:32
14:53:32 data_gen = Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])
14:53:32 join_type = 'LeftSemi'
14:53:32
14:53:32     @ignore_order(local=True)
14:53:32     @pytest.mark.parametrize('data_gen', struct_gens, ids=idfn)
14:53:32     @pytest.mark.parametrize('join_type', ['Inner', 'Left', 'Right', 'Cross', 'LeftSemi', 'LeftAnti'], ids=idfn)
14:53:32     def test_sortmerge_join_struct_mixed_key_with_null_filter(data_gen, join_type):
14:53:32         def do_join(spark):
14:53:32             left = two_col_df(spark, data_gen, int_gen, length=500)
14:53:32             right = two_col_df(spark, data_gen, int_gen, length=500)
14:53:32             return left.join(right, (left.a == right.a) & (left.b == right.b), join_type)
14:53:32         # Disable constraintPropagation to test null filter on built table with nullable structures.
14:53:32         conf = {'spark.sql.constraintPropagation.enabled': 'false', **_sortmerge_join_conf}
14:53:32 >       assert_gpu_and_cpu_are_equal_collect(do_join, conf=conf)
14:53:32
14:53:32 ../../src/main/python/join_test.py:623:
14:53:32 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
14:53:32 ../../src/main/python/asserts.py:440: in assert_gpu_and_cpu_are_equal_collect
14:53:32     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
14:53:32 ../../src/main/python/asserts.py:432: in _assert_gpu_and_cpu_are_equal
14:53:32     assert_equal(from_cpu, from_gpu)
14:53:32 ../../src/main/python/asserts.py:101: in assert_equal
14:53:32     _assert_equal(cpu, gpu, float_check=get_float_check(), path=[])
14:53:32 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
14:53:32
14:53:32 cpu = [Row(a=Row(child0='\x00\x08y®\x96\x0269', child1=-70, child2=-32768, child3=-1322782629, child4=-1690857710608059544, ...alse, child6=datetime.date(2735, 1, 11), child7=datetime.datetime(319, 4, 24, 18, 37, 44, 718000)), b=-504146606), ...]
14:53:32 gpu = [Row(a=None, b=None), Row(a=None, b=-1938242823), Row(a=None, b=-1902567188), Row(a=None, b=-1839266743), Row(a=None, b=-1771431272), Row(a=None, b=-1709949039), ...]
14:53:32 float_check = <function get_float_check.<locals>.<lambda> at 0x7f06ed78e940>
14:53:32 path = []
14:53:32
14:53:32     def _assert_equal(cpu, gpu, float_check, path):
14:53:32         t = type(cpu)
14:53:32         if (t is Row):
14:53:32             assert len(cpu) == len(gpu), "CPU and GPU row have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
14:53:32             if hasattr(cpu, "__fields__") and hasattr(gpu, "__fields__"):
14:53:32                 assert cpu.__fields__ == gpu.__fields__, "CPU and GPU row have different fields at {} CPU: {} GPU: {}".format(path, cpu.__fields__, gpu.__fields__)
14:53:32                 for field in cpu.__fields__:
14:53:32                     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
14:53:32             else:
14:53:32                 for index in range(len(cpu)):
14:53:32                     _assert_equal(cpu[index], gpu[index], float_check, path + [index])
14:53:32         elif (t is list):
14:53:32 >           assert len(cpu) == len(gpu), "CPU and GPU list have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
14:53:32 E           AssertionError: CPU and GPU list have different lengths at [] CPU: 450 GPU: 500
14:53:32
14:53:32 ../../src/main/python/asserts.py:40: AssertionError
LeftAnti:
14:53:32 _ test_sortmerge_join_struct_mixed_key_with_null_filter[LeftAnti-Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])] _
14:53:32 [gw2] linux -- Python 3.8.11 /usr/bin/python
14:53:32
14:53:32 data_gen = Struct(['child0', String(not_null)],['child1', Byte(not_null)],['child2', Short(not_null)],['child3', Integer(not_null)],['child4', Long(not_null)],['child5', Boolean(not_null)],['child6', Date(not_null)],['child7', Timestamp(not_null)])
14:53:32 join_type = 'LeftAnti'
14:53:32
14:53:32     @ignore_order(local=True)
14:53:32     @pytest.mark.parametrize('data_gen', struct_gens, ids=idfn)
14:53:32     @pytest.mark.parametrize('join_type', ['Inner', 'Left', 'Right', 'Cross', 'LeftSemi', 'LeftAnti'], ids=idfn)
14:53:32     def test_sortmerge_join_struct_mixed_key_with_null_filter(data_gen, join_type):
14:53:32         def do_join(spark):
14:53:32             left = two_col_df(spark, data_gen, int_gen, length=500)
14:53:32             right = two_col_df(spark, data_gen, int_gen, length=500)
14:53:32             return left.join(right, (left.a == right.a) & (left.b == right.b), join_type)
14:53:32         # Disable constraintPropagation to test null filter on built table with nullable structures.
14:53:32         conf = {'spark.sql.constraintPropagation.enabled': 'false', **_sortmerge_join_conf}
14:53:32 >       assert_gpu_and_cpu_are_equal_collect(do_join, conf=conf)
14:53:32
14:53:32 ../../src/main/python/join_test.py:623:
14:53:32 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
14:53:32 ../../src/main/python/asserts.py:440: in assert_gpu_and_cpu_are_equal_collect
14:53:32     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
14:53:32 ../../src/main/python/asserts.py:432: in _assert_gpu_and_cpu_are_equal
14:53:32     assert_equal(from_cpu, from_gpu)
14:53:32 ../../src/main/python/asserts.py:101: in assert_equal
14:53:32     _assert_equal(cpu, gpu, float_check=get_float_check(), path=[])
14:53:32 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
14:53:32
14:53:32 cpu = [Row(a=None, b=None), Row(a=None, b=-1938242823), Row(a=None, b=-1902567188), Row(a=None, b=-1839266743), Row(a=None, b=-1771431272), Row(a=None, b=-1709949039), ...]
14:53:32 gpu = []
14:53:32 float_check = <function get_float_check.<locals>.<lambda> at 0x7f06ef182b80>
14:53:32 path = []
14:53:32
14:53:32     def _assert_equal(cpu, gpu, float_check, path):
14:53:32         t = type(cpu)
14:53:32         if (t is Row):
14:53:32             assert len(cpu) == len(gpu), "CPU and GPU row have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
14:53:32             if hasattr(cpu, "__fields__") and hasattr(gpu, "__fields__"):
14:53:32                 assert cpu.__fields__ == gpu.__fields__, "CPU and GPU row have different fields at {} CPU: {} GPU: {}".format(path, cpu.__fields__, gpu.__fields__)
14:53:32                 for field in cpu.__fields__:
14:53:32                     _assert_equal(cpu[field], gpu[field], float_check, path + [field])
14:53:32             else:
14:53:32                 for index in range(len(cpu)):
14:53:32                     _assert_equal(cpu[index], gpu[index], float_check, path + [index])
14:53:32         elif (t is list):
14:53:32 >           assert len(cpu) == len(gpu), "CPU and GPU list have different lengths at {} CPU: {} GPU: {}".format(path, len(cpu), len(gpu))
14:53:32 E           AssertionError: CPU and GPU list have different lengths at [] CPU: 50 GPU: 0
14:53:32
14:53:32 ../../src/main/python/asserts.py:40: AssertionError
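The two failures look complementary: on the CPU, LeftSemi returns 450 rows and LeftAnti returns 50, while on the GPU LeftSemi returns all 500 (including rows with `a=None`) and LeftAnti returns none. One possible reading, not confirmed in this issue, is that the GPU path treats null join keys as equal, whereas `==` in a join condition should never match when either side is null. The pure-Python sketch below only illustrates that expected semi/anti behavior; `keys_match`, `left_semi`, `left_anti`, and the sample rows are hypothetical names and data made up for illustration, not part of the test suite or the plugin.

```python
# Illustrative sketch only: models null-aware equality for semi/anti joins.
# Rows are (a, b) pairs; `a` stands in for the struct key and may be None.

def keys_match(left_row, right_row):
    """SQL-style equality: a comparison involving a top-level null is never true."""
    if any(v is None for v in left_row) or any(v is None for v in right_row):
        return False
    return left_row == right_row

def left_semi(left, right):
    """Keep left rows that have at least one matching right row."""
    return [l for l in left if any(keys_match(l, r) for r in right)]

def left_anti(left, right):
    """Keep left rows that have no matching right row."""
    return [l for l in left if not any(keys_match(l, r) for r in right)]

# Joining a table with an identical copy of itself: every row whose keys are
# all non-null matches, and every row with a null key does not.
rows = [(("x", 1), 10), (None, 20), (("y", 2), None), (None, None)]
semi = left_semi(rows, rows)
anti = left_anti(rows, rows)
```

Under these assumptions the rows with a null key land in the LeftAnti result and never in the LeftSemi result, and the two results partition the left side, which matches the shape of the CPU output above (450 + 50 = 500).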
jlowe