
[BUG] test_dpp_via_aggregate_subquery_aqe_off failed with table already exists #4840

Closed
jlowe opened this issue Feb 22, 2022 · 1 comment · Fixed by #4851
Labels
bug Something isn't working test Only impacts tests

Comments


jlowe commented Feb 22, 2022

A nightly integration test run failed with a table already exists exception:

[2022-02-22T21:59:38.650Z] ______________ test_dpp_via_aggregate_subquery_aqe_off[2-parquet] ______________
[2022-02-22T21:59:38.650Z] 
[2022-02-22T21:59:38.650Z] store_format = 'parquet', s_index = 2
[2022-02-22T21:59:38.650Z] spark_tmp_table_factory = <conftest.TmpTableFactory object at 0x7f0151c0a9d0>
[2022-02-22T21:59:38.650Z] 
[2022-02-22T21:59:38.650Z]     @ignore_order
[2022-02-22T21:59:38.650Z]     @pytest.mark.parametrize('store_format', ['parquet', 'orc'], ids=idfn)
[2022-02-22T21:59:38.651Z]     @pytest.mark.parametrize('s_index', list(range(len(_statements))), ids=idfn)
[2022-02-22T21:59:38.651Z]     @pytest.mark.skipif(is_databricks_runtime(), reason="DPP can not cooperate with rapids plugin on Databricks runtime")
[2022-02-22T21:59:38.651Z]     def test_dpp_via_aggregate_subquery_aqe_off(store_format, s_index, spark_tmp_table_factory):
[2022-02-22T21:59:38.651Z] >       __dpp_via_aggregate_subquery(store_format, s_index, spark_tmp_table_factory, 'false')
[2022-02-22T21:59:38.651Z] 
[2022-02-22T21:59:38.651Z] ../../src/main/python/dpp_test.py:227: 
[2022-02-22T21:59:38.651Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-02-22T21:59:38.651Z] ../../src/main/python/dpp_test.py:212: in __dpp_via_aggregate_subquery
[2022-02-22T21:59:38.651Z]     create_fact_table(fact_table, store_format)
[2022-02-22T21:59:38.651Z] ../../src/main/python/dpp_test.py:55: in create_fact_table
[2022-02-22T21:59:38.651Z]     with_cpu_session(fn)
[2022-02-22T21:59:38.651Z] ../../src/main/python/spark_session.py:92: in with_cpu_session
[2022-02-22T21:59:38.651Z]     return with_spark_session(func, conf=copy)
[2022-02-22T21:59:38.651Z] ../../src/main/python/spark_session.py:76: in with_spark_session
[2022-02-22T21:59:38.651Z]     ret = func(_spark)
[2022-02-22T21:59:38.651Z] ../../src/main/python/dpp_test.py:51: in fn
[2022-02-22T21:59:38.651Z]     df.write.format(table_format) \
[2022-02-22T21:59:38.651Z] /home/jenkins/agent/workspace/jenkins-rapids_it-3.0.x-SNAPSHOT-dev-github-304/jars/spark-3.0.4-SNAPSHOT-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/readwriter.py:871: in saveAsTable
[2022-02-22T21:59:38.651Z]     self._jwrite.saveAsTable(name)
[2022-02-22T21:59:38.651Z] /home/jenkins/agent/workspace/jenkins-rapids_it-3.0.x-SNAPSHOT-dev-github-304/jars/spark-3.0.4-SNAPSHOT-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py:1304: in __call__
[2022-02-22T21:59:38.651Z]     return_value = get_return_value(
[2022-02-22T21:59:38.651Z] /home/jenkins/agent/workspace/jenkins-rapids_it-3.0.x-SNAPSHOT-dev-github-304/jars/spark-3.0.4-SNAPSHOT-bin-hadoop3.2/python/lib/pyspark.zip/pyspark/sql/utils.py:134: in deco
[2022-02-22T21:59:38.651Z]     raise_from(converted)
[2022-02-22T21:59:38.651Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2022-02-22T21:59:38.651Z] 
[2022-02-22T21:59:38.651Z] e = AnalysisException("Can not create the managed table('`tmp_table_981165_0`'). The associated location('file:/home/jenki...:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n", None)
[2022-02-22T21:59:38.651Z] 
[2022-02-22T21:59:38.651Z] >   ???
[2022-02-22T21:59:38.651Z] E   pyspark.sql.utils.AnalysisException: Can not create the managed table('`tmp_table_981165_0`'). The associated location('file:/home/jenkins/agent/workspace/jenkins-rapids_it-3.0.x-SNAPSHOT-dev-github-304/jars/integration_tests/target/run_dir_dpp_test/spark-warehouse/tmp_table_981165_0') already exists.;

Looks like we need to improve our table name random number generation. Either two threads ended up with the same random number sequence (based on how random.randint is seeded), or we hit the one-in-a-million chance that two threads independently picked the same number. If it's the latter, maybe we need to make it one-in-a-billion.
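Editor's note: a back-of-envelope birthday-bound calculation shows why "one in a million" is not as safe as it sounds when many tests draw names concurrently. This is an illustrative sketch, not code from the repository; the draw count and space size are hypothetical.

```python
def collision_probability(n_draws, space):
    """Probability that at least two of n_draws uniform random draws
    from a space of `space` values collide (birthday problem)."""
    p_no_collision = 1.0
    for i in range(n_draws):
        p_no_collision *= (space - i) / space
    return 1.0 - p_no_collision

# e.g. 100 concurrent tests drawing suffixes from a million-value space
# already gives roughly a 0.5% chance of a collision per run.
print(collision_probability(100, 10**6))
```

With nightly runs, even a fraction-of-a-percent per-run collision rate surfaces eventually, which is consistent with the failure above.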

@jlowe jlowe added bug Something isn't working ? - Needs Triage Need team to review and classify labels Feb 22, 2022

revans2 commented Feb 23, 2022

The concurrency is not per thread; it is per process. So we could use the PID again, like we did with the tmp directory.
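Editor's note: the PID-based approach could be sketched as follows. This is a hypothetical illustration, not the actual conftest.TmpTableFactory code: combining the process ID with a per-process counter makes names collision-proof across concurrent processes, unlike a purely random suffix.

```python
import itertools
import os

# Per-process monotonically increasing counter; itertools.count is a
# simple way to get a fresh integer on each call.
_counter = itertools.count()

def unique_table_name(prefix="tmp_table"):
    # The PID disambiguates between concurrent test processes; the
    # counter disambiguates names within a single process.
    return "{}_{}_{}".format(prefix, os.getpid(), next(_counter))
```

No randomness is involved, so there is no collision probability to reason about at all.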

@jlowe jlowe self-assigned this Feb 23, 2022
@jlowe jlowe added test Only impacts tests and removed ? - Needs Triage Need team to review and classify labels Feb 23, 2022