Support _ in spark conf of integration tests #6358

Merged: 16 commits, Aug 23, 2022

Conversation

@gerashegalov (Collaborator) commented Aug 18, 2022

Fixes #6351

  • Allow escaping _ in a Spark conf key by duplicating it (__), which makes it possible to run the Iceberg pytests with xdist (see the sketch below)

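As a minimal sketch of the escaping rule (the helper name is hypothetical; the real logic lives in the integration-test harness): a single _ in a PYSP_TEST_ variable name becomes . in the Spark conf key, while a doubled __ survives as a literal _.

```python
def env_to_conf_key(env_name: str) -> str:
    # Hypothetical illustration of the mapping. Split on the escaped
    # double underscore first so it survives as a literal underscore,
    # then turn the remaining single underscores into dots.
    body = env_name[len("PYSP_TEST_"):]
    parts = body.split("__")
    return "_".join(p.replace("_", ".") for p in parts)

assert env_to_conf_key("PYSP_TEST_spark_sql_catalog_spark__catalog") \
    == "spark.sql.catalog.spark_catalog"
```

This is what makes conf keys such as spark.sql.catalog.spark_catalog expressible at all, as in the exports further below.
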
Additional refactoring and fixes

  • Remove a stale comment from spark-tests.sh
  • Allow overriding PYSP_TEST_spark_rapids_memory_gpu_allocSize so that multiple Spark apps with multiple executors can run per pytest session with TEST_PARALLEL >= 2 on a small GPU
  • Add support for --packages and --jars in xdist using findspark
    • Fix xdist failing to init findspark when pyspark is already on PYTHONPATH
    • Enforce findspark usage with xdist
      • Don't start Spark or allocate GPU resources for the xdist master
      • Remove direct imports of pyspark from conftest to make sure they happen only after session start (see the sketch after this list)
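
A minimal sketch of the deferred-import idea, as a hypothetical fixture rather than the PR's actual conftest code: keeping the pyspark import local to the fixture means the xdist master process never loads pyspark (and never claims a GPU); only the worker processes start a session.

```python
import pytest

@pytest.fixture(scope="session")
def spark_session():
    # Deferred import: only test-running xdist workers reach this point,
    # after findspark has had a chance to set up sys.path.
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.getOrCreate()
    yield spark
    spark.stop()
```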

Will file a follow-up issue to convert the non-xdist tests that rely on SPARK_SUBMIT_ARGS for external sources such as Delta and Iceberg over to xdist. The following works locally:

```bash
export TEST_PARALLEL=2
export SPARK_HOME=~/dist/spark-3.2.2-bin-hadoop3.2
export PYSP_TEST_spark_master="spark://$(hostname):7077"
export PYSP_TEST_spark_executor_cores="2"
export PYSP_TEST_spark_cores_max="4"
export PYSP_TEST_spark_jars_packages="org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:0.13.2"
export PYSP_TEST_spark_sql_extensions="org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
export PYSP_TEST_spark_sql_catalog_spark__catalog="org.apache.iceberg.spark.SparkSessionCatalog"
export PYSP_TEST_spark_sql_catalog_spark__catalog_type="hadoop"
export PYSP_TEST_spark_sql_catalog_spark__catalog_warehouse="$(mktemp -d -p /tmp spark-warehouse-XXXXXX)"
export PYSP_TEST_spark_rapids_memory_gpu_allocSize='512m'
./integration_tests/run_pyspark_from_build.sh -s -m iceberg --iceberg
```

Signed-off-by: Gera Shegalov <gera@apache.org>

@gerashegalov self-assigned this, Aug 18, 2022
@gerashegalov added this to the Aug 8 - Aug 19 milestone, Aug 18, 2022
@gerashegalov added the labels bug (Something isn't working) and test (Only impacts tests), Aug 18, 2022
@gerashegalov commented: build

@revans2 (Collaborator) left a comment:
Most of the issues I have are nits. Combining a lot of small things into a single PR is not ideal for me, but I am okay with it. I was just confused because none of this was reflected in the PR title; I had to read the description to know what was happening.

The one change I really care about is setting the memory size for the test.

(Resolved review threads on integration_tests/README.md, integration_tests/run_pyspark_from_build.sh, and jenkins/spark-tests.sh.)
@gerashegalov requested a review from revans2, Aug 18, 2022 15:54
@gerashegalov commented: Can't repro the CI failure locally, retrying.

@gerashegalov commented: build

@gerashegalov commented: Since it's a very basic ClassNotFoundException for the SQLPlugin class, it may be related to a findspark version discrepancy.

@gerashegalov commented: build

@gerashegalov commented Aug 19, 2022:

The reason there is a discrepancy with CI is that CI explicitly puts pyspark on the PYTHONPATH:

2022-08-18T12:53:56.6273334Z [2022-08-18T12:46:09.218Z] + export PYTHONPATH=/home/jenkins/agent/workspace/jenkins-rapids_premerge-github-5402-ci-2/.download/spark-3.1.1-bin-hadoop3.2/python:/home/jenkins/agent/workspace/jenkins-rapids_premerge-github-5402-ci-2/.download/spark-3.1.1-bin-hadoop3.2/python/pyspark/:/home/jenkins/agent/workspace/jenkins-rapids_premerge-github-5402-ci-2/.download/spark-3.1.1-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip

Thus the code that relies on pyspark not being importable, in order to fall back on findspark initialization, never kicks in.

On the other hand, in the wrapper script we explicitly avoid using xdist if findspark is not found.
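
A minimal sketch of the resulting approach, using a hypothetical helper and parameter names rather than the PR's actual code: under xdist, initialize findspark unconditionally from SPARK_HOME instead of relying on an ImportError fallback, since pyspark may already be importable from PYTHONPATH.

```python
import os

def init_findspark_for_xdist(packages=None, jars=None):
    # Hypothetical helper. Initialize findspark even when pyspark is
    # already importable (e.g. CI exports the Spark distribution on
    # PYTHONPATH), so initialization never depends on an ImportError.
    import findspark
    findspark.init(os.environ["SPARK_HOME"])
    if packages:
        findspark.add_packages(packages)  # sets --packages via PYSPARK_SUBMIT_ARGS
    if jars:
        findspark.add_jars(jars)          # likewise for --jars
    import pyspark  # resolved against $SPARK_HOME after init
    return pyspark
```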

@gerashegalov changed the title from "Support _ in spark conf of integration tests" to "Support _ in spark conf of integration tests [databricks]", Aug 20, 2022
@gerashegalov commented: build

@gerashegalov commented: build
@gerashegalov commented: build

@gerashegalov commented: build

@gerashegalov commented: build

@gerashegalov commented: build

@revans2 previously approved these changes Aug 22, 2022

@revans2 (Collaborator) left a comment:

Thanks for explaining the changes. This is fine. I would just like to see a comment on the one thing that is still left, but it is not critical.

(Resolved review thread on integration_tests/run_pyspark_from_build.sh.)
@gerashegalov changed the title from "Support _ in spark conf of integration tests [databricks]" back to "Support _ in spark conf of integration tests", Aug 22, 2022
@gerashegalov commented: build

Successfully merging this pull request may close the following issue:

[BUG] Implement escape characters for spark property encoding in PYSP_TEST env variables (#6351)