Store Cluster Shape Recommendation in User Tools Qualification Output #1005

Merged
merged 4 commits into from
May 23, 2024

Conversation

parthosa
Collaborator

@parthosa parthosa commented May 9, 2024

Fixes #1004. This PR stores the cluster shape recommendation in a file, cluster_shape_recommendation.json, inside the user tools intermediate output directory, qual_202405xxxx_yyyy/intermediate_output.

CMD:

spark_rapids qualification --platform <platform> --eventlogs <eventlogs> --tools_jar <local tools dev jar path>

Note:

  • The --tools_jar <local tools dev jar path> parameter is required until the latest changes are published on Maven.

Output:

Output File Location: qual_202405xxxx_yyyy/intermediate_output/cluster_shape_recommendation.json

emr/databricks
[
  {
    "clusterName": "1234-5678-test",
    "sourceCluster": {
      "driverInstance": "m6gd.xlarge",
      "executorInstance": "m6gd.2xlarge",
      "numExecutors": 2
    },
    "targetCluster": {
      "driverInstance": "m6gd.xlarge",
      "executorInstance": "g5.2xlarge",
      "numExecutors": 2
    }
  }
]
dataproc
[
  {
    "clusterName": "default-cluster-name",
    "sourceCluster": {
      "driverInstance": "n1-standard-16",
      "executorInstance": "n1-standard-16",
      "numExecutors": 4
    },
    "targetCluster": {
      "driverInstance": "n1-standard-16",
      "executorInstance": "n1-standard-16",
      "numExecutors": 4,
      "gpuInfo": {
        "device": "nvidia-tesla-t4",
        "gpuPerWorker": 2
      },
      "additionalConfig": {
        "localSsd": 2
      }
    }
  }
]
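
A downstream script could consume this file along the following lines. This is only a minimal sketch that assumes the JSON layout shown in the samples above; the helper name summarize_recommendations is hypothetical and is not part of the tools.

```python
import json
from pathlib import Path


def summarize_recommendations(path):
    """Return one summary line per cluster found in the recommendation file.

    Hypothetical helper: it only assumes the keys shown in the sample output
    (clusterName, sourceCluster, targetCluster, and an optional gpuInfo).
    """
    entries = json.loads(Path(path).read_text(encoding="utf-8"))
    lines = []
    for entry in entries:
        source = entry["sourceCluster"]
        target = entry["targetCluster"]
        line = (
            f"{entry['clusterName']}: "
            f"{source['numExecutors']} x {source['executorInstance']} -> "
            f"{target['numExecutors']} x {target['executorInstance']}"
        )
        # gpuInfo appears only in the dataproc sample, so treat it as optional.
        gpu = target.get("gpuInfo")
        if gpu:
            line += f" ({gpu['gpuPerWorker']} x {gpu['device']} per worker)"
        lines.append(line)
    return lines
```

Treating gpuInfo as optional keeps the same reader usable for the emr/databricks layout, where the GPU is implied by the executor instance type.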

Console Output:

    - Intermediate output generated by tools: /<path>/qual_20240522004525_xxxx/intermediate_output

Changes

  1. New Method: Added get_cluster_configuration(), which returns a dictionary containing the cluster shape and other configurations. This method can be overridden by platform-specific classes to provide additional properties.
  2. Introduced __generate_cluster_recommendation_report(), which uses get_cluster_configuration() to generate the CPU-to-GPU cluster shape recommendation.
  3. During generation of the user tools report, the cluster shape recommendation is written to a JSON file.
  4. Additionally, cluster creation scripts can now use get_cluster_configuration() to fill the rendering arguments.
    • This is done to avoid redundant code.
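
The override pattern described in these changes can be sketched roughly as follows. Only the method name get_cluster_configuration() and the JSON keys come from this PR; the classes BaseCluster and DataprocLikeCluster, the function write_cluster_recommendation, and all constructor parameters are hypothetical stand-ins and do not reflect the actual code in the repository.

```python
import json
from pathlib import Path


class BaseCluster:
    """Hypothetical platform-agnostic cluster holding the common shape."""

    def __init__(self, driver_instance, executor_instance, num_executors):
        self.driver_instance = driver_instance
        self.executor_instance = executor_instance
        self.num_executors = num_executors

    def get_cluster_configuration(self):
        # Properties shared by every platform.
        return {
            "driverInstance": self.driver_instance,
            "executorInstance": self.executor_instance,
            "numExecutors": self.num_executors,
        }


class DataprocLikeCluster(BaseCluster):
    """Hypothetical platform class that adds GPU and local SSD properties."""

    def __init__(self, driver_instance, executor_instance, num_executors,
                 gpu_device=None, gpu_per_worker=0, local_ssd=0):
        super().__init__(driver_instance, executor_instance, num_executors)
        self.gpu_device = gpu_device
        self.gpu_per_worker = gpu_per_worker
        self.local_ssd = local_ssd

    def get_cluster_configuration(self):
        # Start from the common shape, then append platform-specific fields.
        config = super().get_cluster_configuration()
        if self.gpu_device:
            config["gpuInfo"] = {
                "device": self.gpu_device,
                "gpuPerWorker": self.gpu_per_worker,
            }
        if self.local_ssd:
            config["additionalConfig"] = {"localSsd": self.local_ssd}
        return config


def write_cluster_recommendation(cluster_name, source, target, output_dir):
    """Write the recommendation file; return its path, or None if skipped."""
    # Assumption: without a source (or target) cluster there is nothing to report.
    if source is None or target is None:
        return None
    report = [{
        "clusterName": cluster_name,
        "sourceCluster": source.get_cluster_configuration(),
        "targetCluster": target.get_cluster_configuration(),
    }]
    out_path = Path(output_dir) / "cluster_shape_recommendation.json"
    out_path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return out_path
```

Because both the report writer and any cluster creation script would call the same method, a new platform property only needs to be added in one place.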

Testing

  1. Tested on different platforms - emr, dataproc, databricks and onprem.
  2. This file will not be generated if source cluster information is not available.

Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@parthosa parthosa added the feature request and user_tools labels May 9, 2024
@parthosa parthosa self-assigned this May 9, 2024
Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@amahussein
Collaborator

  • Tested on different platforms - emr, dataproc, databricks and onprem.

  • This file will not be generated if cluster information is not available.

Interesting, I thought that we do not have a cluster recommendation output section for DB and onPrem.

Collaborator

@amahussein amahussein left a comment

IIRC, we did not generate a cluster recommendation in the output for Databricks/onPrem.
Does this PR generate the cluster recommendations for those platforms?

@parthosa parthosa marked this pull request as draft May 14, 2024 18:58
@parthosa
Collaborator Author

Waiting for #1002, as it creates the intermediate_output directory. cluster_information.json will be stored in the intermediate directory.

# Conflicts:
#	user_tools/src/spark_rapids_pytools/rapids/qualification.py
#	user_tools/src/spark_rapids_pytools/resources/qualification-conf.yaml
@tgravescs
Collaborator

Can you please update the description with details about when this file is generated? I.e., qual tool, profiling tool, python vs java, etc.

# Conflicts:
#	user_tools/src/spark_rapids_pytools/rapids/qualification.py
@parthosa parthosa marked this pull request as ready for review May 22, 2024 01:13
@parthosa parthosa requested a review from amahussein May 22, 2024 01:13
@parthosa parthosa changed the title Generate cluster shape recommendation Store Cluster Shape Recommendation in User Tools Qualification Output May 22, 2024
@parthosa
Collaborator Author

parthosa commented May 22, 2024

Can you please update the description with details about when this file is generated? I.e., qual tool, profiling tool, python vs java, etc.

@tgravescs Updated the PR title and description to include more details about the changes.

Collaborator

@amahussein amahussein left a comment

Thanks @parthosa
LGTME

@parthosa parthosa merged commit a6fdc86 into NVIDIA:dev May 23, 2024
15 checks passed
@parthosa parthosa deleted the spark-rapids-tools-1004 branch May 23, 2024 15:41
Labels
feature request New feature or request user_tools Scope the wrapper module running CSP, QualX, and reports (python)
Development

Successfully merging this pull request may close these issues.

[FEA] Store CLI cluster shape in an output file
3 participants