[FEA] Remove gcloud CLI dependency for Dataproc platform #1223

Merged


cindyyuanjiang
Collaborator

Contributes to #1191

Changes

  • The Dataproc Qualification tool now reads instance type information from a generated instance catalog file.
  • During the initialization phase, the Dataproc Qualification tool previously called gcloud CLI commands to set its configuration properties. This PR updates the tool to read those properties from the local config file instead.
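
For readers unfamiliar with the approach, here is a minimal sketch of reading a gcloud property from the local config file without invoking the gcloud CLI. It is not the code from this PR. It assumes the standard gcloud layout (a config directory, by default `~/.config/gcloud` or the path in `CLOUDSDK_CONFIG`, holding an `active_config` file and INI-style `configurations/config_<name>` files). The function name `read_gcloud_property` and its parameters are hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def read_gcloud_property(section: str, key: str,
                         config_dir: Optional[Path] = None) -> Optional[str]:
    """Return a gcloud property from the local config file, or None if unset.

    Hypothetical helper for illustration; not part of spark_rapids tools.
    """
    if config_dir is None:
        # Assumed default location; CLOUDSDK_CONFIG overrides it when set.
        default_dir = Path.home() / ".config" / "gcloud"
        config_dir = Path(os.environ.get("CLOUDSDK_CONFIG", default_dir))
    config_dir = Path(config_dir)

    # The active configuration name lives in a one-line file; fall back to "default".
    active_file = config_dir / "active_config"
    name = active_file.read_text().strip() if active_file.is_file() else "default"

    # Configurations are INI files with sections such as [core], [compute], [dataproc].
    # Interpolation is disabled so literal '%' characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_dir / "configurations" / f"config_{name}")  # missing file is skipped
    return parser.get(section, key, fallback=None)
```

A call such as `read_gcloud_property("compute", "region")` would return the configured region, or `None` when the property or the file is absent, so the caller can decide how to handle missing settings.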

Testing

  • With cluster properties file
spark_rapids qualification --eventlogs <my-event-logs> --platform dataproc --cluster <my-cluster-props-file>

  • With cluster name
spark_rapids qualification --eventlogs <my-event-logs> --platform dataproc --cluster <my-cluster-name>

Signed-off-by: cindyyuanjiang <cindyj@nvidia.com>
@cindyyuanjiang
Collaborator Author

We plan to add support for Dataproc-GKE in a separate PR, since we encountered an issue when testing it. The issue needs to be resolved first.

@cindyyuanjiang cindyyuanjiang self-assigned this Jul 23, 2024
@cindyyuanjiang cindyyuanjiang added labels on Jul 23, 2024: feature request (New feature or request); user_tools (Scope the wrapper module running CSP, QualX, and reports (python)); usability (track issues related to the Tools's user experience)
Signed-off-by: cindyyuanjiang <cindyj@nvidia.com>
Collaborator

@parthosa parthosa left a comment


Thanks @cindyyuanjiang. I tested the functionality on a clean Ubuntu machine and left some minor refactoring comments.

Signed-off-by: cindyyuanjiang <cindyj@nvidia.com>
Collaborator

@parthosa parthosa left a comment


Thanks @cindyyuanjiang for this feature.

user_tools/src/spark_rapids_pytools/cloud_api/dataproc.py (outdated, resolved)
@amahussein amahussein merged commit 7ea8a10 into NVIDIA:dev Jul 25, 2024
14 checks passed
@cindyyuanjiang cindyyuanjiang deleted the spark-rapids-tools-1191-dataproc branch July 26, 2024 00:13
3 participants