Add dynamic setup of gpu_limit in gke-job-template module #3319
Conversation
Please add a section/comment on verifying the change - is there a way to see how many GPUs were actually requested, e.g. with nvidia-smi?
Updated the PR description with logs from test jobs run on the develop and feature branches.
LGTM
Merged commit 549e252 into GoogleCloudPlatform:develop
Triggered PR tests for all blueprints that use the gke-node-pool and gke-job-template modules; all completed with OK status.

Verification
- nvidia-smi shows stats from 1 GPU (the default value set in the gke-job-template module)
- nvidia-smi shows stats from 8 GPUs (using the gke-node-pool module output to set this dynamically); the user can still override it with the var requested_gpu_per_pod
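For reference, a blueprint fragment along these lines could exercise the dynamic case described above. This is only a sketch: the module ids, source paths, container image, and the a3-highgpu-8g machine type are illustrative assumptions rather than values taken from this PR; requested_gpu_per_pod is the override variable named in the verification notes.

```yaml
# Sketch of a Cluster Toolkit blueprint fragment (ids, source paths, image,
# and machine type are illustrative assumptions, not taken from this PR).
deployment_groups:
  - group: primary
    modules:
      - id: network
        source: modules/network/vpc

      - id: gke_cluster
        source: modules/scheduler/gke-cluster
        use: [network]

      - id: gpu_pool
        source: modules/compute/gke-node-pool
        use: [gke_cluster]
        settings:
          machine_type: a3-highgpu-8g   # assumed node type with 8 GPUs per node

      - id: gpu_job
        source: modules/compute/gke-job-template
        use: [gpu_pool]   # GPU limit derived from the node pool output
        settings:
          image: nvidia/cuda:12.2.0-base-ubuntu22.04   # assumed image
          command:
            - nvidia-smi
          node_count: 1
          # Optional explicit override named in the verification notes above;
          # when omitted, the limit follows the gke-node-pool output rather
          # than the previous fixed default of 1.
          # requested_gpu_per_pod: 8
```

With a blueprint like this, the nvidia-smi output from the rendered job should reflect the node pool's GPU count (8 in the test above) instead of the previous default of 1.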
Submission Checklist
NOTE: Community submissions can take up to 2 weeks to be reviewed.
Please take the following actions before submitting this pull request.