Skip to content

Commit

Permalink
Move to SlurmGCP image 6.7
Browse files Browse the repository at this point in the history
  • Loading branch information
mr0re1 committed Sep 24, 2024
1 parent fe4a73f commit 625bd9c
Show file tree
Hide file tree
Showing 82 changed files with 446 additions and 446 deletions.
2 changes: 1 addition & 1 deletion community/examples/AMD/hpc-amd-slurm.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,7 @@ deployment_groups:
# these images must match the images used by Slurm modules below because
# we are building OpenMPI with PMI support in libraries contained in
# Slurm installation
family: slurm-gcp-6-6-hpc-rocky-linux-8
family: slurm-gcp-6-7-hpc-rocky-linux-8
project: schedmd-slurm-public

- id: low_cost_nodeset
Expand Down
2 changes: 1 addition & 1 deletion community/examples/hpc-slurm-ubuntu2004.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ vars:
slurm_image:
# Please refer to the following link for the latest images:
# https://github.com/GoogleCloudPlatform/slurm-gcp/blob/master/docs/images.md#supported-operating-systems
family: slurm-gcp-6-6-ubuntu-2004-lts
family: slurm-gcp-6-7-ubuntu-2004-lts
project: schedmd-slurm-public
instance_image_custom: true

Expand Down
2 changes: 1 addition & 1 deletion community/examples/hpc-slurm6-apptainer.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ deployment_groups:
settings:
source_image_project_id: [schedmd-slurm-public]
# see latest in https://github.com/GoogleCloudPlatform/slurm-gcp/blob/master/docs/images.md#published-image-family
source_image_family: slurm-gcp-6-6-hpc-rocky-linux-8
source_image_family: slurm-gcp-6-7-hpc-rocky-linux-8
# You can find size of source image by using following command
# gcloud compute images describe-from-family <source_image_family> --project schedmd-slurm-public
disk_size: $(vars.disk_size)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,12 +71,12 @@ limitations under the License.
| <a name="input_folder_id"></a> [folder\_id](#input\_folder\_id) | Folder ID where the project should be created. It can be skipped if already setting organization\_id. Leave blank if the project should be created directly underneath the Organization node. | `string` | `""` | no |
| <a name="input_image_family"></a> [image\_family](#input\_image\_family) | DEPRECATED: Image of the AI notebook. | `string` | `null` | no |
| <a name="input_image_project"></a> [image\_project](#input\_image\_project) | DEPRECATED: Google Cloud project where the image is hosted. | `string` | `null` | no |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | Image of the AI notebook.<br/><br/>Expected Fields:<br/>name: The name of the image. Mutually exclusive with family.<br/>family: The image family to use. Mutually exclusive with name.<br/>project: The project where the image is hosted. | `map(string)` | <pre>{<br/> "family": "tf-latest-cpu",<br/> "project": "deeplearning-platform-release"<br/>}</pre> | no |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | Image of the AI notebook.<br><br>Expected Fields:<br>name: The name of the image. Mutually exclusive with family.<br>family: The image family to use. Mutually exclusive with name.<br>project: The project where the image is hosted. | `map(string)` | <pre>{<br> "family": "tf-latest-cpu",<br> "project": "deeplearning-platform-release"<br>}</pre> | no |
| <a name="input_ip_cidr_range"></a> [ip\_cidr\_range](#input\_ip\_cidr\_range) | Unique IP CIDR Range for AI Notebooks subnet | `string` | `"10.142.190.0/24"` | no |
| <a name="input_machine_type"></a> [machine\_type](#input\_machine\_type) | Type of VM you would like to spin up | `string` | `"n1-standard-1"` | no |
| <a name="input_network_name"></a> [network\_name](#input\_network\_name) | Name of the network to be created. | `string` | `"ai-notebook"` | no |
| <a name="input_organization_id"></a> [organization\_id](#input\_organization\_id) | Organization ID where GCP Resources need to get spin up. It can be skipped if already setting folder\_id | `string` | `""` | no |
| <a name="input_owner_id"></a> [owner\_id](#input\_owner\_id) | Billing Account associated to the GCP Resources | `list(any)` | <pre>[<br/> ""<br/>]</pre> | no |
| <a name="input_owner_id"></a> [owner\_id](#input\_owner\_id) | Billing Account associated to the GCP Resources | `list(any)` | <pre>[<br> ""<br>]</pre> | no |
| <a name="input_project"></a> [project](#input\_project) | Project in which to launch the AI Notebooks. | `string` | `""` | no |
| <a name="input_project_name"></a> [project\_name](#input\_project\_name) | Project name or ID, if it's an existing project. | `string` | `"gcluster-discovery"` | no |
| <a name="input_random_id"></a> [random\_id](#input\_random\_id) | Adds a suffix of 4 random characters to the `project_id` | `string` | `null` | no |
Expand Down
12 changes: 6 additions & 6 deletions community/modules/compute/htcondor-execute-point/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -226,7 +226,7 @@ limitations under the License.
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_allow_automatic_updates"></a> [allow\_automatic\_updates](#input\_allow\_automatic\_updates) | If false, disables automatic system package updates on the created instances. This feature is<br/>only available on supported images (or images derived from them). For more details, see<br/>https://cloud.google.com/compute/docs/instances/create-hpc-vm#disable_automatic_updates | `bool` | `true` | no |
| <a name="input_allow_automatic_updates"></a> [allow\_automatic\_updates](#input\_allow\_automatic\_updates) | If false, disables automatic system package updates on the created instances. This feature is<br>only available on supported images (or images derived from them). For more details, see<br>https://cloud.google.com/compute/docs/instances/create-hpc-vm#disable_automatic_updates | `bool` | `true` | no |
| <a name="input_central_manager_ips"></a> [central\_manager\_ips](#input\_central\_manager\_ips) | List of IP addresses of HTCondor Central Managers | `list(string)` | n/a | yes |
| <a name="input_deployment_name"></a> [deployment\_name](#input\_deployment\_name) | Cluster Toolkit deployment name. HTCondor cloud resource names will include this value. | `string` | n/a | yes |
| <a name="input_disk_size_gb"></a> [disk\_size\_gb](#input\_disk\_size\_gb) | Boot disk size in GB | `number` | `100` | no |
Expand All @@ -236,21 +236,21 @@ limitations under the License.
| <a name="input_enable_shielded_vm"></a> [enable\_shielded\_vm](#input\_enable\_shielded\_vm) | Enable the Shielded VM configuration (var.shielded\_instance\_config). | `bool` | `false` | no |
| <a name="input_execute_point_runner"></a> [execute\_point\_runner](#input\_execute\_point\_runner) | A list of Toolkit runners for configuring an HTCondor execute point | `list(map(string))` | `[]` | no |
| <a name="input_execute_point_service_account_email"></a> [execute\_point\_service\_account\_email](#input\_execute\_point\_service\_account\_email) | Service account for HTCondor execute point (e-mail format) | `string` | n/a | yes |
| <a name="input_guest_accelerator"></a> [guest\_accelerator](#input\_guest\_accelerator) | List of the type and count of accelerator cards attached to the instance. | <pre>list(object({<br/> type = string,<br/> count = number<br/> }))</pre> | `[]` | no |
| <a name="input_guest_accelerator"></a> [guest\_accelerator](#input\_guest\_accelerator) | List of the type and count of accelerator cards attached to the instance. | <pre>list(object({<br> type = string,<br> count = number<br> }))</pre> | `[]` | no |
| <a name="input_htcondor_bucket_name"></a> [htcondor\_bucket\_name](#input\_htcondor\_bucket\_name) | Name of HTCondor configuration bucket | `string` | n/a | yes |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | HTCondor execute point VM image<br/><br/>Expected Fields:<br/>name: The name of the image. Mutually exclusive with family.<br/>family: The image family to use. Mutually exclusive with name.<br/>project: The project where the image is hosted. | `map(string)` | <pre>{<br/> "family": "hpc-rocky-linux-8",<br/> "project": "cloud-hpc-image-public"<br/>}</pre> | no |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | HTCondor execute point VM image<br><br>Expected Fields:<br>name: The name of the image. Mutually exclusive with family.<br>family: The image family to use. Mutually exclusive with name.<br>project: The project where the image is hosted. | `map(string)` | <pre>{<br> "family": "hpc-rocky-linux-8",<br> "project": "cloud-hpc-image-public"<br>}</pre> | no |
| <a name="input_labels"></a> [labels](#input\_labels) | Labels to add to HTConodr execute points | `map(string)` | n/a | yes |
| <a name="input_machine_type"></a> [machine\_type](#input\_machine\_type) | Machine type to use for HTCondor execute points | `string` | `"n2-standard-4"` | no |
| <a name="input_max_size"></a> [max\_size](#input\_max\_size) | Maximum size of the HTCondor execute point pool. | `number` | `5` | no |
| <a name="input_metadata"></a> [metadata](#input\_metadata) | Metadata to add to HTCondor execute points | `map(string)` | `{}` | no |
| <a name="input_min_idle"></a> [min\_idle](#input\_min\_idle) | Minimum number of idle VMs in the HTCondor pool (if pool reaches var.max\_size, this minimum is not guaranteed); set to ensure jobs beginning run more quickly. | `number` | `0` | no |
| <a name="input_name_prefix"></a> [name\_prefix](#input\_name\_prefix) | Name prefix given to hostnames in this group of execute points; must be unique across all instances of this module | `string` | n/a | yes |
| <a name="input_network_self_link"></a> [network\_self\_link](#input\_network\_self\_link) | The self link of the network HTCondor execute points will join | `string` | `"default"` | no |
| <a name="input_network_storage"></a> [network\_storage](#input\_network\_storage) | An array of network attached storage mounts to be configured | <pre>list(object({<br/> server_ip = string,<br/> remote_mount = string,<br/> local_mount = string,<br/> fs_type = string,<br/> mount_options = string,<br/> client_install_runner = map(string)<br/> mount_runner = map(string)<br/> }))</pre> | `[]` | no |
| <a name="input_network_storage"></a> [network\_storage](#input\_network\_storage) | An array of network attached storage mounts to be configured | <pre>list(object({<br> server_ip = string,<br> remote_mount = string,<br> local_mount = string,<br> fs_type = string,<br> mount_options = string,<br> client_install_runner = map(string)<br> mount_runner = map(string)<br> }))</pre> | `[]` | no |
| <a name="input_project_id"></a> [project\_id](#input\_project\_id) | Project in which the HTCondor execute points will be created | `string` | n/a | yes |
| <a name="input_region"></a> [region](#input\_region) | The region in which HTCondor execute points will be created | `string` | n/a | yes |
| <a name="input_service_account_scopes"></a> [service\_account\_scopes](#input\_service\_account\_scopes) | Scopes by which to limit service account attached to central manager. | `set(string)` | <pre>[<br/> "https://www.googleapis.com/auth/cloud-platform"<br/>]</pre> | no |
| <a name="input_shielded_instance_config"></a> [shielded\_instance\_config](#input\_shielded\_instance\_config) | Shielded VM configuration for the instance (must set var.enabled\_shielded\_vm) | <pre>object({<br/> enable_secure_boot = bool<br/> enable_vtpm = bool<br/> enable_integrity_monitoring = bool<br/> })</pre> | <pre>{<br/> "enable_integrity_monitoring": true,<br/> "enable_secure_boot": true,<br/> "enable_vtpm": true<br/>}</pre> | no |
| <a name="input_service_account_scopes"></a> [service\_account\_scopes](#input\_service\_account\_scopes) | Scopes by which to limit service account attached to central manager. | `set(string)` | <pre>[<br> "https://www.googleapis.com/auth/cloud-platform"<br>]</pre> | no |
| <a name="input_shielded_instance_config"></a> [shielded\_instance\_config](#input\_shielded\_instance\_config) | Shielded VM configuration for the instance (must set var.enabled\_shielded\_vm) | <pre>object({<br> enable_secure_boot = bool<br> enable_vtpm = bool<br> enable_integrity_monitoring = bool<br> })</pre> | <pre>{<br> "enable_integrity_monitoring": true,<br> "enable_secure_boot": true,<br> "enable_vtpm": true<br>}</pre> | no |
| <a name="input_spot"></a> [spot](#input\_spot) | Provision VMs using discounted Spot pricing, allowing for preemption | `bool` | `false` | no |
| <a name="input_subnetwork_self_link"></a> [subnetwork\_self\_link](#input\_subnetwork\_self\_link) | The self link of the subnetwork HTCondor execute points will join | `string` | `null` | no |
| <a name="input_target_size"></a> [target\_size](#input\_target\_size) | Initial size of the HTCondor execute point pool; set to null (default) to avoid Terraform management of size. | `number` | `null` | no |
Expand Down
2 changes: 1 addition & 1 deletion community/modules/compute/mig/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ No modules.
| <a name="input_name"></a> [name](#input\_name) | Name of the MIG. If not provided, will be generated from `var.deployment_name` | `string` | `null` | no |
| <a name="input_project_id"></a> [project\_id](#input\_project\_id) | Project in which the MIG will be created | `string` | n/a | yes |
| <a name="input_target_size"></a> [target\_size](#input\_target\_size) | Target number of instances in the MIG | `number` | `0` | no |
| <a name="input_versions"></a> [versions](#input\_versions) | Application versions managed by this instance group. Each version deals with a specific instance template | <pre>list(object({<br/> name = string<br/> instance_template = string<br/> target_size = optional(object({<br/> fixed = optional(number)<br/> percent = optional(number)<br/> }))<br/> }))</pre> | n/a | yes |
| <a name="input_versions"></a> [versions](#input\_versions) | Application versions managed by this instance group. Each version deals with a specific instance template | <pre>list(object({<br> name = string<br> instance_template = string<br> target_size = optional(object({<br> fixed = optional(number)<br> percent = optional(number)<br> }))<br> }))</pre> | n/a | yes |
| <a name="input_wait_for_instances"></a> [wait\_for\_instances](#input\_wait\_for\_instances) | Whether to wait for all instances to be created/updated before returning | `bool` | `false` | no |
| <a name="input_zone"></a> [zone](#input\_zone) | Compute Platform zone. Required, currently only zonal MIGs are supported | `string` | n/a | yes |

Expand Down
2 changes: 1 addition & 1 deletion community/modules/compute/notebook/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ No modules.
|------|-------------|------|---------|:--------:|
| <a name="input_deployment_name"></a> [deployment\_name](#input\_deployment\_name) | Name of the HPC deployment; used as part of name of the notebook. | `string` | n/a | yes |
| <a name="input_gcs_bucket_path"></a> [gcs\_bucket\_path](#input\_gcs\_bucket\_path) | Bucket name, can be provided from the google-cloud-storage module | `string` | `null` | no |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | Instance Image | `map(string)` | <pre>{<br/> "family": "tf-latest-cpu",<br/> "name": null,<br/> "project": "deeplearning-platform-release"<br/>}</pre> | no |
| <a name="input_instance_image"></a> [instance\_image](#input\_instance\_image) | Instance Image | `map(string)` | <pre>{<br> "family": "tf-latest-cpu",<br> "name": null,<br> "project": "deeplearning-platform-release"<br>}</pre> | no |
| <a name="input_labels"></a> [labels](#input\_labels) | Labels to add to the resource Key-value pairs. | `map(string)` | n/a | yes |
| <a name="input_machine_type"></a> [machine\_type](#input\_machine\_type) | The machine type to employ | `string` | n/a | yes |
| <a name="input_mount_runner"></a> [mount\_runner](#input\_mount\_runner) | mount content from the google-cloud-storage module | `map(string)` | n/a | yes |
Expand Down
Loading

0 comments on commit 625bd9c

Please sign in to comment.