For information on customizing VM images with extra software and configuration settings, see Building Images.
Please see the blueprint catalog for examples.
Note
This information is applicable for most source modules, but there are some modules that have their own image specification. Please read the documentation for any module utilized.
When a Cluster Toolkit blueprint points to a predefined source module (e.g.
community/modules/compute/schedmd-slurm-gcp-v5-node-group), the module
generally has a default image defined. To override this default image, a user
may specify the instance_image setting in the YAML blueprint, within either the
specific module definition or the global variables. The instance_image setting
is defined by three parameters within the blueprint:
instance_image:
  project: centos-cloud
  family: centos-7 # If family is defined, omit name
  name: centos-7-v20230809 # If name is defined, omit family
The project setting defines the project in which the image is found. It is set
either to a known project where HPC images are hosted (e.g.
cloud-hpc-image-public or schedmd-slurm-public) or to a private project owned
by you or your team.
The family setting defines a group of images built with the same label, and
generally with some underlying similarities, usually an OS version or a software
version installed on top of the OS. When this is specified, instances will be
created with the latest image within the family. This keeps software more up to
date, but is less deterministic.
The name setting defines a specific static image. While these images are less
likely to be modified, this cannot be guaranteed: an image publisher may choose
to delete and re-publish an image under the same name.
Note
The name setting is not always available, depending on the source module.
In these cases, fall back to the family setting.
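Because a family resolves to whichever image is currently the latest, it can help to check what that is before deciding whether to pin a name. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. The function name latest_image_in_family and the GCLOUD override variable are illustrative and not part of the Toolkit.

```shell
#!/usr/bin/env bash
# Sketch: print the name of the latest image in an image family.
# GCLOUD is a hypothetical override, e.g. GCLOUD=echo prints the
# arguments instead of calling the real CLI (a simple dry run).
latest_image_in_family() {
  local project="$1" family="$2"
  "${GCLOUD:-gcloud}" compute images describe-from-family "$family" \
    --project="$project" \
    --format="value(name)"
}

# Example (requires gcloud and access to the project):
#   latest_image_in_family cloud-hpc-image-public hpc-rocky-linux-8
```

The printed name could then be used as the name value of instance_image, provided the module in question accepts name.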
The following is a list of commonly used base images that can be used in a blueprint:
# HPC Rocky Linux 8 (shown inside a module's settings block)
settings:
  instance_image:
    family: hpc-rocky-linux-8
    project: cloud-hpc-image-public

# Debian 11
instance_image:
  family: debian-11
  project: debian-cloud

# Ubuntu 20.04 LTS
instance_image:
  family: ubuntu-2004-lts
  project: ubuntu-os-cloud
Users may want to guarantee that an image has not changed across multiple HPC deployments. One way to do so is either to create a custom image (see Image Building) or to copy an image to a personal or team project and reference that copy.
The following command will copy a specified image from a source project to your own:
# Copy image from one project to another
gcloud compute images create <new_image_name> --project=<your project> --source-image=<source_image_name> --source-image-project=<source_project>
Alternatively, you can specify an image family to copy from (i.e.
--source-image-family instead of --source-image). For more, see
gcloud compute images create.
Once the image has been created or copied, the user can specify their own
project and the new image name in the instance_image field discussed in
Instance Images.
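The copy-then-reference workflow described above can be sketched as a small script. This is an illustration only: the function names copy_image and print_instance_image, the GCLOUD override variable, and all argument values are hypothetical placeholders. The gcloud flags are the ones shown in the command above.

```shell
#!/usr/bin/env bash
# Sketch: copy an image into your own project, then print the matching
# instance_image block for a blueprint.
# GCLOUD is a hypothetical override (GCLOUD=echo gives a dry run).
copy_image() {
  local src_project="$1" src_image="$2" dst_project="$3" dst_image="$4"
  "${GCLOUD:-gcloud}" compute images create "$dst_image" \
    --project="$dst_project" \
    --source-image="$src_image" \
    --source-image-project="$src_project"
}

# Emit a YAML fragment that pins the copied image by project and name.
print_instance_image() {
  local project="$1" name="$2"
  printf 'instance_image:\n  project: %s\n  name: %s\n' "$project" "$name"
}

# Example (placeholder values; requires gcloud and suitable permissions):
#   copy_image <source_project> <source_image_name> <your_project> <new_image_name>
#   print_instance_image <your_project> <new_image_name>
```

Pinning by name in a project you control means the image changes only when you change it.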
The Cluster Toolkit has officially supported the HPC CentOS 7 VM Image as the primary VM image for HPC workloads on Google Cloud since its release. Because the HPC CentOS 7 VM Image comes pre-tuned for optimal performance on typical HPC workloads, it is the default VM image in our modules, unless there is a specific requirement for a different OS distribution.
HPC Rocky Linux 8 is planned to become the primary supported VM image for HPC workloads on Google Cloud from 2024.
The Cluster Toolkit officially supports Debian 11 based VM images in the majority of our modules, with a couple of exceptions.
The Cluster Toolkit officially supports Ubuntu 20.04 LTS based VM images in the majority of our modules, with a couple of exceptions.
See building Windows images for a description of our support for Windows images.
Deployment Type/Scheduler | Feature | CentOS 7 | Debian 11 | Rocky Linux 8 | Ubuntu 20.04 | |
---|---|---|---|---|---|---|
Cloud Batch | Lustre | ✓ | ✓ | |||
Shared filestore | ✓ | ✓ | ✓ | ✓ | ||
Startup script | ✓ | ✓ | ✓ | ✓ | ||
Slurm | Chrome Remote Desktop | ✓ | ||||
Lustre | ✓ | ✓ | ||||
Shared filestore | ✓ | ✓ | ✓ | ✓ | ||
Startup script | ✓ | ✓ | ✓ | ✓ | ||
VM Instance | Chrome Remote Desktop | ✓ | * | |||
Lustre | ✓ | ✓ | ✓ | |||
Shared filestore | ✓ | ✓ | ✓ | ✓ | ||
Startup script | ✓ | ✓ | ✓ | ✓ | ||
HTCondor | ✓ | ✓ | ||||
Omnia | ✓ |
* Chrome Remote Desktop does not support Ubuntu 20.04, but it does support Ubuntu 22.04.
The Cluster Toolkit strives to provide flexibility wherever possible. It is possible to set a VM image in many Cluster Toolkit modules. While we do not officially support images not listed here, other public and custom images should work with the majority of modules with or without further customization, such as custom startup-scripts.
SchedMD publishes "Slurm on GCP" public images, which are documented here. This documentation covers which images are available and what software is installed on them.
Slurm images are compatible with Terraform and Packer modules of the same
minor version. For example, images built for version 5.8 are compatible with
all Terraform modules from 5.8.0 up to, but not including, 5.9.0. The version
of the Slurm modules used by your local copy of the Toolkit can be inspected by
looking for the source line in
community/modules/compute/schedmd-slurm-gcp-v5-partition/main.tf.
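The minor-version rule can be checked mechanically. The sketch below assumes plain dotted numeric versions such as 5.8 or 5.8.3 (no leading "v"). The helper names minor_of and versions_compatible are illustrative and not part of the Toolkit.

```shell
#!/usr/bin/env bash
# Sketch: compare an image version and a module version by major.minor.

# Keep only the first two dot-separated fields, e.g. 5.8.3 -> 5.8
minor_of() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeeds (exit 0) when both versions share the same major.minor prefix.
versions_compatible() {
  [ "$(minor_of "$1")" = "$(minor_of "$2")" ]
}

# To find the module version, show the source line(s) from the root of
# your Toolkit checkout:
#   grep -n 'source' community/modules/compute/schedmd-slurm-gcp-v5-partition/main.tf
#
# Example:
#   versions_compatible 5.8 5.8.3 && echo "compatible"
```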
The latest GitHub release supports these images.
Note
Set instance_image_custom to true in the blueprint to let Terraform know that
you are intentionally using a custom image.
See: ML Slurm and Image Builder
Warning
When building custom images, the Terraform and Packer modules must share the same version.
These instructions apply to the following modules: