vm-images.md

VM Images

For information on customizing VM images with extra software and configuration settings, see Building Images.

Please see the blueprint catalog for examples.

Specifying Blueprint Images

Instance Images

Note

This information applies to most source modules, but some modules have their own image specification. Please read the documentation for any module you use.

When a Cluster Toolkit blueprint points to a predefined source module (e.g. community/modules/compute/schedmd-slurm-gcp-v5-node-group), the module generally has a default image defined. To override this default image, specify the instance_image setting in the YAML blueprint, either within the specific module definition or in the global variables. The instance_image setting is defined by three parameters within the blueprint:

instance_image:
  project: centos-cloud
  family: centos-7         # If family is defined, omit name
  name: centos-7-v20230809 # If name is defined, omit family

The project setting defines the project where the image will be found. This is either a known project where HPC images are hosted (e.g. cloud-hpc-image-public, schedmd-slurm-public, etc.) or a private project owned by you or your team.

The family setting defines a group of images built with the same label, and generally with some underlying similarities, usually an OS version or a software version installed on top of the OS. When this is specified, instances will be created with the latest image within the family. This will keep software more up to date, but will be less deterministic.

The name setting defines a specific static image. While these images are less likely to be modified, it cannot be guaranteed. It is possible that an image publisher may choose to delete and re-publish images with the same name.
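As an illustrative sketch of the two mutually exclusive forms, the CentOS values from the example above can be written as two alternatives. Only one of the two blocks would appear in a given module or in the global variables; the `---` separator is used here only to keep the alternatives apart.

```yaml
# Option A: follow the latest image in a family.
# Stays more up to date, but is less deterministic.
instance_image:
  project: centos-cloud
  family: centos-7
---
# Option B: reference one specific image by name.
# More reproducible, though the publisher could still re-publish under the same name.
instance_image:
  project: centos-cloud
  name: centos-7-v20230809
```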

Note

The name setting is not always available, depending on the source module. In these cases, please fall back to the family setting.
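The two placements mentioned earlier (global variables versus a specific module definition) might look like the following blueprint fragment. This is a minimal sketch, not a complete tested blueprint: the blueprint name, module ids, project, region, and zone are placeholders, and it assumes the modules/network/vpc and modules/compute/vm-instance modules, which accept instance_image. It also assumes that a value set in a module's settings takes precedence over the global variable of the same name.

```yaml
blueprint_name: image-example        # placeholder

vars:
  project_id: my-project             # placeholder
  deployment_name: image-example     # placeholder
  region: us-central1                # placeholder
  zone: us-central1-a                # placeholder
  # Global setting: used by modules that accept instance_image
  # and do not set it themselves.
  instance_image:
    family: hpc-rocky-linux-8
    project: cloud-hpc-image-public

deployment_groups:
- group: primary
  modules:
  - id: network1
    source: modules/network/vpc

  - id: workstation
    source: modules/compute/vm-instance
    use: [network1]
    settings:
      # Module-level setting: applies to this module only.
      instance_image:
        family: debian-11
        project: debian-cloud
```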

The following is a list of commonly used base images that can be used in a blueprint:

    settings:

      instance_image:
        family: hpc-rocky-linux-8
        project: cloud-hpc-image-public

      instance_image:
        family: debian-11
        project: debian-cloud

      instance_image:
        family: ubuntu-2004-lts
        project: ubuntu-os-cloud

Pinning Specific Images

Users may want to guarantee that an image has not changed across multiple HPC deployments. One way to guarantee that the same image is used is to either create a custom image (Image Building) or copy an image to a personal or team project and reference that copy.

The following command will copy a specified image from a source project to your own:

# Copy image from one project to another
gcloud compute images create <new_image_name> --project=<your project> --source-image=<source_image_name> --source-image-project=<source_project>

Alternatively, you can specify a family of images to pull from (i.e. use --source-image-family instead of --source-image). See more on gcloud compute images create.

Once the image has been created or copied, specify your own project and the new image name in the instance_image field discussed in Instance Images.
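For instance, a module could reference the copied image as sketched below. Both values are placeholders standing in for the `<your project>` and `<new_image_name>` used in the copy command; substitute your own.

```yaml
settings:
  instance_image:
    project: my-team-project      # placeholder: project the image was copied into
    name: my-pinned-hpc-image     # placeholder: name given to the copied image
```

Because the image lives in a project you control, it only changes if you or your team replace it.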

Cluster Toolkit Supported Images

HPC CentOS 7

The Cluster Toolkit has officially supported the HPC CentOS 7 VM Image as the primary VM image for HPC workloads on Google Cloud since its release. Because the HPC CentOS 7 VM Image comes pre-tuned for optimal performance on typical HPC workloads, it is the default VM image in our modules, unless there is a specific requirement for a different OS distribution.

HPC Rocky Linux 8

HPC Rocky Linux 8 is planned to become the primary supported VM image for HPC workloads on Google Cloud from 2024.

Debian 11

The Cluster Toolkit officially supports Debian 11 based VM images in the majority of our modules, with a couple of exceptions.

Ubuntu 20.04 LTS

The Cluster Toolkit officially supports Ubuntu 20.04 LTS based VM images in the majority of our modules, with a couple of exceptions.

Windows

See building Windows images for a description of our support for Windows images.

Supported features

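| Deployment Type/Scheduler | Feature | CentOS 7 | Debian 11 | Rocky Linux 8 | Ubuntu 20.04 |
| --- | --- | --- | --- | --- | --- |
| Cloud Batch | Lustre | | | | |
| | Shared filestore | | | | |
| | Startup script | | | | |
| Slurm | Chrome Remote Desktop | | | | |
| | Lustre | | | | |
| | Shared filestore | | | | |
| | Startup script | | | | |
| VM Instance | Chrome Remote Desktop * | | | | |
| | Lustre | | | | |
| | Shared filestore | | | | |
| | Startup script | | | | |
| HTCondor | | | | | |
| Omnia | | | | | |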

* Chrome Remote Desktop does not support Ubuntu 20.04, but it does support Ubuntu 22.04.

Other Images

The Cluster Toolkit strives to provide flexibility wherever possible. It is possible to set a VM image in many Cluster Toolkit modules. While we do not officially support images not listed here, other public and custom images should work with the majority of modules with or without further customization, such as custom startup-scripts.

Slurm on GCP

Publicly Published Slurm Images

SchedMD publishes "Slurm on GCP" public images, which are documented here. This documentation covers which images are available and what software is installed on them.

Slurm images are compatible with the Terraform and Packer modules at the level of minor version releases. For example, images built for version 5.8 are compatible with all Terraform modules from version 5.8.0 up to, but not including, 5.9.0. To see which version of the Slurm modules your local copy of the Toolkit uses, look for the source line in community/modules/compute/schedmd-slurm-gcp-v5-partition/main.tf.

The latest GitHub release supports these images.

Custom Slurm Images

Note

Set instance_image_custom to true in the blueprint to let Terraform know that you are aware you are using a custom image.

See: ML Slurm and Image Builder
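A node group that uses a custom image might be configured as in the sketch below. It assumes the schedmd-slurm-gcp-v5-node-group module named earlier; the module id, project, and family are placeholders for an image you have built yourself.

```yaml
- id: compute_node_group            # placeholder id
  source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
  settings:
    instance_image:
      project: my-team-project      # placeholder: project hosting the custom image
      family: my-slurm-image-family # placeholder: family of the custom image
    # Acknowledge that a custom (non-published) image is in use.
    instance_image_custom: true
```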

Warning

When building custom images, the Terraform and Packer modules must share the same version.

These instructions apply to the following modules: