Update references to HPC Toolkit to Cluster Toolkit #2829

Merged
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -8,4 +8,4 @@ Please take the following actions before submitting this pull request.
* Add or modify unit tests to cover code changes
* Ensure that unit test coverage remains above 80%
* Update all applicable documentation
-* Follow Cloud HPC Toolkit Contribution guidelines [#](https://goo.gle/hpc-toolkit-contributing)
+* Follow Cluster Toolkit Contribution guidelines [#](https://goo.gle/hpc-toolkit-contributing)
48 changes: 24 additions & 24 deletions README.md
@@ -1,21 +1,21 @@
-# Google HPC-Toolkit
+# Google Cluster Toolkit (formerly HPC Toolkit)

## Description

-HPC Toolkit is an open-source software offered by Google Cloud which makes it
-easy for customers to deploy HPC environments on Google Cloud.
+Cluster Toolkit is an open-source software offered by Google Cloud which makes it
+easy for customers to deploy AI/ML and HPC environments on Google Cloud.

-HPC Toolkit allows customers to deploy turnkey HPC environments (compute,
+Cluster Toolkit allows customers to deploy turnkey AI/ML and HPC environments (compute,
networking, storage, etc.) following Google Cloud best-practices, in a repeatable
-manner. The HPC Toolkit is designed to be highly customizable and extensible,
-and intends to address the HPC deployment needs of a broad range of customers.
+manner. The Cluster Toolkit is designed to be highly customizable and extensible,
+and intends to address the AI/ML and HPC deployment needs of a broad range of customers.

## Detailed documentation and examples

The Toolkit comes with a suite of [tutorials], [examples], and full
-documentation for a suite of [modules] that have been designed for HPC use cases.
+documentation for a suite of [modules] that have been designed for AI/ML and HPC use cases.
More information can be found on the
-[Google Cloud Docs](https://cloud.google.com/hpc-toolkit/docs/overview).
+[Google Cloud Docs](https://cloud.google.com/cluster-toolkit/docs/overview).

[tutorials]: docs/tutorials/README.md
[examples]: examples/README.md
@@ -24,8 +24,8 @@ More information can be found on the
## Quickstart

Running through the
-[quickstart tutorial](https://cloud.google.com/hpc-toolkit/docs/quickstarts/slurm-cluster)
-is the recommended path to get started with the HPC Toolkit.
+[quickstart tutorial](https://cloud.google.com/cluster-toolkit/docs/quickstarts/slurm-cluster)
+is the recommended path to get started with the Cluster Toolkit.

---

@@ -42,11 +42,11 @@ make

> **_NOTE:_** You may need to [install dependencies](#dependencies) first.

-## HPC Toolkit Components
+## Cluster Toolkit Components

-Learn about the components that make up the HPC Toolkit and more on how it works
+Learn about the components that make up the Cluster Toolkit and more on how it works
on the
-[Google Cloud Docs Product Overview](https://cloud.google.com/hpc-toolkit/docs/overview#components).
+[Google Cloud Docs Product Overview](https://cloud.google.com/cluster-toolkit/docs/overview#components).

## GCP Credentials

@@ -105,7 +105,7 @@ minutes. Please consider it only for blueprints that are quickly deployed.

### Standard Images

-The HPC Toolkit officially supports the following VM images:
+The Cluster Toolkit officially supports the following VM images:

* HPC CentOS 7
* HPC Rocky Linux 8
@@ -119,37 +119,37 @@ For more information on these and other images, see

> **_Warning:_** Slurm Terraform modules cannot be directly used on the standard OS images. They must be used in combination with images built for the versioned release of the Terraform module.

-The HPC Toolkit provides modules and examples for implementing pre-built and custom Slurm VM images, see [Slurm on GCP](docs/vm-images.md#slurm-on-gcp)
+The Cluster Toolkit provides modules and examples for implementing pre-built and custom Slurm VM images, see [Slurm on GCP](docs/vm-images.md#slurm-on-gcp)

## Blueprint Validation

The Toolkit contains "validator" functions that perform basic tests of the
-blueprint to ensure that deployment variables are valid and that the HPC
+blueprint to ensure that deployment variables are valid and that the AI/ML and HPC
environment can be provisioned in your Google Cloud project. Further information
can be found in [dedicated documentation](docs/blueprint-validation.md).

## Enable GCP APIs

In a new GCP project there are several APIs that must be enabled to deploy your
-HPC cluster. These will be caught when you perform `terraform apply` but you can
+cluster. These will be caught when you perform `terraform apply` but you can
save time by enabling them upfront.

See
-[Google Cloud Docs](https://cloud.google.com/hpc-toolkit/docs/setup/configure-environment#enable-apis)
+[Google Cloud Docs](https://cloud.google.com/cluster-toolkit/docs/setup/configure-environment#enable-apis)
for instructions.
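Enabling the APIs up front can be scripted. A minimal sketch: the service list below is illustrative (drawn from typical HPC blueprints), not the authoritative set, and `MY_PROJECT` is a placeholder — confirm the required services against the Cloud docs linked above. The loop prints the commands as a dry run rather than executing them:

```shell
#!/usr/bin/env bash
# Illustrative services only -- verify the required set for your blueprint.
apis=(
  compute.googleapis.com
  file.googleapis.com
  storage.googleapis.com
  serviceusage.googleapis.com
)
for api in "${apis[@]}"; do
  # Dry run: print each command instead of executing it.
  echo "gcloud services enable ${api} --project=MY_PROJECT"
done
```

Drop the `echo` to actually enable each service once the list matches your blueprint's requirements.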

## GCP Quotas

-You may need to request additional quota to be able to deploy and use your HPC
+You may need to request additional quota to be able to deploy and use your
cluster.

See
-[Google Cloud Docs](https://cloud.google.com/hpc-toolkit/docs/setup/hpc-blueprint#request-quota)
+[Google Cloud Docs](https://cloud.google.com/cluster-toolkit/docs/setup/hpc-blueprint#request-quota)
for more information.

## Billing Reports

-You can view your billing reports for your HPC cluster on the
+You can view your billing reports for your cluster on the
[Cloud Billing Reports](https://cloud.google.com/billing/docs/how-to/reports)
page. ​​To view the Cloud Billing reports for your Cloud Billing account,
including viewing the cost information for all of the Cloud projects that are
@@ -279,7 +279,7 @@ hpc-slurm/
## Dependencies

See
-[Cloud Docs on Installing Dependencies](https://cloud.google.com/hpc-toolkit/docs/setup/install-dependencies).
+[Cloud Docs on Installing Dependencies](https://cloud.google.com/cluster-toolkit/docs/setup/install-dependencies).

### Notes on Packer

@@ -303,12 +303,12 @@ applied at boot-time.
## Development

The following setup is in addition to the [dependencies](#dependencies) needed
-to build and run HPC-Toolkit.
+to build and run Cluster-Toolkit.

Please use the `pre-commit` hooks [configured](./.pre-commit-config.yaml) in
this repository to ensure that all changes are validated, tested and properly
documented before pushing code changes. The pre-commits configured
-in the HPC Toolkit have a set of dependencies that need to be installed before
+in the Cluster Toolkit have a set of dependencies that need to be installed before
successfully passing.

Follow these steps to install and setup pre-commit in your cloned repository:
12 changes: 6 additions & 6 deletions cmd/README.md
@@ -1,8 +1,8 @@
-# HPC Toolkit Commands
+# Cluster Toolkit (formerly HPC Toolkit) Commands

## gcluster

-`gcluster` is the tool used by Cloud HPC Toolkit to create deployments of HPC
+`gcluster` is the tool used by Cluster Toolkit to create deployments of AI/ML and HPC
clusters, also referred to as the gHPC Engine.

### Usage - gcluster
@@ -14,7 +14,7 @@ gcluster [SUBCOMMAND]

### Subcommands - gcluster

-* [`deploy`](#gcluster-deploy): Deploy an HPC cluster on Google Cloud
+* [`deploy`](#gcluster-deploy): Deploy an AI/ML or HPC cluster on Google Cloud
* [`create`](#gcluster-create): Create a new deployment
* [`expand`](#gcluster-expand): Expand the blueprint without creating a new deployment
* [`completion`](#gcluster-completion): Generate completion script
@@ -33,7 +33,7 @@ gcluster --version

## gcluster deploy

-`gcluster deploy` deploys an HPC cluster on Google Cloud using the deployment directory created by `gcluster create` or creates one from supplied blueprint file.
+`gcluster deploy` deploys a cluster on Google Cloud using the deployment directory created by `gcluster create` or creates one from a supplied blueprint file.

### Usage - deploy

@@ -43,7 +43,7 @@ gcluster deploy (<DEPLOYMENT_DIRECTORY> | <BLUEPRINT_FILE>) [flags]

## gcluster create

-`gcluster create` creates a deployment directory. This deployment directory is used to deploy an HPC cluster on Google Cloud.
+`gcluster create` creates a deployment directory. This deployment directory is used to deploy a cluster on Google Cloud.

### Usage - create

@@ -59,7 +59,7 @@ gcluster create BLUEPRINT_FILE [FLAGS]

* `--backend-config strings`: Comma-separated list of name=value variables to set Terraform backend configuration. Can be used multiple times.
* `-h, --help`: display detailed help for the create command.
-* `-o, --out string`: sets the output directory where the HPC deployment directory will be created.
+* `-o, --out string`: sets the output directory where the AI/ML or HPC deployment directory will be created.
* `-w, --overwrite-deployment`: If specified, an existing deployment directory is overwritten by the new deployment.

* Terraform state IS preserved.
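Put together, the flags above compose like this. Because `gcluster` may not be on PATH, the sketch builds and prints the command line as a dry run; the blueprint name, output directory, and backend bucket are all hypothetical values:

```shell
# Hypothetical values -- substitute your own blueprint and paths.
blueprint="hpc-slurm.yaml"
outdir="./deployments"

# Compose a create invocation using the documented flags.
cmd="gcluster create ${blueprint} -o ${outdir} --backend-config bucket=my-tf-state"
echo "${cmd}"

# Re-running with -w overwrites an existing deployment directory
# (Terraform state is preserved, per the flag description above):
echo "${cmd} -w"
```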
4 changes: 2 additions & 2 deletions cmd/root.go
@@ -125,7 +125,7 @@ func checkGitHashMismatch() (mismatch bool, branch, hash, dir string) {

// hpcToolkitRepo will find the path of the directory containing the hpc-toolkit
// starting with the working directory and evaluating the parent directories until
-// the toolkit repository is found. If the HPC Toolkit repository is not found by
+// the toolkit repository is found. If the Cluster Toolkit repository is not found by
// traversing the path, then the executable directory is checked.
func hpcToolkitRepo() (repo *git.Repository, dir string, err error) {
// first look in the working directory and it's parents until a git repo is
@@ -174,7 +174,7 @@ }
}

// isHpcToolkitRepo will verify that the found git repository has a commit with
-// the known hash of the initial commit of the HPC Toolkit repository
+// the known hash of the initial commit of the Cluster Toolkit repository
func isHpcToolkitRepo(r git.Repository) bool {
h := plumbing.NewHash(GitInitialHash)
_, err := r.CommitObject(h)
2 changes: 1 addition & 1 deletion community/examples/AMD/README.md
@@ -1,4 +1,4 @@
-# AMD solutions for the HPC Toolkit
+# AMD solutions for the Cluster Toolkit (formerly HPC Toolkit)

> [!NOTE]
> This document uses Slurm-GCP v6. If you want to use Slurm-GCP v5 version you
2 changes: 1 addition & 1 deletion community/examples/flux-framework/README.md
@@ -9,7 +9,7 @@ The cluster includes
- A login node
- Four compute nodes each of which is an instance of the c2-standard-16 machine type

-> **_NOTE:_** prior to running this HPC Toolkit example the [Flux Framework GCP Images](https://github.com/GoogleCloudPlatform/scientific-computing-examples/tree/main/fluxfw-gcp/img#flux-framework-gcp-images)
+> **_NOTE:_** prior to running this Cluster Toolkit example the [Flux Framework GCP Images](https://github.com/GoogleCloudPlatform/scientific-computing-examples/tree/main/fluxfw-gcp/img#flux-framework-gcp-images)
> must be created in your project.

### Initial Setup for flux-framework Cluster
4 changes: 2 additions & 2 deletions community/examples/intel/README.md
@@ -1,12 +1,12 @@
-# Intel Solutions for the HPC Toolkit
+# Intel Solutions for the Cluster Toolkit (formerly HPC Toolkit)

> **_NOTE:_** The [hpc-slurm-daos.yaml](hpc-slurm-daos.yaml) will not be compatible
> for newer version of slurm-gcp v6.

<!-- TOC generated with: md_toc github community/examples/intel/README.md | sed -e "s/\s-\s/ * /"-->
<!-- TOC -->

-- [Intel Solutions for the HPC Toolkit](#intel-solutions-for-the-hpc-toolkit)
+- [Intel Solutions for the Cluster Toolkit](#intel-solutions-for-the-cluster-toolkit)
- [DAOS Cluster](#daos-cluster)
- [Initial Setup for DAOS Cluster](#initial-setup-for-daos-cluster)
- [Deploy the DAOS Cluster](#deploy-the-daos-cluster)
2 changes: 1 addition & 1 deletion community/examples/omnia-cluster.yaml
@@ -14,7 +14,7 @@

---

-# WARNING: this example has been deprecated as of v1.28.0 of the HPC Toolkit
+# WARNING: this example has been deprecated as of v1.28.0 of the Cluster Toolkit

blueprint_name: omnia-cluster

4 changes: 2 additions & 2 deletions community/front-end/ofe/README.md
@@ -1,7 +1,7 @@
-# Google HPC Toolkit Open Front End
+# Google Cluster Toolkit Open Front End

This is a web front-end for HPC applications on GCP. It delegates to the Cloud
-HPC Toolkit to create cloud resources for HPC clusters. Through the convenience
+Cluster Toolkit to create cloud resources for HPC clusters. Through the convenience
of a web interface, system administrators can manage the life cycles of HPC
clusters and install applications; users can prepare & submit HPC jobs and run
benchmarks. This web application is built upon the Django framework.
4 changes: 2 additions & 2 deletions community/front-end/ofe/cli/ghpcfe.py
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

-"""The Command Line Interface to access the HPC Toolkit FrontEnd"""
+"""The Command Line Interface to access the Cluster Toolkit FrontEnd"""

import click
import requests
@@ -65,7 +65,7 @@ def config():
"""
print("Configuration file will be written at $HOME/.ghpcfe/config")
print()
-server = input("Enter the URL of the HPC Toolkit FrontEnd website: ")
+server = input("Enter the URL of the Cluster Toolkit FrontEnd website: ")
try:
requests.get(server, timeout=10)
# pylint: disable=unused-variable
8 changes: 4 additions & 4 deletions community/front-end/ofe/deploy.sh
@@ -15,7 +15,7 @@

################################################################################
# #
-# HPC Toolkit FrontEnd deployment script #
+# Cluster Toolkit FrontEnd deployment script #
# #
################################################################################
#
@@ -332,7 +332,7 @@ check_account() {
echo ""
echo "Warning: account is not Owner or Editor of project"
echo " Please ensure account has correct permissions before proceeding."
-echo " See HPC Toolkit FrontEnd Administrator's Guide for details."
+echo " See Cluster Toolkit FrontEnd Administrator's Guide for details."
echo ""
case $(ask " Proceed [y/N] ") in
[Yy]*) ;;
@@ -344,7 +344,7 @@ fi
fi

# TODO: perform more extensive check the account has all required roles.
-# - these could change over, depending back-end GCP / HPC Toolkit
+# - these could change over, depending back-end GCP / Cluster Toolkit
# requirements, so would require maintaining.
}

@@ -979,7 +979,7 @@ cat <<HEADER

--------------------------------------------------------------------------------

-HPC Toolkit FrontEnd
+Cluster Toolkit FrontEnd

--------------------------------------------------------------------------------

2 changes: 1 addition & 1 deletion community/front-end/ofe/docs/Applications.md
@@ -1,4 +1,4 @@
-# HPC Toolkit FrontEnd - Application Installation Guide
+# Cluster Toolkit FrontEnd - Application Installation Guide

<!--
0 1 2 3 4 5 6 7 8
2 changes: 1 addition & 1 deletion community/front-end/ofe/docs/README.md
@@ -1,4 +1,4 @@
-## HPC Toolkit FrontEnd Documentations
+## Cluster Toolkit FrontEnd Documentation

- [Administrator's Guide](admin_guide.md)
- [User Guide](user_guide.md)
2 changes: 1 addition & 1 deletion community/front-end/ofe/docs/WorkbenchAdmin.md
@@ -50,7 +50,7 @@ An administrator can configure any type of machine type that is
available. Users with the "Normal User" class will only be able to
create workbenches using the preset machine type configurations while
users with the "Viewer" class will not be able to create workbenches
-for themselves. The HPC toolkit frontend comes with some
+for themselves. The Cluster Toolkit Frontend comes with some
pre-configured workbench presets:
- Small - 1x core with 3840 Memory (n1-standard-1)
- Medium - 2x cores with 7680 Memory (n1-standard-2)
6 changes: 3 additions & 3 deletions community/front-end/ofe/docs/WorkbenchUser.md
@@ -1,4 +1,4 @@
-# HPC Toolkit FrontEnd - Workbench User Guide
+# Cluster Toolkit FrontEnd - Workbench User Guide
<!--
0 1 2 3 4 5 6 7 8
1234567890123456789012345678901234567890123456789012345678901234567890234567890
@@ -40,15 +40,15 @@ User, Machine Type, Boot disk type, Boot Disk Capacity and image family.
- Boot disk type - The type of disk storage used for the workbench boot disk
- Boot disk capacity - The amount of disk storage used for the workbench boot
disk
-- Image family - Currently the HPC Toolkit FrontEnd supports Base Python3,
+- Image family - Currently the Cluster Toolkit FrontEnd supports Base Python3,
Tensorflow, PyTorch and R images

## Add storage

The second part of the configuration is to add any desired shared file storage.
Once the initial configuration is saved an additional configuration section
will be displayed showing the options to mount any shared file storage known
-about by the HPC Toolkit FrontEnd.
+about by the Cluster Toolkit FrontEnd.

![workbench step 2](images/Workbench_userguide/create2.png)

10 changes: 5 additions & 5 deletions community/front-end/ofe/docs/admin_guide.md
@@ -1,11 +1,11 @@
-# HPC Toolkit FrontEnd - Administrator’s Guide
+# Cluster Toolkit FrontEnd - Administrator’s Guide

<!--
0 1 2 3 4 5 6 7 8
1234567890123456789012345678901234567890123456789012345678901234567890234567890
-->

-This document is for administrators of the HPC Toolkit FrontEnd (TKFE). An
+This document is for administrators of the Cluster Toolkit FrontEnd (TKFE). An
administrator can deploy the TKFE portal, manage the lifecycle of HPC clusters,
set up networking and storage resources that support clusters, install
applications. and manage user access. Normal HPC users should refer to the
@@ -23,7 +23,7 @@ administrators, additional Django superusers can be created from the Admin site
within TKFE, once it is deployed and running.

The TFKE web application server uses the
-[Cloud HPC Toolkit](https://github.com/GoogleCloudPlatform/hpc-toolkit) to
+[Cluster Toolkit](https://github.com/GoogleCloudPlatform/hpc-toolkit) to
provision resources for networks, filesystems and clusters, using a service
account that has its credentials registered to TKFE. The service account is
used for access management and billing.
@@ -308,7 +308,7 @@ external filesystem located elsewhere on GCP.
## Cluster Management

HPC clusters can be created after setting up the hosting VPC and any
-additional filesystems. The HPC Toolkit FrontEnd can manage the whole life
+additional filesystems. The Cluster Toolkit FrontEnd can manage the whole life
cycles of clusters. Click the *Clusters* item in the main menu to list all
existing clusters.

@@ -496,7 +496,7 @@ Cloud resource deployment log files (from Terraform) are typically shown via
the FrontEnd web site. If those logs are not being shown, they can be found on
the service machine under
`/opt/gcluster/hpc-toolkit/frontend/(clusters|fs|vpc)/...`.
-HPC Toolkit log files will also be found in those directories. The Terraform
+Cluster Toolkit log files will also be found in those directories. The Terraform
log files and status files will be down a few directories, based off of the
Cluster Number, Deployment ID, and Terraform directory.
