Update references to community resources
Corrects paths to community resources, adds a README for
community/resources, and updates resources/README.md to remove community
resources.
heyealex committed Apr 26, 2022
1 parent a2af9c7 commit 8ae40e6
Showing 36 changed files with 148 additions and 201 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -360,8 +360,9 @@ resume.py ERROR: ... "Quota 'C2_CPUS' exceeded. Limit: 300.0 in region europe-we

The solution here is to [request more of the specified quota](#gcp-quotas),
`C2 CPUs` in the example above. Alternatively, you could switch the partition's
[machine_type](https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/resources/third-party/compute/SchedMD-slurm-on-gcp-partition#input_machine_type)
, to one which has sufficient quota.
[machine type][partition-machine-type] to one that has sufficient quota.
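
For example, the partition definition could be pointed at a machine family with
available quota. A minimal sketch (the resource `id`, `partition_name`, and the
replacement machine type here are only illustrative):

```yaml
- source: ./community/resources/compute/SchedMD-slurm-on-gcp-partition
  kind: terraform
  id: compute_partition
  settings:
    partition_name: compute
    machine_type: n2-standard-16  # replaces c2-standard-60 with a family that has free quota
```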

[partition-machine-type]: community/resources/compute/SchedMD-slurm-on-gcp-partition/README.md#input_machine_type

#### Placement Groups

@@ -379,10 +380,11 @@ $ cat /var/log/slurm/resume.log
resume.py ERROR: group operation failed: Requested minimum count of 6 VMs could not be created.
```

One way to resolve this is to set
[enable_placement](https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/resources/third-party/compute/SchedMD-slurm-on-gcp-partition#input_enable_placement)
One way to resolve this is to set [enable_placement][partition-enable-placement]
to `false` on the partition in question.
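
A minimal sketch of such a partition definition (the resource `id` and the other
settings are placeholders):

```yaml
- source: ./community/resources/compute/SchedMD-slurm-on-gcp-partition
  kind: terraform
  id: compute_partition
  settings:
    partition_name: compute
    machine_type: c2-standard-60
    enable_placement: false  # do not request a placement group for this partition
```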

[partition-enable-placement]: https://github.com/GoogleCloudPlatform/hpc-toolkit/tree/main/community/resources/compute/SchedMD-slurm-on-gcp-partition#input_enable_placement

### Terraform Deployment

When `terraform apply` fails, Terraform generally provides a useful error
114 changes: 0 additions & 114 deletions community/examples/README.md
@@ -31,76 +31,6 @@ terraform_backend_defaults:
## Config Descriptions

### hpc-cluster-small.yaml

Creates a basic auto-scaling SLURM cluster with mostly default settings. The
blueprint also creates a new VPC network, and a filestore instance mounted to
`/home`.

There are 2 partitions in this example: `debug` and `compute`. The `debug`
partition uses `n2-standard-2` VMs, which should work out of the box without
needing to request additional quota. The purpose of the `debug` partition is to
make sure that first time users are not immediately blocked by quota
limitations.

#### Compute Partition

There is a `compute` partition that achieves higher performance. Any
performance analysis should be done on the `compute` partition. By default it
uses `c2-standard-60` VMs with placement groups enabled. You may need to request
additional quota for `C2 CPUs` in the region you are deploying in. You can
select the compute partition using the `srun -p compute` argument.

Quota required for this example:

* Cloud Filestore API: Basic SSD (Premium) capacity (GB) per region: **2660 GB**
* Compute Engine API: Persistent Disk SSD (GB): **~10 GB**
* Compute Engine API: N2 CPUs: **12**
* Compute Engine API: C2 CPUs: **60/node** up to 1200 - _only needed for
`compute` partition_
* Compute Engine API: Affinity Groups: **one for each job in parallel** - _only
needed for `compute` partition_
* Compute Engine API: Resource policies: **one for each job in parallel** -
_only needed for `compute` partition_

### hpc-cluster-high-io.yaml

Creates a slurm cluster with tiered file systems for higher performance. It
connects to the default VPC of the project and creates two partitions and a
login node.

File systems:

* The homefs is mounted at `/home` and is a default "PREMIUM" tier filestore with
  2.5TiB of capacity.
* The projectsfs is mounted at `/projects` and is a high scale SSD filestore
instance with 10TiB of capacity.
* The scratchfs is mounted at `/scratch` and is a
[DDN Exascaler Lustre](../resources/third-party/file-system/DDN-EXAScaler/README.md)
file system designed for high IO performance. The capacity is ~10TiB.

There are two partitions in this example: `low_cost` and `compute`. The
`low_cost` partition uses `n2-standard-4` VMs. This partition can be used for
debugging and workloads that do not require high performance.

Similar to the small example, there is a
[compute partition](#compute-partition) that should be used for any performance
analysis.

Quota required for this example:

* Cloud Filestore API: Basic SSD (Premium) capacity (GB) per region: **2660 GB**
* Cloud Filestore API: High Scale SSD capacity (GB) per region: **10240 GiB** - _min
quota request is 61440 GiB_
* Compute Engine API: Persistent Disk SSD (GB): **~14000 GB**
* Compute Engine API: N2 CPUs: **158**
* Compute Engine API: C2 CPUs: **60/node** up to 12,000 - _only needed for
`compute` partition_
* Compute Engine API: Affinity Groups: **one for each job in parallel** - _only
needed for `compute` partition_
* Compute Engine API: Resource policies: **one for each job in parallel** -
_only needed for `compute` partition_

### spack-gromacs.yaml

Spack is an HPC software package manager. This example creates a small Slurm
@@ -152,50 +82,6 @@ omnia-manager node and 2 omnia-compute nodes, on the pre-existing default
network. Omnia will be automatically installed after the nodes are provisioned.
All nodes mount a filestore instance on `/home`.

### image-builder.yaml

This Blueprint helps create custom VM images by applying necessary software and
configurations to existing images, such as the [HPC VM Image][hpcimage].
Using a custom VM image can be more scalable than installing software using
boot-time startup scripts because

* it avoids reliance on continued availability of package repositories
* VMs will join an HPC cluster and execute workloads more rapidly due to reduced
boot-time configuration
* machines are guaranteed to boot with a static set of packages available when
the custom image was created. No potential for some machines to be upgraded
relative to others based upon their creation time!

[hpcimage]: https://cloud.google.com/compute/docs/instances/create-hpc-vm

**Note**: it is important _not to modify_ the subnetwork name in one of the
two resource groups without also modifying it in the other. The two names
_must_ match, as illustrated in the sketch below.
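
The hypothetical fragment below illustrates the constraint; the group names,
the network resource source, and the exact setting names are assumptions made
for illustration rather than the blueprint's actual values:

```yaml
resource_groups:
- group: network
  resources:
  # hypothetical network resource; source path and setting name are assumptions
  - source: resources/network/vpc
    kind: terraform
    id: builder-network
    settings:
      subnetwork_name: image-builder-subnet  # must match the packer group below
- group: packer
  resources:
  - source: resources/packer/custom-image
    kind: packer
    id: custom-image
    settings:
      subnetwork: image-builder-subnet       # must match the network group above
```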

#### Custom Network (resource group)

A tool called [Packer](https://packer.io) builds custom VM images by creating
short-lived VMs, executing scripts on them, and saving the boot disk as an
image that can be used by future VMs. The short-lived VM must operate in a
network that

* has outbound access to the internet for downloading software
* has SSH access from the machine running Packer so that local files/scripts
can be copied to the VM

This resource group creates such a network, while using [Cloud NAT][cloudnat]
and [Identity-Aware Proxy (IAP)][iap] to allow outbound traffic and inbound SSH
connections without exposing the machine to the internet on a public IP address.

[cloudnat]: https://cloud.google.com/nat/docs/overview
[iap]: https://cloud.google.com/iap/docs/using-tcp-forwarding

#### Packer Template (resource group)

The Packer template in this resource group accepts a list of Ansible playbooks
which will be run on the VM to customize it. Although it defaults to creating
VMs with a public IP address, it can be easily set to use [IAP][iap] for SSH
tunneling following the [example in its README](../resources/packer/custom-image/README.md).

## Config Schema

A user defined config should follow the following schema:
59 changes: 59 additions & 0 deletions community/resources/README.md
@@ -0,0 +1,59 @@
# Community Resources

To learn more about using and writing resources, see the [core resources
documentation](../../resources/README.md).

## Compute

* [**SchedMD-slurm-on-gcp-partition**](compute/SchedMD-slurm-on-gcp-partition/README.md):
  Creates a SLURM partition that can be used by the
  SchedMD-slurm-on-gcp-controller (see the sketch below).
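
Community resources are referenced from a blueprint by their path under
`community/resources`. A minimal sketch (most settings omitted):

```yaml
- source: ./community/resources/compute/SchedMD-slurm-on-gcp-partition
  kind: terraform
  id: compute_partition
  settings:
    partition_name: compute
```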

## Database

* [**slurm-cloudsql-federation**](database/slurm-cloudsql-federation/README.md):
  Creates a [Google SQL Instance](https://cloud.google.com/sql/) meant to be
  integrated with a
  [Slurm controller](./scheduler/SchedMD-slurm-on-gcp-controller/README.md).

## File System

* [**nfs-server**](file-system/nfs-server/README.md): Creates a VM instance and
configures an NFS server that can be mounted by other VM instances.

* [**DDN-EXAScaler**](file-system/DDN-EXAScaler/README.md): Creates a
  [DDN EXAScaler Lustre](https://www.ddn.com/partners/google-cloud-platform/)
  file system. This resource has
  [license costs](https://console.developers.google.com/marketplace/product/ddnstorage/exascaler-cloud).

## Project

* [**new-project**](project/new-project/README.md): Creates a Google Cloud Project.

* [**service-account**](project/service-account/README.md): Creates [service
accounts](https://cloud.google.com/iam/docs/service-accounts) for a GCP project.

* [**service-enablement**](project/service-enablement/README.md): Allows
  enabling various APIs for a Google Cloud Project.

## Scripts

* [**omnia-install**](scripts/omnia-install/README.md): Installs SLURM via
  Omnia onto a cluster of compute VMs.

* [**spack-install**](scripts/spack-install/README.md): Creates a startup script
  to install Spack on an instance or the Slurm controller.

* [**wait-for-startup**](scripts/wait-for-startup/README.md): Waits for
  successful completion of a startup script on a compute VM.

## Scheduler

* [**SchedMD-slurm-on-gcp-controller**](scheduler/SchedMD-slurm-on-gcp-controller/README.md):
  Creates a SLURM controller node using
  [slurm-gcp](https://github.com/SchedMD/slurm-gcp/tree/master/tf/modules/controller).

* [**SchedMD-slurm-on-gcp-login-node**](scheduler/SchedMD-slurm-on-gcp-login-node/README.md):
  Creates a SLURM login node using
  [slurm-gcp](https://github.com/SchedMD/slurm-gcp/tree/master/tf/modules/login).
community/resources/compute/SchedMD-slurm-on-gcp-partition/README.md
@@ -13,7 +13,7 @@ Create a partition resource with a max node count of 200, named "compute",
connected to a resource subnetwork and with homefs mounted.

```yaml
- source: ./resources/third-party/compute/SchedMD-slurm-on-gcp-partition
- source: ./community/resources/compute/SchedMD-slurm-on-gcp-partition
kind: terraform
id: compute_partition
settings:
2 changes: 1 addition & 1 deletion community/resources/file-system/nfs-server/README.md
@@ -10,7 +10,7 @@ files with other clients over a network via the
### Example

```yaml
- source: resources/file-system/nfs-server
- source: ./community/resources/file-system/nfs-server
kind: terraform
id: homefs
settings:
2 changes: 1 addition & 1 deletion community/resources/project/new-project/README.md
@@ -9,7 +9,7 @@ This module is meant for use with Terraform 0.13.
### Example

```yaml
- source: ./resources/project/new-project
- source: ./community/resources/project/new-project
kind: terraform
id: project
settings:
2 changes: 1 addition & 1 deletion community/resources/project/service-account/README.md
@@ -5,7 +5,7 @@ Allows creation of service accounts for a Google Cloud Platform project.
### Example

```yaml
- source: ./resources/service-account
- source: ./community/resources/project/service-account
kind: terraform
id: service_acct
settings:
2 changes: 1 addition & 1 deletion community/resources/project/service-enablement/README.md
@@ -5,7 +5,7 @@ Allows management of multiple API services for a Google Cloud Platform project.
### Example

```yaml
- source: ./resources/service-enablement
- source: ./community/resources/project/service-enablement
kind: terraform
id: services-api
settings:
community/resources/scheduler/SchedMD-slurm-on-gcp-controller/README.md
@@ -9,7 +9,7 @@ More information about Slurm On GCP can be found at the [project's GitHub page](
### Example

```yaml
- source: ./resources/third-party/scheduler/SchedMD-slurm-on-gcp-controller
- source: ./community/resources/scheduler/SchedMD-slurm-on-gcp-controller
kind: terraform
id: slurm_controller
settings:
community/resources/scheduler/SchedMD-slurm-on-gcp-login-node/README.md
@@ -11,7 +11,7 @@ resource.
### Example

```yaml
- source: ./resources/third-party/scheduler/SchedMD-slurm-on-gcp-login-node
- source: ./community/resources/scheduler/SchedMD-slurm-on-gcp-login-node
kind: terraform
id: slurm_login
settings:
12 changes: 6 additions & 6 deletions community/resources/scripts/spack-install/README.md
@@ -35,26 +35,26 @@ https://www.googleapis.com/auth/devstorage.read_write
As an example, below is a possible definition of a Spack installation.

```yaml
- source: ./resources/scripts/spack-install
- source: ./community/resources/scripts/spack-install
kind: terraform
id: spack
settings:
install_dir: /apps/spack
install_dir: /sw/spack
spack_url: https://github.com/spack/spack
spack_ref: v0.17.0
spack_cache_url:
- mirror_name: 'gcs_cache'
mirror_url: gs://example-buildcache/linux-centos7
configs:
- type: 'single-config'
value: 'config:install_tree:/apps/spack/opt'
value: 'config:install_tree:/sw/spack/opt'
scope: 'site'
- type: 'file'
scope: 'site'
value: |
config:
build_stage:
- /apps/spack/stage
- /sw/spack/stage
- type: 'file'
scope: 'site'
value: |
@@ -91,7 +91,7 @@ Following the above description of this resource, it can be added to a Slurm
deployment via the following:
```yaml
- source: resources/third-party/scheduler/SchedMD-slurm-on-gcp-controller
- source: ./community/resources/scheduler/SchedMD-slurm-on-gcp-controller
kind: terraform
id: slurm_controller
use: [spack]
@@ -116,7 +116,7 @@ Alternatively, it can be added as a startup script via:
destination: install_spack_deps.yml
- type: shell
content: $(spack.startup_script)
destination: "/apps/spack-install.sh"
destination: "/sw/spack-install.sh"
```
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
2 changes: 1 addition & 1 deletion community/resources/scripts/wait-for-startup/README.md
@@ -15,7 +15,7 @@ up a node.
### Example

```yaml
- source: ./resources/scripts/wait-for-startup
- source: ./community/resources/scripts/wait-for-startup
kind: terraform
id: wait
settings:
2 changes: 1 addition & 1 deletion examples/README.md
@@ -76,7 +76,7 @@ File systems:
* The projectsfs is mounted at `/projects` and is a high scale SSD filestore
instance with 10TiB of capacity.
* The scratchfs is mounted at `/scratch` and is a
[DDN Exascaler Lustre](../resources/third-party/file-system/DDN-EXAScaler/README.md)
[DDN Exascaler Lustre](../community/resources/file-system/DDN-EXAScaler/README.md)
file system designed for high IO performance. The capacity is ~10TiB.

There are two partitions in this example: `low_cost` and `compute`. The
10 changes: 5 additions & 5 deletions examples/hpc-cluster-high-io.yaml
@@ -48,14 +48,14 @@ resource_groups:
size_gb: 10240
local_mount: /projects

- source: resources/third-party/file-system/DDN-EXAScaler
- source: ./community/resources/file-system/DDN-EXAScaler
kind: terraform
id: scratchfs
use: [network1]
settings:
local_mount: /scratch

- source: resources/third-party/compute/SchedMD-slurm-on-gcp-partition
- source: ./community/resources/compute/SchedMD-slurm-on-gcp-partition
kind: terraform
id: low_cost_partition
use:
@@ -71,7 +71,7 @@ resource_groups:
machine_type: n2-standard-4

# This compute_partition is far more performant than low_cost_partition.
- source: resources/third-party/compute/SchedMD-slurm-on-gcp-partition
- source: ./community/resources/compute/SchedMD-slurm-on-gcp-partition
kind: terraform
id: compute_partition
use:
@@ -83,7 +83,7 @@ resource_groups:
max_node_count: 200
partition_name: compute

- source: resources/third-party/scheduler/SchedMD-slurm-on-gcp-controller
- source: ./community/resources/scheduler/SchedMD-slurm-on-gcp-controller
kind: terraform
id: slurm_controller
use:
@@ -94,7 +94,7 @@ resource_groups:
- low_cost_partition # low cost partition will be default as it is listed first
- compute_partition

- source: resources/third-party/scheduler/SchedMD-slurm-on-gcp-login-node
- source: ./community/resources/scheduler/SchedMD-slurm-on-gcp-login-node
kind: terraform
id: slurm_login
use: