GCS bucket module #836

Merged
merged 10 commits into from
Jan 19, 2023
154 changes: 154 additions & 0 deletions community/modules/file-system/cloud-storage-bucket/README.md
@@ -0,0 +1,154 @@
## Description

This module creates a [Google Cloud Storage (GCS) bucket](https://cloud.google.com/storage).

For more information on this and other network storage options in the Cloud HPC
Toolkit, see the extended [Network Storage documentation](../../../../docs/network_storage.md).

### Example

The following example creates a bucket named `simulation-results-xxxxxxxx`,
where `xxxxxxxx` is a randomly generated ID.

```yaml
  - id: bucket
    source: community/modules/file-system/cloud-storage-bucket
    settings:
      name_prefix: simulation-results
      random_suffix: true
```

> **_NOTE:_** Using `random_suffix` may cause the following error when the
> bucket is used with other modules:
> `value depends on resource attributes that cannot be determined until apply`.
> To resolve this, set `random_suffix` to `false` (the default).

<!-- -->

> **_NOTE:_** The bucket namespace is shared by all users of Google Cloud, so a
> bucket name can clash with an existing bucket that is not in your project. To
> resolve this, choose a more distinctive name or set the `random_suffix`
> variable to `true`.

## Naming of Bucket

The bucket name has up to three parts. Each of these parts is configurable in
the blueprint.

1. A **custom prefix**, provided by the user in the blueprint \
Provide the custom prefix using the `name_prefix` setting.

1. The **deployment name**, included by default \
The deployment name can be excluded by setting `use_deployment_name_in_bucket_name: false`.

1. A **random id** suffix, excluded by default \
The random id can be included by setting `random_suffix: true`.

If none of these parts is provided (no `name_prefix`,
`use_deployment_name_in_bucket_name: false`, and `random_suffix: false`), then
the bucket name defaults to `no-bucket-name-provided`.
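
The composition rules above can be sketched in shell. This is only an
illustration that mirrors the `locals` logic in this module's `main.tf`; the
function name `compose_bucket_name` is hypothetical and is not part of the
module. An empty string stands for a part that is not included.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the module.
# Joins the non-empty parts (prefix, deployment name, random suffix) with "-"
# and falls back to the module's default name when all parts are empty.
# Usage: compose_bucket_name PREFIX DEPLOYMENT SUFFIX
compose_bucket_name() {
  name=""
  for part in "$1" "$2" "$3"; do
    # Skip parts that were not provided.
    [ -z "$part" ] && continue
    if [ -z "$name" ]; then
      name="$part"
    else
      name="${name}-${part}"
    fi
  done
  [ -z "$name" ] && name="no-bucket-name-provided"
  printf '%s\n' "$name"
}

compose_bucket_name simulation-results my-deployment ""
compose_bucket_name "" "" ""
```

The first call prints `simulation-results-my-deployment` and the second prints
`no-bucket-name-provided`.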

Since the bucket namespace is shared by all users of Google Cloud, naming
clashes are more likely than with other resources. In many cases, setting
`random_suffix` resolves the clash.

> **Warning**: If a bucket is created with a `random_suffix` and then used as
> the bucket for a startup script in the same deployment group, this causes a
> `not known at apply time` error in Terraform. The solution is either to create
> the bucket in a separate deployment group or to remove the random suffix.

## Mounting

To mount the Cloud Storage bucket, you must first ensure that the GCS Fuse
client is installed and then run the appropriate `mount` command.

Both steps are handled automatically when the bucket is referenced through the
`use` field of a supported HPC Toolkit module. See the
[compatibility matrix][matrix] in the network storage doc for a complete list of
supported modules.

If mounting is not automatically handled as described above, the
`cloud-storage-bucket` module outputs runners that can be used with the
`startup-script` module to install the client and mount the file system. See the
following example:

```yaml
  - id: bucket
    source: community/modules/file-system/cloud-storage-bucket
    settings: {local_mount: /data}

  - id: mount-at-startup
    source: modules/scripts/startup-script
    settings:
      runners:
      - $(bucket.client_install_runner)
      - $(bucket.mount_runner)
```

[matrix]: ../../../../docs/network_storage.md#compatibility-matrix
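
For orientation, the mount runner appends a line to `/etc/fstab` and then mounts
from it. The sketch below only formats that line the way this module's
`mount.sh` writes it for a `gcsfuse` mount. The function name
`gcsfuse_fstab_entry` and the bucket name `my-bucket` are hypothetical and used
purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the module.
# Prints the fstab line that mount.sh would add for a gcsfuse mount.
# Usage: gcsfuse_fstab_entry BUCKET LOCAL_MOUNT [MOUNT_OPTIONS]
gcsfuse_fstab_entry() {
  bucket="$1"
  local_mount="$2"
  # mount.sh falls back to "defaults" when no mount options are passed.
  options="${3:-defaults}"
  printf '%s %s gcsfuse %s 0 0\n' "$bucket" "$local_mount" "$options"
}

gcsfuse_fstab_entry my-bucket /data "defaults,_netdev,implicit_dirs"
```

With the module's default `mount_options`, this prints
`my-bucket /data gcsfuse defaults,_netdev,implicit_dirs 0 0`.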

## License

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
Copyright 2023 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.14.0 |
| <a name="requirement_google"></a> [google](#requirement\_google) | >= 3.83 |
| <a name="requirement_random"></a> [random](#requirement\_random) | ~> 3.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_google"></a> [google](#provider\_google) | >= 3.83 |
| <a name="provider_random"></a> [random](#provider\_random) | ~> 3.0 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [google_storage_bucket.bucket](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket) | resource |
| [random_id.resource_name_suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/id) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_deployment_name"></a> [deployment\_name](#input\_deployment\_name) | Name of the HPC deployment; used as part of name of the GCS bucket. | `string` | n/a | yes |
| <a name="input_labels"></a> [labels](#input\_labels) | Labels to add to the GCS bucket. List key, value pairs. | `any` | n/a | yes |
| <a name="input_local_mount"></a> [local\_mount](#input\_local\_mount) | The mount point where the contents of the device may be accessed after mounting. | `string` | `"/mnt"` | no |
| <a name="input_mount_options"></a> [mount\_options](#input\_mount\_options) | Mount options to be put in fstab. Note: `implicit_dirs` makes it easier to work with objects added by other tools, but there is a performance impact. See: [more information](https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/docs/semantics.md#implicit-directories) | `string` | `"defaults,_netdev,implicit_dirs"` | no |
| <a name="input_name_prefix"></a> [name\_prefix](#input\_name\_prefix) | Name Prefix. | `string` | `null` | no |
| <a name="input_project_id"></a> [project\_id](#input\_project\_id) | ID of project in which GCS bucket will be created. | `string` | n/a | yes |
| <a name="input_random_suffix"></a> [random\_suffix](#input\_random\_suffix) | If true, a random id will be appended to the suffix of the bucket name. | `bool` | `false` | no |
| <a name="input_region"></a> [region](#input\_region) | The region to deploy to | `string` | n/a | yes |
| <a name="input_use_deployment_name_in_bucket_name"></a> [use\_deployment\_name\_in\_bucket\_name](#input\_use\_deployment\_name\_in\_bucket\_name) | If true, the deployment name will be included as part of the bucket name. This helps prevent naming clashes across multiple deployments. | `bool` | `true` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_client_install_runner"></a> [client\_install\_runner](#output\_client\_install\_runner) | Runner that performs client installation needed to use gcs fuse. |
| <a name="output_gcs_bucket_path"></a> [gcs\_bucket\_path](#output\_gcs\_bucket\_path) | value |
| <a name="output_mount_runner"></a> [mount\_runner](#output\_mount\_runner) | Runner that mounts the cloud storage bucket with gcs fuse. |
| <a name="output_network_storage"></a> [network\_storage](#output\_network\_storage) | Describes a remote network storage to be mounted by fs-tab. |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
38 changes: 38 additions & 0 deletions community/modules/file-system/cloud-storage-bucket/main.tf
@@ -0,0 +1,38 @@
/**
* Copyright 2023 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

locals {
  prefix         = var.name_prefix != null ? var.name_prefix : ""
  deployment     = var.use_deployment_name_in_bucket_name ? var.deployment_name : ""
  suffix         = var.random_suffix ? random_id.resource_name_suffix.hex : ""
  first_dash     = (local.prefix != "" && (local.deployment != "" || local.suffix != "")) ? "-" : ""
  second_dash    = local.deployment != "" && local.suffix != "" ? "-" : ""
  composite_name = "${local.prefix}${local.first_dash}${local.deployment}${local.second_dash}${local.suffix}"
  name           = local.composite_name == "" ? "no-bucket-name-provided" : local.composite_name
}

resource "random_id" "resource_name_suffix" {
  byte_length = 4
}

resource "google_storage_bucket" "bucket" {
  project                     = var.project_id
  name                        = local.name
  uniform_bucket_level_access = true
  location                    = var.region
  storage_class               = "REGIONAL"
  labels                      = var.labels
}
64 changes: 64 additions & 0 deletions community/modules/file-system/cloud-storage-bucket/outputs.tf
@@ -0,0 +1,64 @@
/**
* Copyright 2023 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

output "network_storage" {
  description = "Describes a remote network storage to be mounted by fs-tab."
  value = {
    remote_mount          = local.name
    local_mount           = var.local_mount
    fs_type               = "gcsfuse"
    mount_options         = var.mount_options
    server_ip             = null
    client_install_runner = local.client_install_runner
    mount_runner          = local.mount_runner
  }
}

locals {
  client_install_runner = {
    "type"        = "shell"
    "content"     = file("${path.module}/scripts/install-gcs-fuse.sh")
    "destination" = "install-gcsfuse${replace(var.local_mount, "/", "_")}.sh"
  }

  mount_runner = {
    "type"        = "shell"
    "destination" = "mount_gcs${replace(var.local_mount, "/", "_")}.sh"
    "args"        = "\"not-used\" \"${local.name}\" \"${var.local_mount}\" \"gcsfuse\" \"${var.mount_options}\""
    "content"     = file("${path.module}/scripts/mount.sh")
  }
}

output "client_install_runner" {
  description = "Runner that performs client installation needed to use gcs fuse."
  value       = local.client_install_runner
}

output "mount_runner" {
  description = "Runner that mounts the cloud storage bucket with gcs fuse."
  value       = local.mount_runner
}

output "gcs_bucket_path" {
  description = "value"
  # cannot use resource attribute, will cause lookup failure in startup-script
  value = "gs://${local.name}"

  # needed to make sure bucket contents are deleted before bucket
  depends_on = [
    google_storage_bucket.bucket
  ]
}
@@ -0,0 +1,44 @@
#!/bin/sh
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -e

if [ ! "$(which gcsfuse)" ]; then
  if [ -f /etc/centos-release ] || [ -f /etc/redhat-release ]; then
    # Heredoc body and terminator stay flush left; the second gpgkey URL is
    # indented because it continues the previous line's value.
    tee /etc/yum.repos.d/gcsfuse.repo >/dev/null <<EOF
[gcsfuse]
name=gcsfuse (packages.cloud.google.com)
baseurl=https://packages.cloud.google.com/yum/repos/gcsfuse-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=0
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
    yum -y install gcsfuse

  elif [ -f /etc/debian_version ] || grep -qi ubuntu /etc/lsb-release || grep -qi ubuntu /etc/os-release; then
    RELEASE=$(lsb_release -c -s)
    export GCSFUSE_REPO="gcsfuse-${RELEASE}"
    echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

    sudo apt-get update
    sudo apt-get -y install gcsfuse
  else
    echo 'Unsupported distribution'
    return 1
  fi
fi
@@ -0,0 +1,58 @@
#!/bin/bash
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
SERVER_IP=$1
REMOTE_MOUNT=$2
LOCAL_MOUNT=$3
FS_TYPE=$4
MOUNT_OPTIONS=$5

[[ -z "${MOUNT_OPTIONS}" ]] && POPULATED_MOUNT_OPTIONS="defaults" || POPULATED_MOUNT_OPTIONS="${MOUNT_OPTIONS}"

if [ "${FS_TYPE}" = "gcsfuse" ]; then
  FS_SPEC="${REMOTE_MOUNT}"
else
  FS_SPEC="${SERVER_IP}:${REMOTE_MOUNT}"
fi

SAME_LOCAL_IDENTIFIER="^[^#].*[[:space:]]${LOCAL_MOUNT}"
EXACT_MATCH_IDENTIFIER="${FS_SPEC}[[:space:]]${LOCAL_MOUNT}[[:space:]]${FS_TYPE}[[:space:]]${POPULATED_MOUNT_OPTIONS}[[:space:]]0[[:space:]]0"

grep -q "${SAME_LOCAL_IDENTIFIER}" /etc/fstab && SAME_LOCAL_IN_FSTAB=true || SAME_LOCAL_IN_FSTAB=false
grep -q "${EXACT_MATCH_IDENTIFIER}" /etc/fstab && EXACT_IN_FSTAB=true || EXACT_IN_FSTAB=false
# Use FS_SPEC so the check also matches gcsfuse mounts, whose source has no server IP
findmnt --source "${FS_SPEC}" --target "${LOCAL_MOUNT}" &>/dev/null && EXACT_MOUNTED=true || EXACT_MOUNTED=false

# Do nothing and exit successfully if the exact entry is already in fstab and mounted
if [ "$EXACT_IN_FSTAB" = true ] && [ "${EXACT_MOUNTED}" = true ]; then
  echo "Skipping mounting source: ${FS_SPEC}, already mounted to target: ${LOCAL_MOUNT}"
  exit 0
fi

# Fail if a previous fstab entry is using the same local mount
if [ "$SAME_LOCAL_IN_FSTAB" = true ] && [ "${EXACT_IN_FSTAB}" = false ]; then
  echo "Mounting failed as local mount: ${LOCAL_MOUNT} was already in use in fstab"
  exit 1
fi

# Add to fstab if the entry is not already there
if [ "${EXACT_IN_FSTAB}" = false ]; then
  echo "Adding ${FS_SPEC} -> ${LOCAL_MOUNT} to /etc/fstab"
  echo "${FS_SPEC} ${LOCAL_MOUNT} ${FS_TYPE} ${POPULATED_MOUNT_OPTIONS} 0 0" >>/etc/fstab
fi

# Mount from fstab
echo "Mounting --target ${LOCAL_MOUNT} from fstab"
mkdir -p "${LOCAL_MOUNT}"
mount --target "${LOCAL_MOUNT}"