
Commit

fix: update location and region handling to allow deploy outside of the US (#226)

* delete unused region input

* make cloud build respect the region defined by the user

* make the creation of the app engine application respect the region defined by the user

* fix usage of region and location in the examples

* inline zone local

* add note to explain region, location, and trusted_location

* replace region with pubsub_resource_location and remove unused outputs

* rename  input in READMEs
daniel-cit authored Dec 15, 2021
1 parent 2800ec1 commit 16eaff7
Showing 22 changed files with 95 additions and 65 deletions.
16 changes: 12 additions & 4 deletions README.md
@@ -33,7 +33,9 @@ module "secured_data_warehouse" {
terraform_service_account = TERRAFORM_SERVICE_ACCOUNT
access_context_manager_policy_id = ACCESS_CONTEXT_MANAGER_POLICY_ID
bucket_name = DATA_INGESTION_BUCKET_NAME
region = REGION
pubsub_resource_location = PUBSUB_RESOURCE_LOCATION
location = LOCATION
trusted_locations = TRUSTED_LOCATIONS
dataset_id = DATASET_ID
confidential_dataset_id = CONFIDENTIAL_DATASET_ID
cmek_keyring_name = CMEK_KEYRING_NAME
@@ -47,6 +49,12 @@ module "secured_data_warehouse" {
}
```

**Note:** There are three inputs related to GCP locations in the module:

- `pubsub_resource_location`: defines which GCP location will be used to [Restrict Pub/Sub resource locations](https://cloud.google.com/pubsub/docs/resource-location-restriction). This policy ensures that messages published to a topic are never persisted outside of the Google Cloud regions you specify, regardless of where the publish requests originate. **Zones and multi-region locations are not supported**.
- `location`: defines which GCP location will be used for all other resources created: [Cloud Storage buckets](https://cloud.google.com/storage/docs/locations), [BigQuery datasets](https://cloud.google.com/bigquery/docs/locations), and [Cloud KMS key rings](https://cloud.google.com/kms/docs/locations). **Multi-region locations are supported**.
- `trusted_locations`: a list of locations used to set an [Organization Policy](https://cloud.google.com/resource-manager/docs/organization-policy/defining-locations#location_types) that restricts the GCP locations that can be used in the projects of the Secured Data Warehouse. Both `pubsub_resource_location` and `location` must respect this restriction.

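For example, a hypothetical configuration for a deployment outside of the US could set the three inputs together (the values below are illustrative, not module defaults):

```hcl
module "secured_data_warehouse" {
  source = "../.."  # adjust to however you reference this module

  # Must be a single region accepted by the Pub/Sub resource location policy.
  pubsub_resource_location = "europe-west1"

  # May be a region or a multi-region.
  location = "europe-west1"

  # Organization Policy value group that must cover both values above.
  trusted_locations = ["eu-locations"]

  # ... all other required inputs as shown in the example above.
}
```
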
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Inputs

@@ -79,17 +87,17 @@ module "secured_data_warehouse" {
| delete\_contents\_on\_destroy | (Optional) If set to true, delete all the tables in the dataset when destroying the resource; otherwise, destroying the resource will fail if tables are present. | `bool` | `false` | no |
| key\_rotation\_period\_seconds | Rotation period for keys. The default value is 30 days. | `string` | `"2592000s"` | no |
| kms\_key\_protection\_level | The protection level to use when creating a key. Possible values: ["SOFTWARE", "HSM"] | `string` | `"HSM"` | no |
| location | The location for the KMS Customer Managed Encryption Keys, Bucket, and Bigquery dataset. This location can be a multiregion, if it is empty the region value will be used. | `string` | `""` | no |
| location | The location for the KMS Customer Managed Encryption Keys, Cloud Storage Buckets, and Bigquery datasets. This location can be a multi-region. | `string` | `"us-east4"` | no |
| network\_administrator\_group | Google Cloud IAM group that reviews network configuration. Typically, this includes members of the networking team. | `string` | n/a | yes |
| non\_confidential\_data\_project\_id | The ID of the project in which the Bigquery will be created. | `string` | n/a | yes |
| org\_id | GCP Organization ID. | `string` | n/a | yes |
| perimeter\_additional\_members | The list additional members to be added on perimeter access. Prefix user: (user:email@email.com) or serviceAccount: (serviceAccount:my-service-account@email.com) is required. | `list(string)` | `[]` | no |
| region | The region in which the resources will be deployed. | `string` | `"us-east4"` | no |
| pubsub\_resource\_location | The location in which the messages published to Pub/Sub will be persisted. This location cannot be a multi-region. | `string` | `"us-east4"` | no |
| sdx\_project\_number | The Project Number to configure Secure data exchange with egress rule for the dataflow templates. | `string` | n/a | yes |
| security\_administrator\_group | Google Cloud IAM group that administers security configurations in the organization(org policies, KMS, VPC service perimeter). | `string` | n/a | yes |
| security\_analyst\_group | Google Cloud IAM group that monitors and responds to security incidents. | `string` | n/a | yes |
| terraform\_service\_account | The email address of the service account that will run the Terraform code. | `string` | n/a | yes |
| trusted\_locations | This is a list of trusted regions where location-based GCP resources can be created. ie us-locations eu-locations. | `list(string)` | <pre>[<br> "us-locations",<br> "eu-locations"<br>]</pre> | no |
| trusted\_locations | This is a list of trusted regions where location-based GCP resources can be created. | `list(string)` | <pre>[<br> "us-locations"<br>]</pre> | no |
| trusted\_subnetworks | The URI of the subnetworks where resources are going to be deployed. | `list(string)` | `[]` | no |

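For illustration, `perimeter_additional_members` entries carry the required `user:` or `serviceAccount:` prefixes, and `trusted_locations` takes Organization Policy value groups (the identities below are hypothetical):

```hcl
perimeter_additional_members = [
  "user:data-analyst@example.com",
  "serviceAccount:deployer@my-project.iam.gserviceaccount.com",
]
trusted_locations = ["us-locations", "eu-locations"]
```
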
## Outputs
3 changes: 1 addition & 2 deletions examples/batch-data-ingestion/README.md
@@ -36,7 +36,7 @@ configure with the firewall rules and DNS configurations described below.
This example uses a [csv file with sample data](./assets/cc_10000_records.csv) as input for the dataflow job.
You can create new files with different sizes using the [sample-cc-generator](../../helpers/sample-cc-generator/README.md) helper.
This new file must be placed in the [assets folder](./assets).
You need to change value of the local `cc_file_name` in the [main.tf](./main.tf#L23) file to use the new sample file:
You need to change value of the local `cc_file_name` in the [main.tf](./main.tf#L22) file to use the new sample file:

```hcl
locals {
  # ...
}
```

@@ -84,7 +84,6 @@ since the Batch Dataflow job ends when the pipeline finishes.
| controller\_service\_account | The Service Account email that will be used to identify the VMs in which the jobs are running. |
| dataflow\_temp\_bucket\_name | The name of the dataflow temporary bucket. |
| df\_job\_network | The URI of the VPC being created. |
| df\_job\_region | The region of the newly created Dataflow job. |
| df\_job\_subnetwork | The name of the subnetwork used for create Dataflow job. |
| project\_id | The data ingestion project's ID. |
| scheduler\_id | Cloud Scheduler Job id created. |
2 changes: 1 addition & 1 deletion examples/batch-data-ingestion/httpRequest.tmpl
@@ -2,7 +2,7 @@
"jobName": "batch-dataflow-flow",
"environment": {
"maxWorkers": 5,
"zone": "${location}",
"zone": "${zone}",
"ipConfiguration": "WORKER_IP_PRIVATE",
"enableStreamingEngine": true,
"network": "${network_self_link}",
13 changes: 6 additions & 7 deletions examples/batch-data-ingestion/main.tf
@@ -14,8 +14,7 @@
* limitations under the License.
*/
locals {
region = "us-east4"
location = "us-east4-a"
location = "us-east4"
schema_file = "schema.json"
transform_code_file = "transform.js"
dataset_id = "dts_data_ingestion"
@@ -26,7 +25,7 @@ locals {
httpRequestTemplate = templatefile(
"${path.module}/httpRequest.tmpl",
{
location = local.location,
zone = "us-east4-a",
network_self_link = var.network_self_link,
dataflow_service_account = module.data_ingestion.dataflow_controller_service_account_email,
subnetwork_self_link = var.subnetwork_self_link,
@@ -53,8 +52,8 @@ module "data_ingestion" {
terraform_service_account = var.terraform_service_account
access_context_manager_policy_id = var.access_context_manager_policy_id
bucket_name = "data-ingestion"
location = local.region
region = local.region
pubsub_resource_location = local.location
location = local.location
dataset_id = local.dataset_id
cmek_keyring_name = "cmek_keyring"
delete_contents_on_destroy = var.delete_contents_on_destroy
@@ -130,7 +129,7 @@ resource "google_cloud_scheduler_job" "scheduler" {
# Cloud Scheduler needs App Engine enabled in the project, in the same region where the job is going to be deployed.
# If you are using App Engine in us-central, you will need to use us-central1 as the region for Cloud Scheduler.
# You will get a "resource not found" error if you just use us-central.
region = local.region
region = local.location
project = var.data_ingestion_project_id

http_target {
@@ -139,7 +138,7 @@
"Accept" = "application/json"
"Content-Type" = "application/json"
}
uri = "https://dataflow.googleapis.com/v1b3/projects/${var.data_ingestion_project_id}/locations/${local.region}/templates"
uri = "https://dataflow.googleapis.com/v1b3/projects/${var.data_ingestion_project_id}/locations/${local.location}/templates"
oauth_token {
service_account_email = module.data_ingestion.scheduler_service_account_email
}
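
The comments in the scheduler resource above refer to the App Engine application that Cloud Scheduler depends on. A minimal sketch of creating it in the user-defined region (an assumed resource layout, not part of this commit):

```hcl
# App Engine uses the short location IDs "us-central" and "europe-west";
# Cloud Scheduler expects the corresponding regions "us-central1" and "europe-west1".
resource "google_app_engine_application" "app" {
  project     = var.data_ingestion_project_id
  location_id = local.location == "us-central1" ? "us-central" : local.location
}
```
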
5 changes: 0 additions & 5 deletions examples/batch-data-ingestion/outputs.tf
@@ -34,11 +34,6 @@ output "dataflow_temp_bucket_name" {
value = module.data_ingestion.data_ingestion_dataflow_bucket_name
}

output "df_job_region" {
description = "The region of the newly created Dataflow job."
value = local.region
}

output "df_job_network" {
description = "The URI of the VPC being created."
value = var.network_self_link
1 change: 1 addition & 0 deletions examples/bigquery-confidential-data/main.tf
@@ -45,6 +45,7 @@ module "secured_data_warehouse" {
terraform_service_account = var.terraform_service_account
access_context_manager_policy_id = var.access_context_manager_policy_id
bucket_name = "data-ingestion"
pubsub_resource_location = local.location
location = local.location
dataset_id = local.non_confidential_dataset_id
confidential_dataset_id = local.confidential_dataset_id
3 changes: 1 addition & 2 deletions examples/dataflow-with-dlp/README.md
@@ -11,7 +11,7 @@ It uses:
## Prerequisites

1. The [Secured data warehouse](../../README.md#requirements) module requirements to create the Secured data warehouse infrastructure.
1. A `crypto_key` and `wrapped_key` pair. Contact your Security Team to obtain the pair. The `crypto_key` location must be the same location where DLP, Storage and BigQuery are going to be created (`local.region`). There is a [Wrapped Key Helper](../../helpers/wrapped-key/README.md) python script which generates a wrapped key.
1. A `crypto_key` and `wrapped_key` pair. Contact your Security Team to obtain the pair. The `crypto_key` location must be the same location where DLP, Storage and BigQuery are going to be created (`local.location`). There is a [Wrapped Key Helper](../../helpers/wrapped-key/README.md) python script which generates a wrapped key.
1. The identity deploying the example must have permission to grant roles `roles/cloudkms.cryptoKeyDecrypter` and `roles/cloudkms.cryptoKeyEncrypter` on the KMS `crypto_key`. These roles will be granted to the Data ingestion Dataflow worker service account created by the Secured Data Warehouse module (see the sketch after this list).
1. The identity deploying the example must have permission to grant role `roles/artifactregistry.reader` in the docker repo of the Flex templates.
1. A network and subnetwork in the data ingestion project [configured for Private Google Access](https://cloud.google.com/vpc/docs/configure-private-google-access).
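
A minimal sketch of the KMS grants described above, assuming the example's own names (`var.crypto_key` and the module's Dataflow controller service account output):

```hcl
# Hypothetical illustration of the grants the deploying identity must be able to make.
resource "google_kms_crypto_key_iam_member" "wrapped_key_decrypter" {
  crypto_key_id = var.crypto_key
  role          = "roles/cloudkms.cryptoKeyDecrypter"
  member        = "serviceAccount:${module.data_ingestion.dataflow_controller_service_account_email}"
}

resource "google_kms_crypto_key_iam_member" "wrapped_key_encrypter" {
  crypto_key_id = var.crypto_key
  role          = "roles/cloudkms.cryptoKeyEncrypter"
  member        = "serviceAccount:${module.data_ingestion.dataflow_controller_service_account_email}"
}
```
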
@@ -78,7 +78,6 @@ locals {
| bucket\_data\_ingestion\_name | The name of the bucket. |
| controller\_service\_account | The Service Account email that will be used to identify the VMs in which the jobs are running. |
| df\_job\_subnetwork | The name of the subnetwork used for create Dataflow job. |
| dlp\_location | The location of the DLP resources. |
| project\_id | The project's ID. |
| template\_id | The ID of the Cloud DLP de-identification template that is created. |

12 changes: 7 additions & 5 deletions examples/dataflow-with-dlp/main.tf
@@ -15,7 +15,7 @@
*/

locals {
region = "us-east4"
location = "us-east4"
dataset_id = "dts_data_ingestion"
cc_file_name = "cc_10000_records.csv"
cc_file_path = "${path.module}/assets"
@@ -34,6 +34,8 @@ module "data_ingestion" {
bucket_name = "data-ingestion"
dataset_id = local.dataset_id
cmek_keyring_name = "cmek_keyring"
pubsub_resource_location = local.location
location = local.location
delete_contents_on_destroy = var.delete_contents_on_destroy
perimeter_additional_members = var.perimeter_additional_members
data_engineer_group = var.data_engineer_group
@@ -65,7 +67,7 @@ module "de_identification_template" {
terraform_service_account = var.terraform_service_account
crypto_key = var.crypto_key
wrapped_key = var.wrapped_key
dlp_location = local.region
dlp_location = local.location
template_file = "${path.module}/deidentification.tmpl"
dataflow_service_account = module.data_ingestion.dataflow_controller_service_account_email

@@ -78,7 +80,7 @@ resource "google_artifact_registry_repository_iam_member" "docker_reader" {
provider = google-beta

project = var.external_flex_template_project_id
location = local.region
location = local.location
repository = "flex-templates"
role = "roles/artifactregistry.reader"
member = "serviceAccount:${module.data_ingestion.dataflow_controller_service_account_email}"
@@ -94,7 +96,7 @@ module "regional_dlp" {
project_id = var.data_ingestion_project_id
name = "regional-flex-java-gcs-dlp-bq"
container_spec_gcs_path = var.de_identify_template_gs_path
region = local.region
region = local.location
service_account_email = module.data_ingestion.dataflow_controller_service_account_email
subnetwork_self_link = var.subnetwork_self_link
kms_key_name = module.data_ingestion.cmek_data_ingestion_crypto_key
@@ -108,7 +110,7 @@
datasetName = local.dataset_id
batchSize = 1000
dlpProjectId = var.data_governance_project_id
dlpLocation = local.region
dlpLocation = local.location
deidentifyTemplateName = module.de_identification_template.template_full_path

}
5 changes: 0 additions & 5 deletions examples/dataflow-with-dlp/outputs.tf
@@ -34,11 +34,6 @@ output "bucket_data_ingestion_name" {
value = module.data_ingestion.data_ingestion_bucket_name
}

output "dlp_location" {
description = "The location of the DLP resources."
value = local.region
}

output "template_id" {
description = "The ID of the Cloud DLP de-identification template that is created."
value = module.de_identification_template.template_id
14 changes: 8 additions & 6 deletions examples/regional-dlp/main.tf
@@ -16,6 +16,7 @@

locals {
bq_schema = "book:STRING, author:STRING"
location = "us-east4"
}

module "data_ingestion" {
@@ -31,7 +32,8 @@ module "data_ingestion" {
bucket_name = "dlp-flex-data-ingestion"
dataset_id = "dlp_flex_data_ingestion"
cmek_keyring_name = "dlp_flex_data-ingestion"
region = "us-east4"
pubsub_resource_location = local.location
location = local.location
delete_contents_on_destroy = var.delete_contents_on_destroy
perimeter_additional_members = var.perimeter_additional_members
data_engineer_group = var.data_engineer_group
@@ -49,7 +51,7 @@ module "de_identification_template_example" {
dataflow_service_account = module.data_ingestion.dataflow_controller_service_account_email
crypto_key = var.crypto_key
wrapped_key = var.wrapped_key
dlp_location = "us-east4"
dlp_location = local.location
template_file = "${path.module}/templates/deidentification.tpl"

depends_on = [
@@ -61,7 +63,7 @@ resource "google_artifact_registry_repository_iam_member" "docker_reader" {
provider = google-beta

project = var.external_flex_template_project_id
location = "us-east4"
location = local.location
repository = "flex-templates"
role = "roles/artifactregistry.reader"
member = "serviceAccount:${module.data_ingestion.dataflow_controller_service_account_email}"
@@ -75,7 +77,7 @@ resource "google_artifact_registry_repository_iam_member" "python_reader" {
provider = google-beta

project = var.external_flex_template_project_id
location = "us-east4"
location = local.location
repository = "python-modules"
role = "roles/artifactregistry.reader"
member = "serviceAccount:${module.data_ingestion.dataflow_controller_service_account_email}"
@@ -92,7 +94,7 @@ module "regional_dlp" {
name = "regional-flex-python-pubsub-dlp-bq"
container_spec_gcs_path = var.flex_template_gs_path
job_language = "PYTHON"
region = "us-east4"
region = local.location
service_account_email = module.data_ingestion.dataflow_controller_service_account_email
subnetwork_self_link = var.subnetwork_self_link
kms_key_name = module.data_ingestion.cmek_data_ingestion_crypto_key
@@ -103,7 +105,7 @@
parameters = {
input_topic = "projects/${var.data_ingestion_project_id}/topics/${module.data_ingestion.data_ingestion_topic_name}"
deidentification_template_name = "${module.de_identification_template_example.template_full_path}"
dlp_location = "us-east4"
dlp_location = local.location
dlp_project = var.data_governance_project_id
bq_schema = local.bq_schema
output_table = "${var.non_confidential_data_project_id}:${module.data_ingestion.data_ingestion_bigquery_dataset.dataset_id}.classical_books"
2 changes: 2 additions & 0 deletions examples/simple-example/main.tf
@@ -27,6 +27,8 @@ module "secured_data_warehouse" {
bucket_name = "bucket_simple_example"
dataset_id = "dataset_simple_example"
cmek_keyring_name = "key_name_simple_example"
pubsub_resource_location = "us-east4"
location = "us-east4"
delete_contents_on_destroy = var.delete_contents_on_destroy
perimeter_additional_members = var.perimeter_additional_members
data_engineer_group = var.data_engineer_group
9 changes: 8 additions & 1 deletion examples/tutorial-standalone/README.md
@@ -32,7 +32,14 @@ The required infrastructure includes:
- A Cloud KMS key
- A traffic encryption key for DLP Templates

This example will be deployed at the `us-east4` location, to deploy in another location change the local `location` in example [main.tf](./main.tf#L18) file.
## Google Cloud Locations

This example will be deployed in the `us-east4` location. To deploy in another location,
change the local `location` in the example [main.tf](./main.tf#L18) file.
By default, the Secured Data Warehouse module sets an [Organization Policy](https://cloud.google.com/resource-manager/docs/organization-policy/defining-locations)
that only allows the creation of resources in `us-locations`.
To deploy in other locations, update the input [trusted_locations](../../README.md#inputs) with
the appropriate locations in the call to the [main module](./main.tf#L33).

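For instance, a hypothetical EU deployment would change both the local and the Organization Policy input (illustrative values):

```hcl
locals {
  location = "europe-west1"
}

module "secured_data_warehouse" {
  # ...
  pubsub_resource_location = local.location
  location                 = local.location
  trusted_locations        = ["eu-locations"]
  # ...
}
```
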
## Usage

2 changes: 2 additions & 0 deletions examples/tutorial-standalone/main.tf
@@ -42,7 +42,9 @@ module "secured_data_warehouse" {
terraform_service_account = var.terraform_service_account
access_context_manager_policy_id = var.access_context_manager_policy_id
bucket_name = "data-ingestion"
pubsub_resource_location = local.location
location = local.location
trusted_locations = ["us-locations"]
dataset_id = local.non_confidential_dataset_id
confidential_dataset_id = local.confidential_dataset_id
cmek_keyring_name = "cmek_keyring"