Add infra for GCP #112
base: v2
Conversation
📝 Walkthrough
This pull request introduces several new files and configurations for managing infrastructure on Google Cloud Platform (GCP) using Terraform.
Actionable comments posted: 13
🧹 Outside diff range and nitpick comments (7)
infra-gcp/state.tf (1)
Lines 1-6: Enhance state storage security and configuration management.
Consider the following security and maintainability improvements for the backend configuration:
- Enable encryption for sensitive state data
- Enable versioning for state files to prevent accidental loss
- Use variables for the bucket name to support different environments
Apply these improvements:
terraform {
  backend "gcs" {
    bucket = "bhasaideploymentstorage"
    prefix = "terraform/bhasai/live/dev-k8s"
+   encryption_key = var.state_encryption_key
  }
}
Note that backend blocks cannot reference Terraform variables, so in practice the encryption key would have to be supplied literally or passed via -backend-config at terraform init time.
Also, consider adding these bucket configurations in a separate terraform configuration:
resource "google_storage_bucket" "terraform_state" {
name = var.state_bucket_name
location = var.region
force_destroy = false
versioning {
enabled = true
}
uniform_bucket_level_access = true
}
infra-gcp/inventory.ini.tftpl (2)
Lines 3-4: Remove or update the example entries
The commented example entries contain hardcoded IP addresses that might be misleading. Consider either removing them or updating them with placeholder values that clearly indicate they are examples.
-#node1 ansible_host=35.202.160.50
-#node2 ansible_host=35.202.162.100
+# Example format:
+# node1 ansible_host=<external_ip> ansible_ssh_private_key_file=<path_to_key>
Lines 1-36: Add documentation for template variables
The template uses several variables (master_nodes, worker_nodes, worker_gpu_nodes) but lacks documentation about their expected structure and required attributes.
Add a header comment section explaining the required variables:
+# Required variables:
+# - master_nodes: List of objects with attributes: name, external_ip
+# - worker_nodes: List of objects with attributes: name, external_ip
+# - worker_gpu_nodes: List of objects with attributes: name, external_ip
+
## Configure 'ip' variable to bind kubernetes services on a
infra-gcp/vm.tf (1)
Lines 1-11: Consider network segmentation for load balancer nodes.
The load balancer nodes are currently placed in the master subnet. For better security and network segmentation, consider placing them in a dedicated DMZ or frontend subnet.
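A minimal sketch of that idea, assuming the VPC is the google_compute_network.vpc defined in networking.tf and reusing the module interface shown elsewhere in this PR; the subnet name, CIDR, and the "k8s-lb" tag are illustrative assumptions rather than existing code:

# Hypothetical dedicated frontend subnet for load balancer nodes
resource "google_compute_subnetwork" "frontend" {
  name          = "${var.resource_group_name}-frontend-subnet"
  ip_cidr_range = "10.0.3.0/24"                 # illustrative; pick a range that does not overlap existing subnets
  region        = var.resource_group_location   # assumed to hold the region, as used for addresses in the VM module
  network       = google_compute_network.vpc.id
  project       = var.resource_group_name
}

# Then point the load balancer module at it instead of the master subnet:
#   subnet_id   = google_compute_subnetwork.frontend.id
#   target_tags = ["k8s-lb"]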
infra-gcp/module/vm/vm.tf (3)
Lines 35-41: Consolidate duplicate 'network_interface' blocks
There are two network_interface blocks in the instance configuration. Unless multiple network interfaces are required, consider consolidating them into a single block to simplify the configuration.
Combine the configurations as needed to maintain only one network_interface block.
Also applies to: 51-57
Lines 45-47: Parameterize the VM image for flexibility
The VM image is hardcoded. To allow flexibility in selecting different images, consider making it a variable.
Add a variable for the image:
variable "node_image" {
type = string
default = "projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240419"
}
Update the boot_disk configuration:
initialize_params {
- image = "projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240419"
+ image = var.node_image
size = var.node_disk_size
type = "pd-standard"
}
Lines 70-77: Review and complete the commented-out 'vm_ips' output block
The vm_ips output block is commented out and appears incomplete. If you intend to output VM IP information, review and complete this block.
Uncomment and adjust the output as needed:
-# output "vm_ips" {
-# value = { for vm in google_compute_instance.vm : vm.name => {
-# id = vm.id
-# public_ip = vm.
-# private_ip = google_compute_instance.vm[count.index].network_interface[0].network_ip
-# }
-# }
-# }
+output "vm_ips" {
+ value = { for vm in google_compute_instance.vm : vm.name => {
+ id = vm.id
+ public_ip = vm.network_interface[0].access_config[0].nat_ip
+ private_ip = vm.network_interface[0].network_ip
+ }
+ }
+}
Ensure that the public_ip and private_ip fields correctly reference the appropriate attributes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
- infra-gcp/.gitignore (1 hunks)
- infra-gcp/.terraform.lock.hcl (1 hunks)
- infra-gcp/README.md (1 hunks)
- infra-gcp/inventory.ini.tftpl (1 hunks)
- infra-gcp/module/vm/vm.tf (1 hunks)
- infra-gcp/networking.tf (1 hunks)
- infra-gcp/output.tf (1 hunks)
- infra-gcp/state.tf (1 hunks)
- infra-gcp/variables.tf (1 hunks)
- infra-gcp/vm.tf (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- infra-gcp/.gitignore
- infra-gcp/.terraform.lock.hcl
- infra-gcp/README.md
- infra-gcp/variables.tf
🧰 Additional context used
🪛 checkov
infra-gcp/module/vm/vm.tf
[HIGH] 28-61: Ensure 'Block Project-wide SSH keys' is enabled for VM instances
(CKV_GCP_32)
[HIGH] 28-61: Ensure VM disks for critical VMs are encrypted with Customer Supplied Encryption Keys (CSEK)
(CKV_GCP_38)
infra-gcp/networking.tf
[HIGH] 39-51: Ensure Google compute firewall ingress does not allow unrestricted ssh access
(CKV_GCP_2)
infra-gcp/state.tf
[HIGH] 15-20: Ensure oslogin is enabled for a Project
(CKV_GCP_33)
🔇 Additional comments (5)
infra-gcp/state.tf (1)
Lines 15-20: Enable OS Login and follow GCP naming conventions.
Several security and consistency improvements are needed:
- OS Login should be enabled for enhanced security (CKV_GCP_33)
- Variable naming should follow GCP conventions (project_id instead of resource_group_name)
- The placeholder comment suggests incomplete configuration
Apply these security improvements:
resource "google_compute_project_metadata" "default" {
- project = var.resource_group_name
+ project = var.project_id
metadata = {
delete_os_disk_on_deletion = "true"
+ enable-oslogin = "TRUE"
}
}
Let's verify if OS Login is consistently configured across the project:
🧰 Tools
🪛 checkov
[HIGH] 15-20: Ensure oslogin is enabled for a Project
(CKV_GCP_33)
infra-gcp/inventory.ini.tftpl (1)
Lines 1-36: Verify integration with Terraform configurations
The template assumes specific node structures and SSH key paths. Let's verify these assumptions against the Terraform configurations.
✅ Verification successful
✓ Template aligns with Terraform configuration
The inventory template is correctly structured and matches the Terraform implementation:
- The template expects master_nodes, worker_nodes, and worker_gpu_nodes, which align with the Terraform modules k8s_master, k8s_worker, and k8s_worker_gpu
- SSH key paths in the template (/home/ubuntu/.ssh/master.pem, worker.pem, worker_gpu.pem) match the paths in the commented provisioner block
- Each node type generates its own SSH key via the tls_private_key resource in the VM module
- The node naming convention in the template will match the instance names generated by the VM module (${var.resource_group_name}-${var.node_type}-${count.index})
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the required node structures and SSH configurations exist in Terraform
# Check for node definitions
echo "Checking for node definitions..."
rg -A 5 'resource\s+"google_compute_instance"' --type tf
# Check for SSH key generation or management
echo "Checking for SSH key configurations..."
rg -l 'ssh|private_key' --type tf
# Check for output variables that would feed into this template
echo "Checking for relevant output variables..."
rg 'output\s+".*_nodes"' --type tf
Length of output: 967
Script:
#!/bin/bash
# Let's examine the specific node configurations and SSH key setup
# Check VM module implementation
echo "=== VM Module Implementation ==="
cat infra-gcp/module/vm/vm.tf
# Check main VM configuration
echo -e "\n=== Main VM Configuration ==="
cat infra-gcp/vm.tf
# Check output configurations
echo -e "\n=== Output Configurations ==="
cat infra-gcp/output.tf
# Look for variables that match our template's expectations
echo -e "\n=== Variables Related to Node Types ==="
rg "variable.*(\bmaster\b|\bworker\b|\bnode_type\b)" --type tf
Length of output: 8621
infra-gcp/output.tf (2)
Lines 1-19: Verify the intention behind commented IP outputs
All VM IP outputs are commented out, which might make it difficult to interact with the infrastructure. If these are intentionally disabled for security reasons, consider:
- Using sensitive = true instead of commenting them out
- Implementing proper access controls or VPN if direct IP access is needed
- Using DNS names instead of direct IPs
Could you clarify if these outputs should remain commented out or if they need to be enabled with appropriate security controls?
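If the outputs are brought back, a minimal sketch of the sensitive variant, assuming the VM module exposes a vm_ips output as suggested above (the output name here is illustrative):

output "master_node_ips" {
  description = "IPs of the Kubernetes master nodes"
  value       = module.k8s_master.vm_ips # assumes the module's vm_ips output from the suggestion above
  sensitive   = true
}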
Lines 21-49: Security: Reconsider exposing private keys as outputs
While the private keys are marked as sensitive, exposing them through Terraform outputs poses several security risks:
- Keys are stored in the Terraform state file
- No apparent key rotation mechanism
- Increased attack surface through state file access
Consider these security improvements:
- Use Google Secret Manager or HashiCorp Vault for key management
- Implement automated key rotation
- If outputs are necessary, consider outputting key references instead of actual keys
Example using Secret Manager:
resource "google_secret_manager_secret" "vm_keys" {
secret_id = "vm-private-keys"
replication {
automatic = true
}
}
resource "google_secret_manager_secret_version" "vm_keys" {
secret = google_secret_manager_secret.vm_keys.id
secret_data = jsonencode({
master = module.k8s_master.private_key,
worker = module.k8s_worker.private_key,
# ... other keys
})
}
output "secret_name" {
value = google_secret_manager_secret.vm_keys.name
}
Let's check if there are any existing secret management solutions in the codebase:
✅ Verification successful
Let me gather more context about how these private keys are being used and if there are any existing patterns for key management in the infrastructure code.
Let me check how these private keys are being used in the VM modules to understand the complete picture.
Security: Private key outputs are required for VM SSH access but need better management
Based on the verification, these private keys are legitimately used for SSH access to the VMs, where:
- Each VM module generates an RSA key pair using tls_private_key
- The public key is attached to the VM's metadata for SSH authentication
- The private key is exposed as an output for SSH client configuration
While the current implementation follows a common pattern for VM SSH access, here are specific, actionable improvements:
- Add a lifecycle block to the tls_private_key resource with create_before_destroy = true to ensure smooth key rotation (see the sketch after this list)
- Consider implementing automated key rotation using external tooling or scheduled workflows
- Document the security implications and proper handling of these keys in the README
- Store the state file in an encrypted backend with strict access controls
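A minimal sketch of the first improvement, assuming the module's key is the tls_private_key.rsa_key referenced elsewhere in this review (the key size shown is illustrative):

resource "tls_private_key" "rsa_key" {
  algorithm = "RSA"
  rsa_bits  = 4096 # illustrative; keep whatever size the module currently uses

  lifecycle {
    create_before_destroy = true # a replacement key is created before the old one is destroyed
  }
}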
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Look for secret management configurations
echo "Checking for secret management patterns..."
rg -l "google_secret_manager|vault_secret" || echo "No secret management found"
# Check for potential security guidelines
echo -e "\nChecking for security documentation..."
fd -g "*security*.md" -g "*README*.md" -X grep -l "key|secret|sensitive" || echo "No security documentation found"
Length of output: 542
Script:
#!/bin/bash
# Check the k8s module implementations to understand key usage
echo "Checking k8s module implementations..."
fd -g "*.tf" -X rg -l "private_key"
# Look for any existing key management patterns
echo -e "\nChecking key management patterns..."
fd -g "*.tf" -X rg -A 3 "resource\s+\"tls_private_key\""
# Check if these keys are used in other places
echo -e "\nChecking references to private keys..."
rg "private_key\s*=" -A 2
Length of output: 1263
Script:
#!/bin/bash
# Check the VM module implementation details
echo "Checking VM module private key usage..."
rg -A 10 "tls_private_key" ./infra-gcp/module/vm/vm.tf
# Check if these keys are used for SSH access
echo -e "\nChecking for SSH-related configurations..."
rg -A 5 "ssh|connection" ./infra-gcp/module/vm/vm.tf
# Check if there are any key lifecycle management configurations
echo -e "\nChecking for key lifecycle configurations..."
rg "lifecycle|create_before_destroy" ./infra-gcp/module/vm/vm.tf
Length of output: 1147
infra-gcp/networking.tf (1)
Line 63: Update 'source_ranges' in 'internal_firewall' with your internal IP range
The source_ranges is currently set to a placeholder value "10.0.0.0/8" with a comment indicating that it should be replaced. Ensure that you update it to reflect your actual internal IP range to enforce the correct firewall rules.
Please verify and replace the placeholder with your specific internal IP range.
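One way to make that configurable rather than hardcoded — a sketch with an assumed variable name, not code that exists in this PR:

variable "internal_cidr_ranges" {
  description = "CIDR ranges treated as internal for node-to-node traffic"
  type        = list(string)
  default     = ["10.0.0.0/8"] # replace with your actual internal range(s)
}

# in internal_firewall:
#   source_ranges = var.internal_cidr_ranges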
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 3.0.0"
    }
  }
}
Tighten version constraints and add provider configuration.
The current provider setup has two issues:
- The version constraint >= 3.0.0 is too permissive and could lead to unexpected behavior with major version updates
- Missing provider configuration block with essential GCP settings
Add a more specific version constraint and provider configuration:
terraform {
required_providers {
google = {
source = "hashicorp/google"
- version = ">= 3.0.0"
+ version = "~> 6.9.0"
}
}
}
+provider "google" {
+ project = var.project_id
+ region = var.region
+ zone = var.zone
+}
Committable suggestion was skipped due to low confidence.
[k8s_cluster:children]
kube_node
kube_control_plane
🛠️ Refactor suggestion
Consider adding Ansible variables section
The inventory could benefit from a variables section to define cluster-wide configurations.
Add a variables section at the end of the file:
[k8s_cluster:children]
kube_node
kube_control_plane
+[all:vars]
+ansible_python_interpreter=/usr/bin/python3
+ansible_ssh_common_args='-o StrictHostKeyChecking=no'
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
[k8s_cluster:children]
kube_node
kube_control_plane

[all:vars]
ansible_python_interpreter=/usr/bin/python3
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
%{ for nodes in master_nodes ~}
${nodes.name} ansible_host=${nodes.external_ip} ansible_ssh_private_key_file=/home/ubuntu/.ssh/master.pem
%{ endfor ~}
%{ for nodes in worker_nodes ~}
${nodes.name} ansible_host=${nodes.external_ip} ansible_ssh_private_key_file=/home/ubuntu/.ssh/worker.pem
%{ endfor ~}
%{ for nodes in worker_gpu_nodes ~}
${nodes.name} ansible_host=${nodes.external_ip} ansible_ssh_private_key_file=/home/ubuntu/.ssh/worker_gpu.pem
%{ endfor ~}
Improve SSH key management security
The current implementation has several security considerations:
- Hardcoded paths to SSH keys in /home/ubuntu/.ssh/
- Different key files for different node types increase management complexity
- No SSH user is specified, so it defaults to the current user
Consider these improvements:
- Use variables for SSH paths to make them configurable
- Consider using a single SSH key for all nodes
- Explicitly specify the SSH user
%{ for nodes in master_nodes ~}
-${nodes.name} ansible_host=${nodes.external_ip} ansible_ssh_private_key_file=/home/ubuntu/.ssh/master.pem
+${nodes.name} ansible_host=${nodes.external_ip} ansible_user=${ssh_user} ansible_ssh_private_key_file=${ssh_private_key_path}
%{ endfor ~}
Committable suggestion was skipped due to low confidence.
[kube_control_plane]
%{ for nodes in master_nodes ~}
${nodes.name}
%{ endfor ~}

[etcd]
%{ for nodes in master_nodes ~}
${nodes.name}
%{ endfor ~}

[kube_node]
%{ for nodes in worker_nodes ~}
${nodes.name}
%{ endfor ~}
%{ for nodes in worker_gpu_nodes ~}
${nodes.name}
%{ endfor ~}
💡 Codebase verification
Control plane configuration requires attention
Based on the infrastructure code, the current setup has only 1 master node (k8s_master_node_count = "1"), which means there's no high availability for the control plane and etcd. For production environments, it's recommended to have at least 3 master nodes to ensure high availability and fault tolerance (see the example after the key findings).
Key findings:
- Single master node configuration (k8s_master_node_count = "1")
- Same node serves both control plane and etcd roles
- No redundancy for critical cluster components
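For example, a production override could look like the following (the variable name comes from this PR's configuration; the value of 3 is only a recommendation):

# terraform.tfvars
k8s_master_node_count = "3" # odd number of control-plane/etcd nodes to keep quorum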
🔗 Analysis chain
Verify Kubernetes control plane configuration
The current setup places all master nodes in both kube_control_plane and etcd groups. While this is a common pattern, ensure this aligns with your high-availability requirements.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if there are related Terraform configurations that specify the number of control plane nodes
# and their high availability setup.
# Look for control plane node count variables or locals
rg -l 'master_nodes|control_plane' --type tf
# Look for etcd-related configurations
rg -l 'etcd' --type tf
Length of output: 97
Script:
#!/bin/bash
# Let's examine the content of the VM configuration files to understand the control plane setup
# Check infra-gcp/vm.tf for master node configuration
echo "=== Content of infra-gcp/vm.tf ==="
cat infra-gcp/vm.tf
# Check infra/vm.tf as well
echo -e "\n=== Content of infra/vm.tf ==="
cat infra/vm.tf
# Also check for any variables files that might define the number of nodes
echo -e "\n=== Checking variables files ==="
fd "variables.tf" --type f
Length of output: 9995
Script:
#!/bin/bash
# Let's check the variables files to see the control plane node count configuration
echo "=== Content of infra-gcp/variables.tf ==="
cat infra-gcp/variables.tf
echo -e "\n=== Content of infra/variables.tf ==="
cat infra/variables.tf
Length of output: 2137
Script:
#!/bin/bash
# Let's check for any terraform.tfvars files that might define the actual values
echo "=== Looking for tfvars files ==="
fd ".*\.tfvars$" --type f
echo -e "\n=== Content of any found tfvars files ==="
fd ".*\.tfvars$" --type f --exec cat {}
Length of output: 1310
  node_disk_size          = var.k8s_worker_disk_size
  resource_group_name     = var.resource_group_name
  resource_group_location = var.resource_group_location
  target_tags             = ["k8s-master"]
Fix incorrect target tags for worker nodes.
All worker nodes are incorrectly using the "k8s-master" target tag. This could lead to security issues as firewall rules and network policies intended for master nodes would affect workers. Each node type should have its appropriate tag:
Apply these changes:
- target_tags = ["k8s-master"]
+ target_tags = ["k8s-worker"]
- target_tags = ["k8s-master"]
+ target_tags = ["k8s-worker-gpu"]
- target_tags = ["k8s-master"]
+ target_tags = ["k8s-stateful"]
Also applies to: 46-46, 58-58
module "k8s_admin" {
  source                  = "./module/vm"
  node_type               = "admin"
  number_of_nodes         = 1
  subnet_id               = google_compute_subnetwork.master.id
  node_vm_size            = var.k8s_admin_node_size
  node_disk_size          = var.k8s_admin_disk_size
  resource_group_name     = var.resource_group_name
  resource_group_location = var.resource_group_location
  node_public_ip          = true
  target_tags             = ["k8s-admin"] # Add tag for admin
}
💡 Codebase verification
Security risk: Admin node exposed to unrestricted SSH access
The verification confirms critical security concerns:
- The admin firewall rule (admin_firewall) allows SSH access (port 22) from any IP (0.0.0.0/0)
- The admin node has a public IP and is placed in the master subnet
- These configurations create an unnecessarily broad attack surface
Required changes:
- Replace source_ranges = ["0.0.0.0/0"] with a variable for allowed CIDR ranges
- Consider moving the admin node to a dedicated management subnet for better network segmentation
🔗 Analysis chain
Review security implications of admin node configuration.
The admin node:
- Has a public IP in the master subnet, which could increase the attack surface
- Should have additional security measures like restricted IP ranges for SSH access
Consider:
- Moving to a dedicated management subnet
- Adding var.allowed_ssh_ranges to restrict access (a minimal sketch follows below)
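A minimal sketch of that variable; the name and default are assumptions rather than existing code:

variable "allowed_ssh_ranges" {
  description = "CIDR ranges allowed to reach the admin node over SSH"
  type        = list(string)
  default     = [] # e.g. ["203.0.113.0/24"] for an office or VPN egress range
}

# in admin_firewall:
#   source_ranges = var.allowed_ssh_ranges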
Let's check if any firewall rules are defined for the admin node:
Let me gather more information about the firewall rules to understand the security configuration better.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for firewall rules related to admin nodes
rg -A 5 'k8s-admin' --type tf
Length of output: 674
Script:
#!/bin/bash
# Get the full firewall rule definitions
rg -B 5 -A 15 'google_compute_firewall.*admin' --type tf
Length of output: 1182
output "private_key" {
  sensitive = true
  value     = tls_private_key.rsa_key.private_key_openssh
}
Avoid outputting the private SSH key
Outputting the private key via Terraform outputs, even if marked as sensitive, poses a security risk.
Consider removing the private_key output and securely managing SSH keys outside of Terraform outputs.
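One way to avoid Terraform-managed SSH keys entirely is to lean on OS Login, which the earlier state.tf comment also recommends; this is a sketch only, since enabling it project-wide affects every instance in the project:

resource "google_compute_project_metadata_item" "oslogin" {
  project = var.resource_group_name # project reference used elsewhere in this PR
  key     = "enable-oslogin"
  value   = "TRUE"
}

With OS Login, SSH access is governed by IAM roles and users' own registered keys, so no private key has to pass through Terraform outputs or state.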
variable "resource_group_name" {}
variable "resource_group_location" {}
🛠️ Refactor suggestion
Use GCP-specific terminology for variables and resources
The variables resource_group_name and resource_group_location are more commonly associated with Azure. In GCP, consider using project_id and zone to align with standard terminology.
Apply this diff to rename variables and references:
-variable "resource_group_name" {}
-variable "resource_group_location" {}
+variable "project_id" {}
+variable "zone" {}
---
- name = "${var.resource_group_name}-${var.node_type}-${count.index}-ip"
- region = var.resource_group_location
- project = var.resource_group_name
+ name = "${var.project_id}-${var.node_type}-${count.index}-ip"
+ region = var.region
+ project = var.project_id
---
- name = "${var.resource_group_name}-${var.node_type}-${count.index}"
- zone = var.resource_group_location
- project = var.resource_group_name
+ name = "${var.project_id}-${var.node_type}-${count.index}"
+ zone = var.zone
+ project = var.project_id
Ensure to update all references to these variables throughout the configuration.
Also applies to: 31-32, 34-34
resource "google_compute_instance" "vm" {
  count          = var.number_of_nodes
  name           = "${var.resource_group_name}-${var.node_type}-${count.index}"
  zone           = var.resource_group_location
  machine_type   = var.node_vm_size
  can_ip_forward = true
  project        = var.resource_group_name

  network_interface {
    subnetwork = var.subnet_id
    access_config {
      # Assign a public IP if required
      nat_ip = var.node_public_ip ? google_compute_address.public_ip[count.index].address : null
    }
  }

  boot_disk {
    initialize_params {
      image = "projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240419"
      size  = var.node_disk_size
      type  = "pd-standard"
    }
  }

  network_interface {
    network = "default" # Replace with your network name or variable if you have a custom network
    access_config {
      nat_ip = var.node_public_ip ? google_compute_address.public_ip[count.index].address : null
    }
  }

  metadata = {
    ssh-keys = "ubuntu:${tls_private_key.rsa_key.public_key_openssh}"
  }
}
🛠️ Refactor suggestion
Consider encrypting VM disks with Customer Supplied Encryption Keys
For enhanced data security, consider encrypting VM disks using Customer Supplied Encryption Keys (CSEK).
Introduce a variable for the KMS key:
variable "kms_key_self_link" {
type = string
default = null
}
Update the boot_disk configuration:
boot_disk {
initialize_params {
image = var.node_image
size = var.node_disk_size
type = "pd-standard"
}
+ disk_encryption_key {
+ kms_key_self_link = var.kms_key_self_link
+ }
}
Ensure to provide the appropriate KMS key when deploying the infrastructure.
🧰 Tools
🪛 checkov
[HIGH] 28-61: Ensure 'Block Project-wide SSH keys' is enabled for VM instances
(CKV_GCP_32)
[HIGH] 28-61: Ensure VM disks for critical VMs are encrypted with Customer Supplied Encryption Keys (CSEK)
(CKV_GCP_38)
Add 'block-project-ssh-keys' to enhance instance security
To prevent the use of project-wide SSH keys and enhance security, add block-project-ssh-keys = "true" to the instance metadata.
Apply this diff to update the metadata:
metadata = {
ssh-keys = "ubuntu:${tls_private_key.rsa_key.public_key_openssh}"
+ block-project-ssh-keys = "true"
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
resource "google_compute_instance" "vm" {
  count          = var.number_of_nodes
  name           = "${var.resource_group_name}-${var.node_type}-${count.index}"
  zone           = var.resource_group_location
  machine_type   = var.node_vm_size
  can_ip_forward = true
  project        = var.resource_group_name

  network_interface {
    subnetwork = var.subnet_id
    access_config {
      # Assign a public IP if required
      nat_ip = var.node_public_ip ? google_compute_address.public_ip[count.index].address : null
    }
  }

  boot_disk {
    initialize_params {
      image = "projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240419"
      size  = var.node_disk_size
      type  = "pd-standard"
    }
  }

  network_interface {
    network = "default" # Replace with your network name or variable if you have a custom network
    access_config {
      nat_ip = var.node_public_ip ? google_compute_address.public_ip[count.index].address : null
    }
  }

  metadata = {
    ssh-keys               = "ubuntu:${tls_private_key.rsa_key.public_key_openssh}"
    block-project-ssh-keys = "true"
  }
}
resource "google_compute_firewall" "admin_firewall" {
  name    = "${var.resource_group_name}-admin-fw"
  network = google_compute_network.vpc.name
  project = var.resource_group_name

  allow {
    protocol = "tcp"
    ports    = ["22"] # Allow SSH
  }

  source_ranges = ["0.0.0.0/0"] # Allow from anywhere
  target_tags   = ["k8s-admin"] # Tag for admin nodes
}
Restrict SSH access in 'admin_firewall' to trusted IP ranges
Allowing SSH access from any IP address (0.0.0.0/0) poses a significant security risk. It's recommended to restrict the source_ranges to specific trusted IP addresses or networks to enhance security.
Consider updating the source_ranges to include only the IP ranges that require SSH access. For example:
resource "google_compute_firewall" "admin_firewall" {
name = "${var.resource_group_name}-admin-fw"
network = google_compute_network.vpc.name
project = var.resource_group_name
allow {
protocol = "tcp"
ports = ["22"] # Allow SSH
}
- source_ranges = ["0.0.0.0/0"] # Allow from anywhere
+ source_ranges = ["<trusted_ip_range>"] # Restrict to trusted IPs
target_tags = ["k8s-admin"] # Tag for admin nodes
}
Committable suggestion was skipped due to low confidence.
🧰 Tools
🪛 checkov
[HIGH] 39-51: Ensure Google compute firewall ingress does not allow unrestricted ssh access
(CKV_GCP_2)
Summary by CodeRabbit
Release Notes
New Features
- .gitignore file to manage ignored files in the infra-gcp directory.
- .terraform.lock.hcl file to ensure consistent provider versions across environments.
- README.md with a guide for updating infrastructure using GCP Kubernetes Engine.
Documentation
- README.md with step-by-step instructions for infrastructure updates.
Chores
- variables.tf file to define configuration variables for GCP infrastructure.