propose adr 0003 for workspace blob caching #1010

Merged (5 commits) on Oct 11, 2023
55 changes: 55 additions & 0 deletions docs/adr/0003-workspace-blob-caching.md
@@ -0,0 +1,55 @@
# 3. Workspace BLOB Caching

* Status: [ **proposed** | rejected | accepted | deprecated ]
* Date: 2023-09-20
* Authors: @chanwit
* Deciders: TBD

## Context

The TF-Controller is being enhanced to address the resource deletion problem
more efficiently using the contents of generated Workspace BLOBs.
This ensures that Terraform finalization procedures are streamlined and efficient.
Currently, the TF-Controller downloads a Source BLOB and pushes it to a tf-runner.
The tf-runner then processes this BLOB to create a Workspace file system.
The tf-runner generates a backend configuration file, variable files, and other necessary files
for the Workspace file system. This newly created Workspace file system is then compressed,
sent back to the TF-Controller, and stored as a Workspace BLOB in the controller's storage.
A clear caching mechanism for these BLOBs is essential to ensure efficiency, security,
and ease of access.
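
The compression step described above can be sketched as follows. This is a minimal illustration in Python, not the controller's actual code: the function names `create_workspace_blob` and `list_blob_members` are hypothetical, and the file names in the demo are placeholders. It only shows a workspace directory being packed into a tar.gz byte string and inspected again.

```python
import io
import posixpath
import tarfile
import tempfile
from pathlib import Path


def create_workspace_blob(workspace_dir: Path) -> bytes:
    """Pack a workspace directory into an in-memory tar.gz and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # arcname="." stores member paths relative to the workspace root.
        tar.add(workspace_dir, arcname=".")
    return buf.getvalue()


def list_blob_members(blob: bytes) -> list[str]:
    """Return the sorted, normalized names of regular files inside a blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return sorted(
            posixpath.normpath(m.name) for m in tar.getmembers() if m.isfile()
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        # Placeholder files standing in for a generated workspace.
        (ws / "main.tf").write_text("# terraform config\n")
        (ws / "backend_override.tf").write_text("# generated backend\n")
        print(list_blob_members(create_workspace_blob(ws)))
```

Building the archive in memory mirrors the "retrieved as a byte array" wording; a real implementation might prefer streaming for large workspaces.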

## Decision

1. **BLOB Creation and Storage**
* A gRPC function named `CreateWorkspaceBlob` will be invoked by the TF-Controller
to compress the Workspace file system into a tar.gz format, which is then retrieved
as a byte array.
* The caching mechanism will be executed right before the Terraform Initialization step, ensuring that the latest and most relevant data is used.
* Each Workspace BLOB will be cached on the TF-Controller's local disk, using the UUID of the Terraform object as the filename, `${uuid}.tar.gz`.
* To prevent unauthorized access to cache entries and to avoid cache collisions, the cache file will be deleted after the finalization process is complete.
2. **Persistence**
* The persistence mechanism used by the Source Controller will be adopted for the TF-Controller's persistence volume.
3. **BLOB Encryption**
* Encryption and decryption of the BLOBs will be delegated to the runner; the controller is solely responsible for storing the encrypted BLOBs.
* Each namespace will require a service account, preferably named "tf-runner".
* The token of this service account, which is natively supported by Kubernetes, will serve as the most appropriate encryption key.
Review comment (Contributor): Why is it the most appropriate key?

4. **Security Measures (Based on STRIDE Analysis)**
* **Spoofing:** Implement Kubernetes RBAC for access restrictions and use mutual authentication for gRPC communications.
* **Tampering:** Use checksums for integrity verification and 0600 permissions so that only the owning user can read or write cached files on local disk.
* **Repudiation:** Ensure strong logging and auditing mechanisms for tracking activities.
* **Information Disclosure:** Utilize robust encryption algorithms, rotate encryption keys periodically, and secure service account tokens.
* **Denial of Service:** Monitor storage space and automate cleanup processes.
* **Elevation of Privilege:** Minimize permissions associated with service account tokens.
5. **First MVP & Future Planning**
* For the initial MVP, the default pod local volume will be used.
* Because a controller restart will erase the BLOB cache, the MVP cannot guarantee cache availability across restarts. Persistent volumes should be considered for subsequent versions.
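
The cache lifecycle in item 1 (store under the object's UUID, restrict access, delete after finalization) might look roughly like the sketch below. It is written in Python for brevity and makes several assumptions: the class name `WorkspaceBlobCache` and its methods are hypothetical, and the cache directory is whatever volume the controller mounts. It is not the controller's implementation.

```python
import os
import tempfile
import uuid
from pathlib import Path
from typing import Optional


class WorkspaceBlobCache:
    """Hypothetical on-disk cache keyed by the Terraform object's UUID."""

    def __init__(self, cache_dir: Path) -> None:
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def path_for(self, object_uid: str) -> Path:
        # Parsing as a UUID rejects values such as "../x" that could
        # otherwise escape the cache directory.
        return self.cache_dir / f"{uuid.UUID(object_uid)}.tar.gz"

    def put(self, object_uid: str, blob: bytes) -> Path:
        path = self.path_for(object_uid)
        # A newly created file gets mode 0600: owner read/write only.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "wb") as f:
            f.write(blob)
        return path

    def get(self, object_uid: str) -> Optional[bytes]:
        path = self.path_for(object_uid)
        return path.read_bytes() if path.exists() else None

    def delete(self, object_uid: str) -> None:
        # Intended to run once finalization completes; a missing file is fine.
        self.path_for(object_uid).unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = WorkspaceBlobCache(Path(tmp) / "blob-cache")
        uid = str(uuid.uuid4())
        cache.put(uid, b"example blob bytes")
        print(cache.get(uid) is not None)
        cache.delete(uid)
        print(cache.get(uid) is None)
```

Because the default pod local volume is used for the MVP, anything written by `put` is lost on a controller restart, so callers would need to treat a `None` from `get` as a normal cache miss.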

## Consequence

1. With the implementation of this architecture:
* BLOB management in TF-Controller will be optimized, leading to a more efficient and streamlined Terraform finalization process.
* Security measures will ensure the safety of the stored BLOBs, minimizing potential threats.
2. Using the default pod local volume might limit storage capacity and risks data loss upon controller restart. This warrants considering persistent volumes in future versions.
3. Encryption and security measures will demand regular maintenance and monitoring, especially concerning key rotations and integrity checks.
4. Given the complexity of this setup, robust documentation, including troubleshooting and recovery procedures, will be needed.
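
To make the maintenance cost of key rotation and integrity checks concrete, here is a small Python sketch. It rests on assumptions the ADR does not state: it swaps the plain checksum for a keyed HMAC-SHA256 tag, derives a fixed-length key from the service account token with PBKDF2, and uses hypothetical function names and a placeholder salt. It is an illustration only, not a recommendation of specific algorithms or parameters.

```python
import hashlib
import hmac


def derive_key(token: bytes, salt: bytes) -> bytes:
    """Stretch a service account token into a 32-byte key (assumed scheme)."""
    return hashlib.pbkdf2_hmac("sha256", token, salt, 100_000, dklen=32)


def blob_tag(key: bytes, blob: bytes) -> str:
    """Compute a keyed integrity tag over the blob bytes."""
    return hmac.new(key, blob, hashlib.sha256).hexdigest()


def verify_blob(key: bytes, blob: bytes, expected_tag: str) -> bool:
    """Check the tag in constant time; False means tampering or a changed key."""
    return hmac.compare_digest(blob_tag(key, blob), expected_tag)


if __name__ == "__main__":
    salt = b"placeholder-salt"  # hypothetical; a real salt would be per-deployment
    old_key = derive_key(b"old-token", salt)
    new_key = derive_key(b"rotated-token", salt)
    blob = b"workspace blob bytes"
    tag = blob_tag(old_key, blob)
    print(verify_blob(old_key, blob, tag))
    print(verify_blob(new_key, blob, tag))
```

Under this scheme, a tag computed before a token rotation no longer verifies afterwards, so cached BLOBs would have to be re-tagged or regenerated whenever the token changes.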