From 91e4aa5b718fc814564541b013eb50bb70ba416e Mon Sep 17 00:00:00 2001 From: Eric Lipe Date: Tue, 1 Oct 2024 14:25:34 -0400 Subject: [PATCH 1/5] - Moved docs to new dir - Added connector app to support restoring DBs --- .../db-upgrade}/cloud-foundry-db-upgrade.md | 60 +++--- tdrs-backend/db-upgrade/manifest.yml | 12 ++ .../new-cloud-foundry-db-upgrade.md | 198 ++++++++++++++++++ 3 files changed, 241 insertions(+), 29 deletions(-) rename {docs/Technical-Documentation => tdrs-backend/db-upgrade}/cloud-foundry-db-upgrade.md (78%) create mode 100644 tdrs-backend/db-upgrade/manifest.yml create mode 100644 tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md diff --git a/docs/Technical-Documentation/cloud-foundry-db-upgrade.md b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md similarity index 78% rename from docs/Technical-Documentation/cloud-foundry-db-upgrade.md rename to tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md index 466b562f6..6fd95eb9e 100644 --- a/docs/Technical-Documentation/cloud-foundry-db-upgrade.md +++ b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md @@ -2,35 +2,37 @@ ## Process -If you are performing this process for the staging or production, you need to ensure you are performing the changes through the [HHS](https://github.com/HHS/TANF-app) repo and not the [Raft](https://github.com/raft-tech/TANF-app) repo. +If you are performing this process for the staging or production, you need to ensure you are performing the changes through the [HHS](https://github.com/HHS/TANF-app) repo and not the [Raft](https://github.com/raft-tech/TANF-app) repo. You also need to have the postgres client binaries installed on your local machine.
-### 1. SSH into a backend app in your desired environment -```bash -cf ssh tdp-backend- +### 1. Open an SSH tunnel to the service +To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. ``` -
- -### 2. Create a backup of all the databases in the ENV's RDS instance -Note: you can get the required field values from `VCAP_SERVICES`. -```bash -/home/vcap/deps/0/apt/usr/lib/postgresql//bin/pg_dump -h -p -d -U -F c --no-acl --no-owner -f .pg +cf connect-to-service --no-client ``` -
- -### 3. Copy the backup(s) to your local machine -Note: This assumes you ran the backup command above in the home directory of the app. As an added bonus for later steps, you should execute this command from somewhere within `tdrs-backend` directory! Make sure not to commit the files/directories that are copied to your local directory. -```bash -cf ssh tdp-backend-- -c 'tar cfz - ~/app/*.pg' | tar xfz - -C . +You should see output similar to: +``` +Finding the service instance details... +Setting up SSH tunnel... +SSH tunnel created. +Skipping call to client CLI. Connection information: + +Host: localhost +Port: 63634 +Username: +Password: +Name: + +Leave this terminal open while you want to use the SSH tunnel. Press Control-C to stop. ```
-### 4. Verify backup file size(s) match the backup size(s) in the app -```bash -ls -lh /home/vcap/app +### 2. Create a backup of the database(s) in the RDS instance +Note: the , , , and are the values you received from the output of the SSH tunnel. The parameter is the name of the DB you want to export, e.g `tdp_db_raft`. You will need to run this command for each DB in the instance. ``` -As an added verification step, you should consider restoring the backups into a local server and verifying the contents with `psql` or `pgAdmin`. -

+pg_dump -h -p -d -U -F c --no-acl --no-owner -f .pg +``` +
### 5. Update the `version` key in the `json_params` item in the `database` resource in the `main.tf` file in the environment(s) you're upgrading with the new database server version ```yaml @@ -57,7 +59,7 @@ Follow the instuctions in the `terraform/README.md` and proceed from there. Modi

### 9. Bind backend to the new RDS instance to get credentials -```bash +``` cf bind-service tdp-backend- tdp-db- ``` Be sure to re-stage the app when prompted @@ -65,37 +67,37 @@ Be sure to re-stage the app when prompted ### 10. Apply the backend manifest to begin the restore process If you copied the backups as mentioned in the note from step 3, the backups will be copied for you to the app instance in the command below. If not, you will need to use `scp` to copy the backups to the app instance after running the command below. -```bash +``` cf push tdp-backend- --no-route -f manifest.buildpack.yml -t 180 --strategy rolling ```
### 11. SSH into the app you just pushed -```bash +``` cf ssh tdp-backend- ```
### 12. Create the appropriate database(s) in the new RDS server Note: you can get the required field values from `VCAP_SERVICES`. -```bash +``` /home/vcap/deps/0/apt/usr/lib/postgresql//bin/createdb -U -h ```
### 13. Restore the backup(s) to the appropriate database(s) Note: you can get the required field values from `VCAP_SERVICES`. -```bash +``` /home/vcap/deps/0/apt/usr/lib/postgresql//bin/pg_restore -p -h -U -d .pg ``` During this step, you may see errors similar to the message below. Note `` is imputed in the message to avoid leaking environment specific usernames/roles. -```bash +``` pg_restore: from TOC entry 215; 1259 17313 SEQUENCE users_user_user_permissions_id_seq pg_restore: error: could not execute query: ERROR: role "" does not exist Command was: ALTER TABLE public.users_user_user_permissions_id_seq OWNER TO ; ``` and the result and total amount of these errors should be: -```bash +``` pg_restore: warning: errors ignored on restore: 68 ``` If this is what you see, everything is OK. This happens because the `pg_dump` doesn't remove owner associations on sequences for some reason. But you will see in the blocks above that `pg_restore` correctly alters the sequence owner to the new database user. @@ -103,7 +105,7 @@ If this is what you see, everything is OK. This happens because the `pg_dump` do ### 14. Use `psql` to get into the database to check state Note: you can get the required field values from `VCAP_SERVICES`. -```bash +``` /home/vcap/deps/0/apt/usr/lib/postgresql//bin/psql ```
diff --git a/tdrs-backend/db-upgrade/manifest.yml b/tdrs-backend/db-upgrade/manifest.yml new file mode 100644 index 000000000..33f655e96 --- /dev/null +++ b/tdrs-backend/db-upgrade/manifest.yml @@ -0,0 +1,12 @@ +version: 1 +applications: +- name: db-connector + instances: 1 + memory: 512M + disk_quota: 2G + env: + POSTGRES_PASSWORD: password + docker: + image: postgres:15.7-alpine3.20 + services: + - diff --git a/tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md b/tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md new file mode 100644 index 000000000..13b794281 --- /dev/null +++ b/tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md @@ -0,0 +1,198 @@ +# Cloud Foundry, Cloud.gov AWS RDS Database Upgrade + +## Process + +If you are performing this process for the staging or production, you need to ensure you are performing the changes through the [HHS](https://github.com/HHS/TANF-app) repo and not the [Raft](https://github.com/raft-tech/TANF-app) repo. You also need to have the postgres client binaries installed on your local machine. + +### 1. Open an SSH tunnel to the service +To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. Keep this tunnel open in a separate terminal window until this process is complete! + +``` +cf connect-to-service --no-client +``` + +You should see output similar to: + +``` +Finding the service instance details... +Setting up SSH tunnel... +SSH tunnel created. +Skipping call to client CLI. Connection information: + +Host: localhost +Port: 63634 +Username: +Password: +Name: + +Leave this terminal open while you want to use the SSH tunnel. Press Control-C to stop. +``` + +### 2. Create a backup of the database(s) in the RDS instance +In a separate terminal from your SSH tunnel terminal, generate the `pg_dump` files. +Note: the , , , and are the values you received from the output of the SSH tunnel. 
The parameter is the name of the DB you want to export, e.g `tdp_db_raft`. You will need to run this command for each DB in the instance. + +``` +pg_dump -h -p -d -U -F c --no-acl --no-owner -f .pg +``` + +After the command finishes, you should see .pg in your current working directory. Do some sanity checks on this backup file to assert it makes sense. Now that we have our backup(s), we need to begin making the Terraform changes required to support the upgrade. +
+ +### 3. Update Terraform to create a new RDS instance +Follow the instructions in the `terraform/README.md` to get Terraform configured. Modify the `main.tf` file in the `terraform/` directory to include a new RDS instance. For example, if you were updating `prod` to version 15.x, you would add the following code to the `main.tf` file. We are NOT removing the existing `resource "cloudfoundry_service_instance" "database"` from the `main.tf` file. Note that the resource and the `name` of the new RDS instance are not the same as the original resource name and RDS name. This is on purpose and we will remedy this in later steps. + +```yaml +resource "cloudfoundry_service_instance" "new-database" { + name = "tdp-db-prod-new" + space = data.cloudfoundry_space.space.id + service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] + json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" + recursive_delete = true + timeouts { + create = "60m" + update = "60m" + delete = "2h" + } +} +``` +After adding the new RDS resource to `main.tf`, you can follow the rest of the instructions in the `terraform/README.md` to plan and then apply this change with Terraform. + +### 4. Bind an app to the new RDS instance +In the `tdrs-backend/db-upgrade` directory, open the `manifest.yml` file and update the `services` block to reference the new RDS service you just created; in the example, this would be `- tdp-db-prod-new`. Then deploy this manifest: `cf push --no-route -f manifest.yml -t 180`. Wait for the connector app to deploy. We need to deploy a temporary app to avoid too much downtime for the backend app(s) and so that we can start a new SSH tunnel to the new RDS instance. You should now close the original SSH tunnel we opened in step 1. + +### 5. Open an SSH tunnel to the new RDS instance +Again, in a separate terminal execute the following command and leave that terminal/connection alive until further notice. 
+``` +cf connect-to-service --no-client db-connector +``` + +### 6. Create the appropriate database(s) in the new RDS server +Using the credentials from the new SSH tunnel, create the same DB(s) you dumped in the new RDS instance. +``` +createdb -U -h -p +``` + +### 7. Restore the backup(s) to the appropriate database(s) +Using the credentials from the new SSH tunnel, restore the backups to the appropriate DBs. +``` +pg_restore -p -h -U -d .pg +``` + +During this step, you may see errors similar to the message below. Note `` is imputed in the message to avoid leaking environment specific usernames/roles. + +``` +pg_restore: from TOC entry 215; 1259 17313 SEQUENCE users_user_user_permissions_id_seq +pg_restore: error: could not execute query: ERROR: role "" does not exist +Command was: ALTER TABLE public.users_user_user_permissions_id_seq OWNER TO ; +``` + +and the result and total amount of these errors should be something like: + +``` +pg_restore: warning: errors ignored on restore: 68 +``` + +If this is what you see, everything is OK. This happens because the `pg_dump` doesn't remove owner associations on sequences for some reason. But you will see in the blocks above that `pg_restore` correctly alters the sequence owner to the new database user. + +### 8. Use `psql` to get into the database(s) to check state +Using the credentials from the new SSH tunnel, use the psql cli to inspect the restored DBs. +``` +psql -p -h -U -d +``` +
+ +### 9. Rename and Move RDS instances +Now that we have verified that the data in our new RDS instance looks good, we need to lift and shift the backend app(s) to point to our new RDS instance as if it is the existing (now old) RDS instance. + +First we need to unbind the existing RDS instance from the backend app(s) so that we can make name changes. +``` +cf unbind-service +``` + +After unbinding the service we want to update the "old RDS" service `name` to something different, plan, and then apply those changes with Terraform. +```yaml +resource "cloudfoundry_service_instance" "database" { + name = "something-that-isnt-tdp-db-prod" + space = data.cloudfoundry_space.space.id + service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] + json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" + recursive_delete = true + timeouts { + create = "60m" + update = "60m" + delete = "2h" + } +} +``` + +Now we can rename our "new RDS" service to the expected `name`. Then we can also plan and apply those changes with Terraform. + +```yaml +resource "cloudfoundry_service_instance" "new-database" { + name = "tdp-db-prod" + space = data.cloudfoundry_space.space.id + service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] + json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" + recursive_delete = true + timeouts { + create = "60m" + update = "60m" + delete = "2h" + } +} +``` + +Now we will bind the new RDS service back to the backend app(s) and restage it. Be sure to monitor the app's logs to ensure it connects to the instance. + +``` +cf bind-service +``` + +Then + +``` +cf restage +``` + +If the backend app is running with no issues, we can now safely remove the "old RDS" service from Terraform. Remove the entire resource block named `database` from `main.tf`, re-plan, and then apply the changes to remove that instance with Terraform. 
+ +Finally, to get our Terraform state looking like it originally did, we want to rename our `new-database` resource back to `database`. That way we are consistent. To do so we rename the resource, and to prevent Terraform from deleting it (since `database` won't exist in the state) we want to inform Terraform that we have "moved" the resource. We do so by adding the following code to the `main.tf`. Note, when running `terraform plan ...` it will not show any infrastructure changes, only a name change. Ensure you still apply even if it looks like there are no changes! + +```yaml +moved { + from = cloudfoundry_service_instance.new-database + to = cloudfoundry_service_instance.database +} +``` + +After adding the above code, re-plan and apply the changes with Terraform. Once Terraform has successfully applied the change, remove the `moved` block from `main.tf`. Re-plan with Terraform and assert it agrees that there are no changes to be made. If Terraform reports changes, you have made a mistake and need to figure out where you made the mistake. + +### 10. Access the re-staged app(s) and run a smoke test +- Log in +- Submit a few datafiles +- Make sure new and existing submission histories populate correctly +- Check out the DACs data + +If everything looks good, there is nothing to do. If apps aren't working/connecting to the new RDS instance, you will need to debug manually and determine if/where you made a mistake. + +### 11. Update the `postgresql-client` version to the new version in `tdrs-backend/apt.yml` +```yaml +- postgresql-client- +``` +Note: if the underlying OS for CloudFoundry is no longer `cflinuxfs4` (code name `jammy`) you may also need to update the repo we point to for the postgres client binaries. + +### 12. Update the postgres container version in `tdrs-backend/docker-compose.yml` +```yaml +postgres: +image: postgres: +``` + +### 13. Commit and push correct changes, revert unnecessary changes. 
+Commit and push the changes for: +- `main.tf` +- `tdrs-backend/apt.yml` +- `tdrs-backend/docker-compose.yml` + +Revert the changes for: +- `manifest.yml` From 2c83c9601716615591b0b4ab624dba87da31a8a9 Mon Sep 17 00:00:00 2001 From: Eric Lipe Date: Wed, 2 Oct 2024 12:06:29 -0400 Subject: [PATCH 2/5] - remove unnecessary document --- .../db-upgrade/cloud-foundry-db-upgrade.md | 184 +++++++++++----- .../new-cloud-foundry-db-upgrade.md | 198 ------------------ 2 files changed, 130 insertions(+), 252 deletions(-) delete mode 100644 tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md diff --git a/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md index 6fd95eb9e..13b794281 100644 --- a/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md +++ b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md @@ -3,14 +3,16 @@ ## Process If you are performing this process for the staging or production, you need to ensure you are performing the changes through the [HHS](https://github.com/HHS/TANF-app) repo and not the [Raft](https://github.com/raft-tech/TANF-app) repo. You also need to have the postgres client binaries installed on your local machine. -
### 1. Open an SSH tunnel to the service -To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. +To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. Keep this tunnel open in a separate terminal window until this process is complete! + ``` cf connect-to-service --no-client ``` + You should see output similar to: + ``` Finding the service instance details... Setting up SSH tunnel... @@ -25,98 +27,172 @@ Name: Leave this terminal open while you want to use the SSH tunnel. Press Control-C to stop. ``` -
### 2. Create a backup of the database(s) in the RDS instance +In a separate terminal from your SSH tunnel terminal, generate the `pg_dump` files. Note: the , , , and are the values you received from the output of the SSH tunnel. The parameter is the name of the DB you want to export, e.g `tdp_db_raft`. You will need to run this command for each DB in the instance. + ``` pg_dump -h -p -d -U -F c --no-acl --no-owner -f .pg ``` -
-### 5. Update the `version` key in the `json_params` item in the `database` resource in the `main.tf` file in the environment(s) you're upgrading with the new database server version -```yaml -json_params = "{\"version\": \"\"}" -``` +After the command finishes, you should see .pg in your current working directory. Do some sanity checks on this backup file to assert it makes sense. Now that we have our backup(s), we need to begin making the Terraform changes required to support the upgrade.
-### 6. Update the `postgresql-client` version to the new version in `tdrs-backend/apt.yml` -```yaml -- postgresql-client- -``` -Note: if the underlying OS for CloudFoundry is no longer `cflinuxfs4` you may also need to update the repo we point to for the postgres client binaries. -

+### 3. Update Terraform to create a new RDS instance +Follow the instructions in the `terraform/README.md` to get Terraform configured. Modify the `main.tf` file in the `terraform/` directory to include a new RDS instance. For example, if you were updating `prod` to version 15.x, you would add the following code to the `main.tf` file. We are NOT removing the existing `resource "cloudfoundry_service_instance" "database"` from the `main.tf` file. Note that the resource and the `name` of the new RDS instance are not the same as the original resource name and RDS name. This is on purpose and we will remedy this in later steps. -### 7. Update the postgres container version in `tdrs-backend/docker-compose.yml` ```yaml -postgres: -image: postgres: +resource "cloudfoundry_service_instance" "new-database" { + name = "tdp-db-prod-new" + space = data.cloudfoundry_space.space.id + service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] + json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" + recursive_delete = true + timeouts { + create = "60m" + update = "60m" + delete = "2h" + } +} ``` -
+After adding the new RDS resource to `main.tf`, you can follow the rest of the instructions in the `terraform/README.md` to plan and then apply this change with Terraform. -### 8. Update Terraform state to delete then re-create RDS instance -Follow the instuctions in the `terraform/README.md` and proceed from there. Modify the `main.tf` file in the `terraform/` directory to inform TF of the changes. To delete the existing RDS instance you can simply comment out the whole database `resource` in the file (even though you made changes in the steps above). TF will see that the resource is no longer there, delete it, and appropriately update it's state. Then you simply re-comment the database `resource` back in with the changes you made in previous steps. TF will create the new RDS instance with your new updates, and also update the state in S3. -

+### 4. Bind an app to the new RDS instance +In the `tdrs-backend/db-upgrade` directory, open the `manifest.yml` file and update the `services` block to reference the new RDS service you just created; in the example, this would be `- tdp-db-prod-new`. Then deploy this manifest: `cf push --no-route -f manifest.yml -t 180`. Wait for the connector app to deploy. We need to deploy a temporary app to avoid too much downtime for the backend app(s) and so that we can start a new SSH tunnel to the new RDS instance. You should now close the original SSH tunnel we opened in step 1. -### 9. Bind backend to the new RDS instance to get credentials +### 5. Open an SSH tunnel to the new RDS instance +Again, in a separate terminal execute the following command and leave that terminal/connection alive until further notice. ``` -cf bind-service tdp-backend- tdp-db- +cf connect-to-service --no-client db-connector ``` -Be sure to re-stage the app when prompted -

-### 10. Apply the backend manifest to begin the restore process -If you copied the backups as mentioned in the note from step 3, the backups will be copied for you to the app instance in the command below. If not, you will need to use `scp` to copy the backups to the app instance after running the command below. +### 6. Create the appropriate database(s) in the new RDS server +Using the credentials from the new SSH tunnel, create the same DB(s) you dumped in the new RDS instance. ``` -cf push tdp-backend- --no-route -f manifest.buildpack.yml -t 180 --strategy rolling +createdb -U -h -p ``` -
-### 11. SSH into the app you just pushed +### 7. Restore the backup(s) to the appropriate database(s) +Using the credentials from the new SSH tunnel, restore the backups to the appropriate DBs. ``` -cf ssh tdp-backend- +pg_restore -p -h -U -d .pg ``` -
- -### 12. Create the appropriate database(s) in the new RDS server -Note: you can get the required field values from `VCAP_SERVICES`. -``` -/home/vcap/deps/0/apt/usr/lib/postgresql//bin/createdb -U -h -``` -
-### 13. Restore the backup(s) to the appropriate database(s) -Note: you can get the required field values from `VCAP_SERVICES`. -``` -/home/vcap/deps/0/apt/usr/lib/postgresql//bin/pg_restore -p -h -U -d .pg -``` During this step, you may see errors similar to the message below. Note `` is imputed in the message to avoid leaking environment specific usernames/roles. + ``` pg_restore: from TOC entry 215; 1259 17313 SEQUENCE users_user_user_permissions_id_seq pg_restore: error: could not execute query: ERROR: role "" does not exist Command was: ALTER TABLE public.users_user_user_permissions_id_seq OWNER TO ; ``` -and the result and total amount of these errors should be: + +and the result and total amount of these errors should be something like: + ``` pg_restore: warning: errors ignored on restore: 68 ``` + If this is what you see, everything is OK. This happens because the `pg_dump` doesn't remove owner associations on sequences for some reason. But you will see in the blocks above that `pg_restore` correctly alters the sequence owner to the new database user. -

-### 14. Use `psql` to get into the database to check state -Note: you can get the required field values from `VCAP_SERVICES`. +### 8. Use `psql` to get into the database(s) to check state +Using the credentials from the new SSH tunnel, use the psql cli to inspect the restored DBs. ``` -/home/vcap/deps/0/apt/usr/lib/postgresql//bin/psql +psql -p -h -U -d ```
-### 15. Re-deploy or Re-stage the backend and frontend apps -Pending your environment you can do this GitHub labels or you can re-stage the apps from Cloud.gov. -

+### 9. Rename and Move RDS instances +Now that we have verified that the data in our new RDS instance looks good, we need to lift and shift the backend app(s) to point to our new RDS instance as if it is the existing (now old) RDS instance. + +First we need to unbind the existing RDS instance from the backend app(s) so that we can make name changes. +``` +cf unbind-service +``` + +After unbinding the service we want to update the "old RDS" service `name` to something different, plan, and then apply those changes with Terraform. +```yaml +resource "cloudfoundry_service_instance" "database" { + name = "something-that-isnt-tdp-db-prod" + space = data.cloudfoundry_space.space.id + service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] + json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" + recursive_delete = true + timeouts { + create = "60m" + update = "60m" + delete = "2h" + } +} +``` + +Now we can rename our "new RDS" service to the expected `name`. Then we can also plan and apply those changes with Terraform. + +```yaml +resource "cloudfoundry_service_instance" "new-database" { + name = "tdp-db-prod" + space = data.cloudfoundry_space.space.id + service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] + json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" + recursive_delete = true + timeouts { + create = "60m" + update = "60m" + delete = "2h" + } +} +``` + +Now we will bind the new RDS service back to the backend app(s) and restage it. Be sure to monitor the app's logs to ensure it connects to the instance. + +``` +cf bind-service +``` -### 16. Access the re-deployed/re-staged apps and run a smoke test +Then + +``` +cf restage +``` + +If the backend app is running with no issues, we can now safely remove the "old RDS" service from Terraform. 
Remove the entire resource block named `database` from `main.tf`, re-plan, and then apply the changes to remove that instance with Terraform. + +Finally, to get our Terraform state looking like it originally did, we want to rename our `new-database` resource back to `database`. That way we are consistent. To do so we rename the resource, and to prevent Terraform from deleting it (since `database` won't exist in the state) we want to inform Terraform that we have "moved" the resource. We do so by adding the following code to the `main.tf`. Note, when running `terraform plan ...` it will not show any infrastructure changes, only a name change. Ensure you still apply even if it looks like there are no changes! + +```yaml +moved { + from = cloudfoundry_service_instance.new-database + to = cloudfoundry_service_instance.database +} +``` + +After adding the above code, re-plan and apply the changes with Terraform. Once Terraform has successfully applied the change, remove the `moved` block from `main.tf`. Re-plan with Terraform and assert it agrees that there are no changes to be made. If Terraform reports changes, you have made a mistake and need to figure out where you made the mistake. + +### 10. Access the re-staged app(s) and run a smoke test - Log in - Submit a few datafiles - Make sure new and existing submission histories populate correctly - Checkout the DACs data -
+ +If everything looks good, there is nothing to do. If apps aren't working/connecting to the new RDS instance, you will need to debug manually and determine if/where you made a mistake. + +### 11. Update the `postgresql-client` version to the new version in `tdrs-backend/apt.yml` +```yaml +- postgresql-client- +``` +Note: if the underlying OS for CloudFoundry is no longer `cflinuxfs4` (code name `jammy`) you may also need to update the repo we point to for the postgres client binaries. + +### 12. Update the postgres container version in `tdrs-backend/docker-compose.yml` +```yaml +postgres: +image: postgres: +``` + +### 13. Commit and push correct changes, revert unnecessary changes. +Commit and push the changes for: +- `main.tf` +- `tdrs-backend/apt.yml` +- `tdrs-backend/docker-compose.yml` + +Revert the changes for: +- `manifest.yml` diff --git a/tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md b/tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md deleted file mode 100644 index 13b794281..000000000 --- a/tdrs-backend/db-upgrade/new-cloud-foundry-db-upgrade.md +++ /dev/null @@ -1,198 +0,0 @@ -# Cloud Foundry, Cloud.gov AWS RDS Database Upgrade - -## Process - -If you are performing this process for the staging or production, you need to ensure you are performing the changes through the [HHS](https://github.com/HHS/TANF-app) repo and not the [Raft](https://github.com/raft-tech/TANF-app) repo. You also need to have the postgres client binaries installed on your local machine. - -### 1. Open an SSH tunnel to the service -To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. Keep this tunnel open in a separate terminal window until this process is complete! - -``` -cf connect-to-service --no-client -``` - -You should see out put similar to: - -``` -Finding the service instance details... -Setting up SSH tunnel... -SSH tunnel created. -Skipping call to client CLI. 
Connection information: - -Host: localhost -Port: 63634 -Username: -Password: -Name: - -Leave this terminal open while you want to use the SSH tunnel. Press Control-C to stop. -``` - -### 2. Create a backup of the database(s) in the RDS instance -In a separate terminal from your SSH tunnel terminal, generate the `pg_dump` files. -Note: the , , , and are the values you received from the output of the SSH tunnel. The parameter is the name of the DB you want to export, e.g `tdp_db_raft`. You will need to run this command for each DB in the instance. - -``` -pg_dump -h -p -d -U -F c --no-acl --no-owner -f .pg -``` - -After the command finishes, you should see .pg in your current working directory. Do some sanity checks on this backup file to assert it makes sense. Now that we have our backup(s), we need to begin making the Terraform changes required to support the upgrade. -
- -### 3. Update Terraform to create a new RDS instance -Follow the instructions in the `terraform/README.md` to get Terraform configured. Modify the `main.tf` file in the `terraform/` to include a new RDS instance. E.g if you were updating `prod` to version 15.x you would add the following code to the `main.tf` file. We are NOT removing the existing `resource "cloudfoundry_service_instance" "database"` from the `main.tf` file. Note that the resource and the `name` of the new RDS instance are not the same as the original resource name and RDS name. This is on purpose and we will remedy this in later steps. - -```yaml -resource "cloudfoundry_service_instance" "new-database" { - name = "tdp-db-prod-new" - space = data.cloudfoundry_space.space.id - service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"] - json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}" - recursive_delete = true - timeouts { - create = "60m" - update = "60m" - delete = "2h" - } -} -``` -After adding the new RDS resource to `main.tf`, you can follow the rest of the instructions in the `terraform/README.md` to plan and then apply this change with Terraform. - -### 4. Bind an app to the new RDS instance -In the `tdrs-backend/db-upgrade` directory, open the `manifest.yml` file and update the `services` block to reference the new RDS service you just created: in the example this would be: `- tdp-db-prod-new`. Then deploy this manifest: `cf push --no-route -f manifest.yml -t 180`. Wait for the connector app to deploy. We need to deploy a temporary app to avoid too much downtime for the backend app(s) and so that we can start new SSH tunnel to the new RDS instance. You should now close the original SSH tunnel we opened in step 1. - -### 5. Open an SSH tunnel to the new RDS instance -Again, in a separate terminal execute the following command and leave that terminal/connection alive until further notice. 
-```
-cf connect-to-service --no-client db-connector
-```
-
-### 6. Create the appropriate database(s) in the new RDS server
-Using the credentials from the new SSH tunnel, create the same DB(s) you dumped in the new RDS instance.
-```
-createdb -U -h -p
-```
-
-### 7. Restore the backup(s) to the appropriate database(s)
-Using the credentials from the new SSH tunnel, restore the backups to the appropriate DBs.
-```
-pg_restore -p -h -U -d .pg
-```
-
-During this step, you may see errors similar to the message below. Note `` is imputed in the message to avoid leaking environment specific usernames/roles.
-
-```
-pg_restore: from TOC entry 215; 1259 17313 SEQUENCE users_user_user_permissions_id_seq
-pg_restore: error: could not execute query: ERROR: role "" does not exist
-Command was: ALTER TABLE public.users_user_user_permissions_id_seq OWNER TO ;
-```
-
-and the result and total amount of these errors should be something like:
-
-```
-pg_restore: warning: errors ignored on restore: 68
-```
-
-If this is what you see, everything is OK. This happens because the `pg_dump` doesn't remove owner associations on sequences for some reason. But you will see in the blocks above that `pg_restore` correctly alters the sequence owner to the new database user.
-
-### 8. Use `psql` to get into the database(s) to check state
-Using the credentials from the new SSH tunnel, use the psql cli to inspect the restored DBs.
-```
-psql -p -h -U -d
-```
-
-
-### 9. Rename and Move RDS instances
-Now that we have verified that the data in our new RDS instance looks good. We need to lift and shift the backend app(s) to point to our new RDS instance as if it is the existing (now old) RDS instance.
-
-First we need to unbind the existing RDS instance from the backend app(s) so that way we can make name changes.
-```
-cf unbind service
-```
-
-After unbinding the service we want to update the "old RDS" service `name` to something different, plan, and then apply those changes with Terraform.
-```yaml
-resource "cloudfoundry_service_instance" "database" {
-  name = "something-that-isnt-tdp-db-prod"
-  space = data.cloudfoundry_space.space.id
-  service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"]
-  json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}"
-  recursive_delete = true
-  timeouts {
-    create = "60m"
-    update = "60m"
-    delete = "2h"
-  }
-}
-```
-
-Now we can name our "new RDS" service to the expected `name`. Then we can also plan and apply those changes with Terraform
-
-```yaml
-resource "cloudfoundry_service_instance" "new-database" {
-  name = "tdp-db-prod"
-  space = data.cloudfoundry_space.space.id
-  service_plan = data.cloudfoundry_service.rds.service_plans["medium-gp-psql"]
-  json_params = "{\"version\": \"15\", \"storage_type\": \"gp3\", \"storage\": 500}"
-  recursive_delete = true
-  timeouts {
-    create = "60m"
-    update = "60m"
-    delete = "2h"
-  }
-}
-```
-
-Now we will bind the new RDS service back to the backend app(s) and restage it. Be sure to monitor the app's logs to ensure it connects to the instance.
-
-```
-cf bind service
-```
-
-Then
-
-```
-cf restage
-```
-
-If the backend app is running with no issues, we can now safely remove the "old RDS" service from Terraform. Remove the entire resource block named `database` from `main.tf` re-plan and then apply the changes to remove that instance with Terraform.
-
-Finally, to get our Terraform state looking like it originally did, we want to rename our `new-database` resource back to `database`. That way we are consistent. To do so we rename the resource, and to avoid Terraform from deleting it (since `database` won't exist in the state) we want to inform Terraform that we have "moved" the resource. We do so by adding the following code to the `main.tf`. Note, when running `terraform plan ...` it will not show any infrastructure changes, only a name change. Ensure you still apply even if it looks like there are no changes!
-
-```yaml
-moved {
-  from = cloudfoundry_service_instance.new-database
-  to = cloudfoundry_service_instance.database
-}
-```
-
-After adding the above code, re-plan and apply the changes with Terrform. Once Terraform has successfully applied the change, remove the `moved` block from `main.tf`. Re-plan with Terraform and assert it agrees that there are no changes to be made. If Terraform reports changes, you have made a mistake and need to figure out where you made the mistake.
-
-### 10. Access the re-staged app(s) and run a smoke test
-- Log in
-- Submit a few datafiles
-- Make sure new and existing submission histories populate correctly
-- Checkout the DACs data
-
-If everything looks good, there is nothing to do. If apps aren't working/connecting to the new RDS instance, you will need to debug manually and determine if/where you made a mistake.
-
-### 11. Update the `postgresql-client` version to the new version in `tdrs-backend/apt.yml`
-```yaml
-- postgresql-client-
-```
-Note: if the underlying OS for CloudFoundry is no longer `cflinuxfs4` (code name `jammy`) you may also need to update the repo we point to for the postgres client binaries.
-
-### 12. Update the postgres container version in `tdrs-backend/docker-compose.yml`
-```yaml
-postgres:
-image: postgres:
-```
-
-### 13. Commit and push correct changes, revert unnecessary changes.
-Commit and push the changes for:
-- `main.tf`
-- `tdrs-backend/apt.yml`
-- `tdrs-backend/docker-compose.yml`
-
-Revert the changes for:
-- `manifest.yml`

From e57e58ba8d0af0b7f1c4a677ca9309c2e02eea2d Mon Sep 17 00:00:00 2001
From: Eric Lipe
Date: Wed, 2 Oct 2024 14:37:31 -0400
Subject: [PATCH 3/5] - add intro

---
 tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md
index 13b794281..cec613351 100644
--- a/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md
+++ b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md
@@ -1,9 +1,8 @@
 # Cloud Foundry, Cloud.gov AWS RDS Database Upgrade
+The process below provides a guide to roll our backend applications over to a new RDS version and instance. The entire process can take several hours and does involve downtime for the environment which you are upgrading. Be sure to take those factors into account when commencing the process.
 ## Process
-If you are performing this process for the staging or production, you need to ensure you are performing the changes through the [HHS](https://github.com/HHS/TANF-app) repo and not the [Raft](https://github.com/raft-tech/TANF-app) repo. You also need to have the postgres client binaries installed on your local machine.
-
 ### 1. Open an SSH tunnel to the service
 To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. Keep this tunnel open in a separate terminal window until this process is complete!

From 17bdbcb364f41608d42b04c4dcc38b76e13262e0 Mon Sep 17 00:00:00 2001
From: Eric Lipe
Date: Thu, 3 Oct 2024 09:47:32 -0400
Subject: [PATCH 4/5] - Clean up docs

---
 .../db-upgrade/cloud-foundry-db-upgrade.md | 39 +++++++++++--------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md
index cec613351..abb9caa30 100644
--- a/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md
+++ b/tdrs-backend/db-upgrade/cloud-foundry-db-upgrade.md
@@ -4,13 +4,13 @@ The process below provides a guide to roll our backend applications over to a ne
 ## Process
 ### 1. Open an SSH tunnel to the service
-To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands locally. Keep this tunnel open in a separate terminal window until this process is complete!
+To execute commands on the RDS instance we can open an SSH tunnel to the service and run all our commands from our local machine. Keep this tunnel open in a separate terminal window until this process is complete!
 ```
 cf connect-to-service --no-client
 ```
-You should see out put similar to:
+You should see output similar to:
 ```
 Finding the service instance details...
@@ -29,17 +29,16 @@ Leave this terminal open while you want to use the SSH tunnel. Press Control-C t
 ### 2. Create a backup of the database(s) in the RDS instance
 In a separate terminal from your SSH tunnel terminal, generate the `pg_dump` files.
-Note: the , , , and are the values you received from the output of the SSH tunnel. The parameter is the name of the DB you want to export, e.g `tdp_db_raft`. You will need to run this command for each DB in the instance.
+Note: the HOST, PORT, DB_USER, and PASSWORD are the values you received from the output of the SSH tunnel. The DB_NAME parameter is the name of the DB you want to export, e.g `tdp_db_raft`. You will need to run this command for each DB in the instance.
 ```
 pg_dump -h -p -d -U -F c --no-acl --no-owner -f .pg
 ```
-After the command finishes, you should see .pg in your current working directory. Do some sanity checks on this backup file to assert it makes sense. Now that we have our backup(s), we need to begin making the Terraform changes required to support the upgrade.
-
+After the command finishes, you should see .pg in your current working directory.
 ### 3. Update Terraform to create a new RDS instance
-Follow the instructions in the `terraform/README.md` to get Terraform configured. Modify the `main.tf` file in the `terraform/` to include a new RDS instance. E.g if you were updating `prod` to version 15.x you would add the following code to the `main.tf` file. We are NOT removing the existing `resource "cloudfoundry_service_instance" "database"` from the `main.tf` file. Note that the resource and the `name` of the new RDS instance are not the same as the original resource name and RDS name. This is on purpose and we will remedy this in later steps.
+Follow the instructions in the `terraform/README.md` to get Terraform configured. Modify the `main.tf` file in the `terraform/` to include a new RDS instance. E.g if you were updating `prod` to version 15.x you would add the following code to the `main.tf` file. We are **NOT** removing the existing `resource "cloudfoundry_service_instance" "database"` from the `main.tf` file. Note that the resource name (i.e. `new-database`) and the `name` of the new RDS instance are not the same as the original resource name and RDS name. This is on purpose and we will remedy this in later steps.
 ```yaml
 resource "cloudfoundry_service_instance" "new-database" {
@@ -55,25 +54,29 @@ resource "cloudfoundry_service_instance" "new-database" {
   }
 }
 ```
+
 After adding the new RDS resource to `main.tf`, you can follow the rest of the instructions in the `terraform/README.md` to plan and then apply this change with Terraform.
 ### 4. Bind an app to the new RDS instance
-In the `tdrs-backend/db-upgrade` directory, open the `manifest.yml` file and update the `services` block to reference the new RDS service you just created: in the example this would be: `- tdp-db-prod-new`. Then deploy this manifest: `cf push --no-route -f manifest.yml -t 180`. Wait for the connector app to deploy. We need to deploy a temporary app to avoid too much downtime for the backend app(s) and so that we can start new SSH tunnel to the new RDS instance. You should now close the original SSH tunnel we opened in step 1.
+In the `tdrs-backend/db-upgrade` directory, open the `manifest.yml` file and update the `services` block to reference the new RDS service you just created: in the example this would be: `- tdp-db-prod-new`. Then deploy this manifest: `cf push --no-route -f manifest.yml -t 180`. Wait for the connector app to deploy. We need to deploy a temporary app to avoid too much downtime for the backend app(s), erroneous transactions on the new RDS instance, and so that we can start a new SSH tunnel to the new RDS instance. If you haven't already, you should now close the original SSH tunnel we opened in step 1.
 ### 5. Open an SSH tunnel to the new RDS instance
 Again, in a separate terminal execute the following command and leave that terminal/connection alive until further notice.
+
 ```
 cf connect-to-service --no-client db-connector
 ```
 ### 6. Create the appropriate database(s) in the new RDS server
 Using the credentials from the new SSH tunnel, create the same DB(s) you dumped in the new RDS instance.
+
 ```
 createdb -U -h -p
 ```
 ### 7. Restore the backup(s) to the appropriate database(s)
 Using the credentials from the new SSH tunnel, restore the backups to the appropriate DBs.
+
 ```
 pg_restore -p -h -U -d .pg
 ```
@@ -92,24 +95,26 @@ and the result and total amount of these errors should be something like:
 pg_restore: warning: errors ignored on restore: 68
 ```
-If this is what you see, everything is OK. This happens because the `pg_dump` doesn't remove owner associations on sequences for some reason. But you will see in the blocks above that `pg_restore` correctly alters the sequence owner to the new database user.
+If this is what you see, everything is OK. This happens because the `pg_dump` doesn't remove all owner associations on DB objects for some reason. But you will see in the blocks above that `pg_restore` correctly alters the object owner to the new database user.
 ### 8. Use `psql` to get into the database(s) to check state
-Using the credentials from the new SSH tunnel, use the psql cli to inspect the restored DBs.
+Using the credentials from the new SSH tunnel, use the psql cli to inspect the restored DBs. You should consider counting the number of tables in the new and old DBs, counting some records across different tables, etc...
+
 ```
 psql -p -h -U -d
 ```
-
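The suggestion to count tables and records in the old and new DBs could be sketched roughly as follows. This is only an illustration under assumptions: the `public` schema, the helper name `compare_counts`, and the `PORT`, `DB_USER`, and `DB_NAME` variables are hypothetical, and the old DB's counts would need to be captured while the step 1 tunnel is still open.

```shell
# SQL that counts ordinary tables in the public schema (schema name is an assumption).
TABLE_COUNT_SQL="SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public' AND table_type = 'BASE TABLE';"

# Hypothetical helper: compare a count taken from the old DB with one from the new DB.
compare_counts() {
    label="$1"
    old="$2"
    new="$3"
    if [ "$old" = "$new" ]; then
        echo "MATCH $label: $old"
    else
        echo "MISMATCH $label: old=$old new=$new"
        return 1
    fi
}

# Gathering a count through a tunnel might look like this
# (PORT, DB_USER and DB_NAME are placeholders, not values from the doc):
#   psql -h localhost -p "$PORT" -U "$DB_USER" -d "$DB_NAME" -At -c "$TABLE_COUNT_SQL"
# Then compare the two numbers:
#   compare_counts tables "$old_count" "$new_count"
```

The `-A` and `-t` flags make `psql` print only the bare value, so the output can be passed straight to the comparison.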
 ### 9. Rename and Move RDS instances
-Now that we have verified that the data in our new RDS instance looks good. We need to lift and shift the backend app(s) to point to our new RDS instance as if it is the existing (now old) RDS instance.
+Now that we have verified the data in our new RDS instance looks good, we need to lift and shift the backend app(s) to point to our new RDS instance as if it is the existing (now old) RDS instance.
+
+First we need to unbind the existing RDS instance from the backend app(s) it is bound to.
-First we need to unbind the existing RDS instance from the backend app(s) so that way we can make name changes.
 ```
 cf unbind service
 ```
 After unbinding the service we want to update the "old RDS" service `name` to something different, plan, and then apply those changes with Terraform.
+
 ```yaml
 resource "cloudfoundry_service_instance" "database" {
   name = "something-that-isnt-tdp-db-prod"
@@ -125,7 +130,7 @@ resource "cloudfoundry_service_instance" "database" {
   }
 }
 ```
-Now we can name our "new RDS" service to the expected `name`. Then we can also plan and apply those changes with Terraform
+Now we can name our "new RDS" service to the expected `name` (i.e. the original `name` field from our old RDS instance). Then we plan and apply those changes with Terraform.
 ```yaml
 resource "cloudfoundry_service_instance" "new-database" {
@@ -142,7 +147,7 @@ resource "cloudfoundry_service_instance" "new-database" {
   }
 }
 ```
-Now we will bind the new RDS service back to the backend app(s) and restage it. Be sure to monitor the app's logs to ensure it connects to the instance.
+Next we will bind the new RDS service back to the backend app(s) we unbound the old instance from and restage them. Be sure to monitor the backend app's logs to ensure it connects to the instance and starts as expected.
 ```
 cf bind service
@@ -154,7 +159,7 @@ Then
 cf restage
 ```
-If the backend app is running with no issues, we can now safely remove the "old RDS" service from Terraform. Remove the entire resource block named `database` from `main.tf` re-plan and then apply the changes to remove that instance with Terraform.
+If the backend app(s) are running with no issues, we can now safely remove the "old RDS" service from Terraform. Remove the entire resource block named `database` from `main.tf`, plan and then apply the changes to remove that instance with Terraform.
 Finally, to get our Terraform state looking like it originally did, we want to rename our `new-database` resource back to `database`. That way we are consistent. To do so we rename the resource, and to avoid Terraform from deleting it (since `database` won't exist in the state) we want to inform Terraform that we have "moved" the resource. We do so by adding the following code to the `main.tf`. Note, when running `terraform plan ...` it will not show any infrastructure changes, only a name change. Ensure you still apply even if it looks like there are no changes!
@@ -165,7 +170,7 @@ moved {
 }
 ```
-After adding the above code, re-plan and apply the changes with Terrform. Once Terraform has successfully applied the change, remove the `moved` block from `main.tf`. Re-plan with Terraform and assert it agrees that there are no changes to be made. If Terraform reports changes, you have made a mistake and need to figure out where you made the mistake.
+After adding the above code, plan and apply the changes with Terraform. Once Terraform has successfully applied the change, remove the `moved` block from `main.tf`. Run `terraform plan ...` again and assert it agrees that there are no changes to be made. If Terraform reports changes, you have made a mistake and need to figure out where you made the mistake.
 ### 10. Access the re-staged app(s) and run a smoke test
 - Log in
@@ -176,9 +181,11 @@ After adding the above code, re-plan and apply the changes with Terrform. Once T
 If everything looks good, there is nothing to do. If apps aren't working/connecting to the new RDS instance, you will need to debug manually and determine if/where you made a mistake.
 ### 11. Update the `postgresql-client` version to the new version in `tdrs-backend/apt.yml`
+
 ```yaml
 - postgresql-client-
 ```
+
 Note: if the underlying OS for CloudFoundry is no longer `cflinuxfs4` (code name `jammy`) you may also need to update the repo we point to for the postgres client binaries.
 ### 12. Update the postgres container version in `tdrs-backend/docker-compose.yml`

From 073c82ffaab17de41c37461831ff1b1946152df2 Mon Sep 17 00:00:00 2001
From: Eric Lipe
Date: Thu, 3 Oct 2024 09:57:19 -0400
Subject: [PATCH 5/5] - remove credentials key

---
 docs/Technical-Documentation/nexus-repo.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/Technical-Documentation/nexus-repo.md b/docs/Technical-Documentation/nexus-repo.md
index 5e504a384..2cf5190be 100644
--- a/docs/Technical-Documentation/nexus-repo.md
+++ b/docs/Technical-Documentation/nexus-repo.md
@@ -123,7 +123,7 @@ Now you will no longer have to enter the password when logging in.
 ## Local Docker Login
 After logging into the `tanf-dev` space with the `cf` cli, execute the following commands to authenticate your local docker daemon
 ```
-export NEXUS_DOCKER_PASSWORD=`cf service-key tanf-keys nexus-dev | tail -n +2 | jq .credentials.password`
+export NEXUS_DOCKER_PASSWORD=`cf service-key tanf-keys nexus-dev | tail -n +2 | jq .password`
 echo "$NEXUS_DOCKER_PASSWORD" | docker login https://tdp-docker.dev.raftlabs.tech -u tdp-dev --password-stdin
 ```