diff --git a/.github/ISSUE_TEMPLATE/documentation.yaml b/.github/ISSUE_TEMPLATE/documentation.yaml index bf6fbf0af..441d5b261 100644 --- a/.github/ISSUE_TEMPLATE/documentation.yaml +++ b/.github/ISSUE_TEMPLATE/documentation.yaml @@ -13,6 +13,7 @@ body: - Ansible F5 Workshop - Ansible Security Automation - Ansible Windows Automation Workshop + - RHEL In-place Upgrade Automation Workshop - Smart Management Automation Workshop - Other validations: diff --git a/README.md b/README.md index 520dd2a3f..3fccb4c0e 100644 --- a/README.md +++ b/README.md @@ -24,6 +24,7 @@ The Red Hat Ansible Automation Workshops project is intended for effectively dem | **[Ansible Security Automation](./exercises/ansible_security)**
focused on automation of security tools like Check Point Firewall, IBM QRadar and the IDS Snort | [PDF](./decks/ansible_security.pdf) | [Google Source](https://docs.google.com/presentation/d/19gVCBz1BmxC15tDDj-FUlUd_jUUUKay81E8F24cyUjk/edit?usp=sharing) | [Exercises](./exercises/ansible_security) | `workshop_type: security` | | **[Ansible Windows Automation Workshop](./exercises/ansible_windows)**
focused on automation of Microsoft Windows | [PDF](./decks/ansible_windows.pdf) | [Google Source](https://docs.google.com/presentation/d/1RO5CQiCoqLDES1NvTI_1fQrR-oWM1NuW-uB0JRvtJzE) | [Exercises](./exercises/ansible_windows) | `workshop_type: windows` | | **[Smart Management Automation Workshop](./exercises/ansible_smart_mgmt)**
focused on automation of security and lifecycle management with Red Hat Satellite Server | [PDF](./decks/ansible_smart_mgmt.pdf) | [Google Source](https://docs.google.com/presentation/d/135lid9AeSioN4bJexBbv9q0fkJwDibpUQg8aeYjxzTY) | [Exercises](./exercises/ansible_smart_mgmt) | `workshop_type: smart_mgmt` +| **[RHEL In-place Upgrade Automation Workshop](./exercises/ansible_ripu)**
focused on automation of RHEL in-place upgrades at enterprise scale | [PDF](./decks/ansible_ripu.pdf) | [Google Source](https://docs.google.com/presentation/d/1U6i006Th7MQNuL1_0a0KhOSY4GfF1wFsINusDvJvXvo) | [Exercises](./exercises/ansible_ripu) | `workshop_type: ripu` 90 minute abbreviated versions: diff --git a/decks/ansible_ripu.pdf b/decks/ansible_ripu.pdf new file mode 100644 index 000000000..a5c93e091 Binary files /dev/null and b/decks/ansible_ripu.pdf differ diff --git a/exercises/ansible_config_as_code/0-setup/README.md b/exercises/ansible_config_as_code/0-setup/README.md index ca9660795..113ce4287 100644 --- a/exercises/ansible_config_as_code/0-setup/README.md +++ b/exercises/ansible_config_as_code/0-setup/README.md @@ -6,7 +6,7 @@ In this section we will show you step by step how to add pre commit linting to a If you are using the Workshop, a Workshop project should be available in your VSCode for you to push to the Workshop Gitea server. Create the files in this project folder. -NOTE: If when you click on the Explorer tab that looks like two pieces of paper and you see "Open Folder" click on that. In the popup window click windows-workshop/workshop_project/ (full path is: `/home/student/windows-workshop/workshop_project`) then click "ok". If prompted select the check box and "Yes, I trust the authors" option. You should now see a readme that has a typo saying Welcome to Windows Automation workshop. +NOTE: If when you click on the Explorer tab that looks like two pieces of paper and you see "Open Folder" click on that. In the popup window click windows-workshop/workshop_project/ (full path is: `/home/student/windows-workshop/workshop_project`) then click "ok". If prompted select the check box and "Yes, I trust the authors" option. You should now see a readme that has a typo saying Welcome to Windows Automation workshop. ## Step 1 diff --git a/exercises/ansible_config_as_code/1-ee/README.md b/exercises/ansible_config_as_code/1-ee/README.md index adfa50f3f..ba4dec8fe 100644 --- a/exercises/ansible_config_as_code/1-ee/README.md +++ b/exercises/ansible_config_as_code/1-ee/README.md @@ -20,7 +20,7 @@ Further documentation for those who are interested to learn more see: Install our ee_utilities collection and containers.podman using `ansible-galaxy` command. 
```console -ansible-galaxy collection install infra.ee_utilities containers.podman community.general +ansible-galaxy collection install infra.ee_utilities:2.0.8 containers.podman community.general ``` Further documentation for those who are interested to learn more see: diff --git a/exercises/ansible_ripu/1.1-setup/README.md b/exercises/ansible_ripu/1.1-setup/README.md new file mode 100644 index 000000000..2fa9ebb4a --- /dev/null +++ b/exercises/ansible_ripu/1.1-setup/README.md @@ -0,0 +1,115 @@ +# Workshop Exercise - Your Lab Environment + +## Table of Contents + +- [Workshop Exercise - Your Lab Environment](#workshop-exercise---your-lab-environment) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Your Lab Environment](#your-lab-environment) + - [Step 1 - Access the Environment](#step-1---access-the-environment) + - [Step 2 - Open a Terminal Session](#step-2---open-a-terminal-session) + - [Step 3 - Access the AAP Web UI](#step-3---access-the-aap-web-ui) + - [Step 4 - Access the RHEL Web Console](#step-4---access-the-rhel-web-console) + - [Step 5 - Challenge Labs](#step-5---challenge-labs) + - [Conclusion](#conclusion) + +## Objectives + +* Understand the lab topology and how to access the environment +* Understand how to perform the workshop exercises +* Understand challenge labs + +## Guide + +### Your Lab Environment + +The workshop is provisioned with a pre-configured lab environment. You will have access to a host deployed with Ansible Automation Platform (AAP) which you will use to control the playbook and workflow jobs that automation the RHEL in-place upgrade workflow steps. You will also have access to some "pet" application hosts, two with RHEL7 and another two with RHEL8. These are the hosts where we will be upgrading the RHEL operating system (OS). + +| Role | Inventory name | +| ---------------------| ---------------| +| AAP Control Host | ansible-1 | +| RHEL7 pet app host 1 | tidy-bengal | +| RHEL7 pet app host 2 | strong-hyena | +| RHEL8 pet app host 1 | more-calf | +| RHEL8 pet app host 2 | upward-moray | + +> **Note** +> +> The inventory names of the pet app hosts will be random pet names different from the example above. We'll dive deeper into why we are using random names in a later exercise. + +### Step 1 - Access the Environment + +We will use Visual Studio Code (VS Code) as it provides a convenient and intuitive way to use a web browser to edit files and access terminal sessions. If you are a command line hero, direct SSH access is available if VS Code is not to your liking. There is a short YouTube video to explain if you need additional clarity: Ansible Workshops - Accessing your workbench environment. + +- You can open VS Code in your web browser using the "WebUI" link under "VS Code access" on the workshop launch page provided by your instructor. The password is given below the link. For example: + + ![Example link to VS Code WebUI](images/vscode_link.png) + +- After opening the link, type in the provided password to access your instance of VS Code. + +> **Note** +> +> A welcome wizard may appear to guide you through configuring your VS Code user experience. This is optional as the default settings will work fine for this workshop. Feel free to step though the wizard to explore the VS code bells and whistles or you may just skip it. 
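+
+- If you prefer the direct SSH route mentioned above, the connection itself is a one-liner. The values below are placeholders, not real workshop details: use the hostname (or IP address), user name and password shown on your workshop launch page.
+
+  ```
+  # hypothetical example - substitute the details from your launch page
+  ssh <user>@<workshop-host-from-launch-page>
+  ```
+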
+ +### Step 2 - Open a Terminal Session + +Terminal sessions provide access to the RHEL commands and utilities that will help us understand what's going on "behind the curtain" when the RHEL in-place upgrade automation is doing its thing. + +- Use VS Code to open a terminal session. For example: + + ![Example of how to open a terminal session in VS Code](images/new_term.svg) + +- This terminal session will be running on the AAP control host `ansible-1`. Use the `cat /etc/hosts` command to see the hostnames of your pet app hosts. Next, use the `ssh` command to login to one of your pet app hosts. Finally, use the highlighted commands confirm the RHEL OS version and kernel version installed. + + For example: + + ![Example ssh login to pet app host](images/ssh_login.svg) + +- In the example above, the command `ssh tidy-bengal` connects us to a new session on the named pet app host. Then the commands `cat /etc/redhat-release` and `uname -r` are used to output the OS release information `Red Hat Enterprise Linux Server release 7.9 (Maipo)` and kernel version `3.10.0-1160.88.1.el7.x86_64` from that host. + +### Step 3 - Access the AAP Web UI + +The AAP Web UI is where we will go to submit and check the status of the Ansible playbook jobs we will use to automate the RHEL in-place upgrade workflow. + +- Let's open the AAP Web UI in a new web browser tab using the "WebUI" link under "Automation controller" on the workshop launch page. For example: + + ![Example link to AAP Web UI](images/aap_link.png) + +- Enter the username `admin` and the password provided. This will bring you to your AAP Web UI dashboard like the example below: + + ![Example AAP Web UI dashboard](images/aap_console_example.svg) + +- We will learn more about how to use the AAP Web UI in the next exercise. + +### Step 4 - Access the RHEL Web Console + +We will use the RHEL Web Console to review the results of the Leapp pre-upgrade reports we generate for our pet app servers. + +- Open a new web browser tab using the link under "RHEL Web Console" on the workshop launch page. For example: + + ![Example link to RHEL Web Console](images/cockpit_link.png) + +- Enter the username `student` and the password provided. This will bring you to a RHEL Web Console Overview page like the example below: + + ![Example RHEL Web Console](images/cockpit_example.svg) + +- We will revisit the RHEL Web Console when we are ready to review our pre-upgrade reports in an upcoming exercise. + +### Step 5 - Challenge Labs + +You will soon discover that many exercises in the workshop come with a "Challenge Lab" step. These labs are meant to give you a small task to solve using what you have learned so far. The solution of the task is shown underneath a warning sign. + +## Conclusion + +In this exercise, we learned about the lab environment we will be using to continue through the workshop exercises. We verified that we are able to use VS Code in our web browser and from there we can open terminal sessions. We also made sure we are able to access the AAP Web UI which will be the "self-service portal" we use to perform the next steps of the RHEL in-place upgrade automation workflow. Finally, we connected to the RHEL Web Console where we will soon be reviewing pre-upgrade reports. + +Use the link below to move on the the next exercise. 
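+
+> **Tip**
+>
+> For quick reference, the terminal commands demonstrated in Step 2 are summarized below. The host name `tidy-bengal` is only an example; substitute one of your own pet app host names.
+>
+> ```
+> cat /etc/hosts            # find the names of your pet app hosts
+> ssh tidy-bengal           # log in to one of them
+> cat /etc/redhat-release   # e.g. Red Hat Enterprise Linux Server release 7.9 (Maipo)
+> uname -r                  # e.g. 3.10.0-1160.88.1.el7.x86_64
+> ```
+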
+ +--- + +**Navigation** + +[Next Exercise](../1.2-preupg/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/1.1-setup/images/aap_console_example.svg b/exercises/ansible_ripu/1.1-setup/images/aap_console_example.svg new file mode 100644 index 000000000..c97421583 --- /dev/null +++ b/exercises/ansible_ripu/1.1-setup/images/aap_console_example.svg @@ -0,0 +1,2931 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.1-setup/images/aap_link.png b/exercises/ansible_ripu/1.1-setup/images/aap_link.png new file mode 100644 index 000000000..2810a51be Binary files /dev/null and b/exercises/ansible_ripu/1.1-setup/images/aap_link.png differ diff --git a/exercises/ansible_ripu/1.1-setup/images/cockpit_example.svg b/exercises/ansible_ripu/1.1-setup/images/cockpit_example.svg new file mode 100644 index 000000000..426c5977f --- /dev/null +++ b/exercises/ansible_ripu/1.1-setup/images/cockpit_example.svg @@ -0,0 +1,4950 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.1-setup/images/cockpit_link.png b/exercises/ansible_ripu/1.1-setup/images/cockpit_link.png new file mode 100644 index 000000000..e8852dd59 Binary files /dev/null and b/exercises/ansible_ripu/1.1-setup/images/cockpit_link.png differ diff --git a/exercises/ansible_ripu/1.1-setup/images/new_term.svg b/exercises/ansible_ripu/1.1-setup/images/new_term.svg new file mode 100644 index 000000000..e6eb04ecc --- /dev/null +++ b/exercises/ansible_ripu/1.1-setup/images/new_term.svg @@ -0,0 +1,7861 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.1-setup/images/ssh_login.svg b/exercises/ansible_ripu/1.1-setup/images/ssh_login.svg new file mode 100644 index 000000000..1091eb4fd --- /dev/null +++ b/exercises/ansible_ripu/1.1-setup/images/ssh_login.svg @@ -0,0 +1,70 @@ + + + + + + + + + + + + + + diff --git a/exercises/ansible_ripu/1.1-setup/images/vscode_link.png b/exercises/ansible_ripu/1.1-setup/images/vscode_link.png new file mode 100644 index 000000000..88b9de658 Binary files /dev/null and b/exercises/ansible_ripu/1.1-setup/images/vscode_link.png differ diff --git a/exercises/ansible_ripu/1.2-preupg/README.md b/exercises/ansible_ripu/1.2-preupg/README.md new file mode 100644 index 000000000..78d140f77 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/README.md @@ -0,0 +1,140 @@ +# Workshop Exercise - Run Pre-upgrade Jobs + +## Table of Contents + +- [Workshop Exercise - Run Pre-upgrade Jobs](#workshop-exercise---run-pre-upgrade-jobs) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - RHEL In-place Upgrade Automation Workflow](#step-1---rhel-in-place-upgrade-automation-workflow) + - [Analysis](#analysis) + - [Upgrade](#upgrade) + - [Commit](#commit) + - [Let's Get Started](#lets-get-started) + - [Step 2 - Use AAP to Launch an Analysis Playbook Job](#step-2---use-aap-to-launch-an-analysis-playbook-job) + - [Step 3 - Review the Playbook Job Output](#step-3---review-the-playbook-job-output) + - [Step 4 - Challenge Lab: Analysis Playbook](#step-4---challenge-lab-analysis-playbook) + - [Conclusion](#conclusion) + +## Objectives + +* Understand the end-to-end RHEL in-place upgrade workflow +* Understand how to use AAP job templates to run Ansible playbooks +* Run the pre-upgrade analysis jobs + +## Guide + +### Step 1 - RHEL In-place Upgrade Automation Workflow + +Red Hat Enterprise Linux (RHEL) comes with the Leapp utility, the underlying framework that our automation approach uses to upgrade the operating system to the next major version. 
The [Leapp documentation](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/upgrading_from_rhel_7_to_rhel_8/index) guides users on how to use the Leapp framework to manually upgrade a RHEL host. This is fine if you only have a few RHEL hosts to upgrade, but what if you are a large enterprise with tens of thousands of RHEL hosts? The manual process does not scale. Using automation, the end-to-end process for upgrading a RHEL host is reduced to a matter of days and the total downtime required for the actual upgrade is measured in hours or less. + +Our RHEL in-place upgrade automation approach follows a workflow with three phases: + +![Three phase workflow: Analysis, Upgrade, Commit](images/ripu-workflow.svg) + +> **Note** +> +> The ![arrow pointing down at server](images/playbook_icon.svg) icon indicates workflow steps that are automated by Ansible playbooks. + +#### Analysis + +During the analysis phase, no changes are made yet. When the analysis playbook is executed, it uses the Leapp utility to scan the host for issues or blockers that may prevent a successful upgrade. Then it generates a detailed report listing any potential risks found. The report also includes recommended actions that should be followed to reduce the likelihood of the reported issues impacting the upgrade. If any recommended remediation actions are performed, the analysis scan should be run again to verify the risks are resolved. This iteration continues until everyone reviewing the report is comfortable that any remaining findings are acceptable. + + + +#### Upgrade + +After the analysis phase is done and the report indicates acceptable risk, a maintenance window can be scheduled and the upgrade phase can begin. It is during this phase that the upgrade playbooks are executed using a workflow job template. The first playbook creates a snapshot that can be used for rolling back if anything goes wrong with the upgrade. After the snapshot is created, the second playbook uses the Leapp utility to perform the upgrade where the RHEL OS is advanced to the new major version. The host will not be available for login or application access during the upgrade. When the upgrade is finished, the host will reboot under the newly upgraded RHEL major version. Now the ops and app teams can assess if the upgrade was successful by verifying all application services are working as expected. + +#### Commit + +If there are any application impacts discovered that can't be easily corrected within the scheduled maintenance window, the decision can be made to undo the upgrade by rolling back the snapshot. This will revert all changes and return the host back to the previous RHEL version. If there are no issues immediately found, the commit phase begins. During the commit phase, the host can be returned to normal operation while keeping the snapshot just in case any issues are uncovered later. After everyone is comfortable with the upgraded host, the commit playbook should be executed to delete the snapshot. The RHEL in-place upgrade is done. + +#### Let's Get Started + +The RHEL in-place upgrade automation approach workflow is designed to reduce the risks inherent in doing an in-place upgrade versus deploying a new RHEL host. Decision points at the end of the analysis and upgrade phases allow the process to be rolled back and restarted with the benefit of lessons learned through reporting checks and actual upgrade results. 
Of course, the best practice for avoiding production impacts or outages is to proceed with upgrades in properly configured Dev and Test environments before moving on to production hosts. + +### Step 2 - Use AAP to Launch an Analysis Playbook Job + +As we progress through the workshop, we'll refer back to this diagram to track where we are in our automation approach workflow. We are starting now in the highlighted block below: + +![Automation approach workflow diagram with analysis step highlighted](images/ripu-workflow-hl-analysis.svg) + +The first step in upgrading our pet app hosts will be executing the analysis playbook to generate the Leapp pre-upgrade report for each host. To do this, we will use the Ansible Automation Platform (AAP) automation controller host that has been pre-configured in your workshop lab environment. + +- Return to the AAP Web UI browser tab you opened in step 3 of the previous exercise. Navigate to Resources > Templates by clicking on "Templates" under the "Resources" group in the navigation menu. This will bring up a list of job templates that can be used to run playbook jobs on target hosts: + + ![Job templates listed on AAP Web UI](images/aap_templates.svg) + +- Click on the "AUTO / 01 Analysis" job template. This will display the Details tab of the job template: + + !["AUTO / 01 Analysis" job templates seen on AAP Web UI](images/analysis_template.svg) + +- From here, we could use the "Edit" button if we wanted to make any changes to the job template. This job template is already configured, so we are ready to use it to submit a playbook job. To do this, use the "Launch" button which will bring up a series of prompts. + + > **Note** + > + > The prompts that each job template presents can be configured using the "Prompt on launch" checkboxes seen when editing a job template. + + ![Analysis job variables prompt on AAP Web UI](images/analysis_vars_prompt.svg) + +- The first prompt as seen above allows for changing the default playbook variables or adding more variables. We don't need to do this at this time, so just click the "Next" button to move on. + + ![Analysis job survey prompt on AAP Web UI](images/analysis_survey_prompt.svg) + +- Next we see the job template survey prompt. A survey is a customizable set of prompts that can be configured from the Survey tab of the job template. For this job template, the survey allows for choosing a group of hosts on which the job will execute the playbook. Choose the "ALL_rhel" option and click the "Next" button. This will bring you to a preview of the selected job options and variable settings. + + ![Analysis job preview on AAP Web UI](images/analysis_preview.svg) + +- If you are satisfied with the job preview, use the "Launch" button to start the playbook job. + +### Step 3 - Review the Playbook Job Output + +After launching the analysis playbook job, the AAP Web UI will navigate automatically to the job output page for the job you just started. + +- While the playbook job is running, you can monitor its progress by clicking the "Follow" button. When you are in follow mode, the output will scroll automatically as task results are streamed to the bottom of job output shown in the AAP Web UI. + +- The analysis playbook will run the Leapp pre-upgrade scan. This will take about two or three minutes to complete. When it is done, you can find a "PLAY RECAP" at the end of the job output showing the success or failure status for the playbook runs executed on each host. 
A status of "failed=0" indicates a successful playbook run. Scroll to the bottom of the job output and you should see that your job summary looks like this example: + + ![Analysis job "PLAY RECAP" as seen at the end of the job output](images/analysis_job_recap.svg) + +### Step 4 - Challenge Lab: Analysis Playbook + +Let's take a closer look at the playbook we just ran. + +> **Tip** +> +> Try looking at the configuration details of the "Project Leapp" project and the "AUTO / 01 Analysis" job template. + +Can you find the upstream source repo and playbook code? + +> **Warning** +> +> **Solution below\!** + +- In the AAP Web UI, navigate to Resources > Projects > Project Leapp. Under the Details tab, you will see the "Source Control URL" setting that defines where job templates of this project will go to pull their playbooks. We see it is pointing to this git repo on GitHub: [https://github.com/redhat-partner-tech/leapp-project](https://github.com/redhat-partner-tech/leapp-project). Open this URL in a new browser tab. + +- Go back to the AAP Web UI and now navigate to Resources > Templates > AUTO / 01 Analysis. Under the Details tab, you will see the "Playbook" setting with the name of the playbook this job template runs when it is used to submit a job. The playbook name is `analysis.yml`. In your GitHub browser tab, you can find `analysis.yml` listed in the files of the git repo. Click on it to see the playbook contents. + +- Notice that the `Run RIPU preupg` task of the playbook is importing a role from the `infra.leapp` Ansible collection. By checking the `collections/requirements.yml` file in the git repo, we can discover that this role comes from another git repo at [https://github.com/redhat-cop/infra.leapp](https://github.com/redhat-cop/infra.leapp). It is the `analysis` role under this second git repo that provides all the automation tasks that ultimately runs the Leapp pre-upgrade scan and generates the report. + +- Drill down to the `roles/analysis` directory in this git repo to review the README and yaml source files. + +When you are ready to develop your own custom playbooks to run upgrades for your enterprise, you should consider using roles from the `infra.leapp` Ansible collection to make your job easier. + +## Conclusion + +In this exercise, we learned about the end-to-end workflow used by our automation approach for doing RHEL in-place upgrades. We used a job template in AAP to submit a playbook job that ran the Leapp pre-upgrade analysis on our pet application servers. In the challenge lab, we explored the playbook that we ran and how it includes a role from an upstream Ansible collection. + +In the next exercise, we will review the pre-upgrade reports we just generated and take action to resolve any high-risk findings that were identified. 
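+
+To tie the challenge lab together, here is a minimal sketch of what an analysis playbook built on that collection could look like. It is not the exact `analysis.yml` from the leapp-project repository; the host pattern and privilege escalation settings are illustrative assumptions.
+
+```yaml
+---
+# Minimal sketch: apply the analysis role from the infra.leapp collection
+# to the hosts in the inventory group selected at the survey prompt.
+- name: Run RIPU pre-upgrade analysis
+  hosts: all    # assumption - the workshop job template limits hosts via the survey
+  become: true  # assumption - the Leapp scan needs root privileges
+  tasks:
+    - name: Run RIPU preupg
+      ansible.builtin.import_role:
+        name: infra.leapp.analysis
+```
+
+Variables supplied at the job template's Variables prompt are passed through to the role, which is how a later exercise will customize its behavior without editing the playbook.
+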
+ +--- + +**Navigation** + +[Previous Exercise](../1.1-setup/README.md) - [Next Exercise](../1.3-report/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/1.2-preupg/images/aap_templates.svg b/exercises/ansible_ripu/1.2-preupg/images/aap_templates.svg new file mode 100644 index 000000000..873bbea11 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/aap_templates.svg @@ -0,0 +1,3360 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/analysis_job_recap.svg b/exercises/ansible_ripu/1.2-preupg/images/analysis_job_recap.svg new file mode 100644 index 000000000..8706f9dd2 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/analysis_job_recap.svg @@ -0,0 +1,6340 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/analysis_preview.svg b/exercises/ansible_ripu/1.2-preupg/images/analysis_preview.svg new file mode 100644 index 000000000..d8381b648 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/analysis_preview.svg @@ -0,0 +1,2212 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/analysis_survey_prompt.svg b/exercises/ansible_ripu/1.2-preupg/images/analysis_survey_prompt.svg new file mode 100644 index 000000000..57c09d3ce --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/analysis_survey_prompt.svg @@ -0,0 +1,1016 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/analysis_template.svg b/exercises/ansible_ripu/1.2-preupg/images/analysis_template.svg new file mode 100644 index 000000000..0623d6705 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/analysis_template.svg @@ -0,0 +1,4366 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/analysis_vars_prompt.svg b/exercises/ansible_ripu/1.2-preupg/images/analysis_vars_prompt.svg new file mode 100644 index 000000000..d81fa0f19 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/analysis_vars_prompt.svg @@ -0,0 +1,1386 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/playbook_icon.svg b/exercises/ansible_ripu/1.2-preupg/images/playbook_icon.svg new file mode 100644 index 000000000..049810f10 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/playbook_icon.svg @@ -0,0 +1,67 @@ + + + + + + + + + + + + + + + + diff --git a/exercises/ansible_ripu/1.2-preupg/images/ripu-workflow-hl-analysis.svg b/exercises/ansible_ripu/1.2-preupg/images/ripu-workflow-hl-analysis.svg new file mode 100644 index 000000000..6d28f0406 --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/ripu-workflow-hl-analysis.svg @@ -0,0 +1,769 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/1.2-preupg/images/ripu-workflow.svg b/exercises/ansible_ripu/1.2-preupg/images/ripu-workflow.svg new file mode 100644 index 000000000..889e8d16d --- /dev/null +++ b/exercises/ansible_ripu/1.2-preupg/images/ripu-workflow.svg @@ -0,0 +1,759 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/1.3-report/README.md b/exercises/ansible_ripu/1.3-report/README.md new file mode 100644 index 000000000..f64fef400 --- /dev/null +++ 
b/exercises/ansible_ripu/1.3-report/README.md @@ -0,0 +1,201 @@ +# Workshop Exercise - Review Pre-upgrade Reports + +## Table of Contents + +- [Workshop Exercise - Review Pre-upgrade Reports](#workshop-exercise---review-pre-upgrade-reports) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - Managing Leapp Pre-upgrade Results](#step-1---managing-leapp-pre-upgrade-results) + - [Step 2 - Navigating the RHEL Web Console](#step-2---navigating-the-rhel-web-console) + - [Step 3 - Review Leapp Pre-upgrade Report of RHEL8 Host](#step-3---review-leapp-pre-upgrade-report-of-rhel8-host) + - [Step 4 - Review Leapp Pre-upgrade Report of RHEL7 Host](#step-4---review-leapp-pre-upgrade-report-of-rhel7-host) + - [Challenge Lab: What About Ignoring So Many High Findings?](#challenge-lab-what-about-ignoring-so-many-high-findings) + - [Conclusion](#conclusion) + +## Objectives + +* Understand the different options for managing Leapp pre-upgrade reports +* Use the RHEL Web Console to review the reports we generated +* Learn how to filter pre-upgrade report entries +* Embrace failure! + +## Guide + +### Step 1 - Managing Leapp Pre-upgrade Results + +In the previous exercise, we used a playbook job template to generate a Leapp pre-upgrade report on each of our pet app servers. Now we need to review the findings listed in those reports. There are a number of different ways that we can access the reports. Let's review these and consider the pros and cons: + +- If we we're using the Leapp framework to manually upgrade just a single RHEL host, we could simply get to a shell prompt on the host and look at the local report file output. In [Exercise 1.1, Step 2](../1.1-setup/README.md#step-2---open-a-terminal-session), we learned how to open an ssh session to one of our pet app servers. Follow those steps and after logging in, use this command to review the local Leapp pre-upgrade report file: + + ``` + less /var/log/leapp/leapp-report.txt + ``` + + This is a "quick and dirty" way to review the report, but doesn't scale if you need to review reports for a large number of hosts. + + > **Note** + > + > Use the up and down arrow keys to scroll through the file and type `q` when you are ready to quit the `less` command. + +- If your RHEL hosts are registered to [Red Hat Insights](https://www.redhat.com/en/technologies/management/insights), you can see the Leapp pre-upgrade reports on your Insights console. The pet app servers provisioned for this workshop are not registered to Insights, so we can't demonstrate this here. Read the blog article [Take the unknowns out of RHEL upgrades with Red Hat Insights](https://www.redhat.com/en/blog/take-unknowns-out-rhel-upgrades-red-hat-insights) to see an example of how Insights can be used to review and manage Leapp pre-upgrades. + +- RHEL includes an optional administration web console based on [Cockpit](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/managing_systems_using_the_rhel_8_web_console/index#what-is-the-RHEL-web-console_getting-started-with-the-rhel-8-web-console) that we call the RHEL Web Console. We will explore how to review the Leapp pre-upgrade reports using the RHEL Web Console in the next step of this exercise. + +- In addition to writing the plain text `leapp-report.txt` file, Leapp also generates a JSON format `leapp-report.json` file. 
This file includes the same report results as the plain text file, but in JSON format which is perfect for being ingested by log management tools like Elastic/Kibana or Splunk. Many large enterprises will push their pre-upgrade report data to one of these tools to develop their own custom dashboards that can filter reports by environment (e.g., Dev/Test/Prod), location, app ID, owning team, etc. + +### Step 2 - Navigating the RHEL Web Console + +For this workshop, we will be using the RHEL Web Console to access the Leapp pre-upgrade reports we generated. + +- Return to the RHEL Web Console browser tab you opened from [Exercise 1.1, Step 4](../1.1-setup/README.md#step-4---access-the-rhel-web-console). This is the RHEL Web Console of the AAP controller host, but we need to access our pet app server hosts to see their pre-upgrade reports. Do this by clicking the "student​@​ansible-1.example.com" box in the top left corner of the RHEL Web Console to reveal the remote host menu. For example: + + ![Remote host menu listing all pet app servers](images/remote_host_menu_with_pets.svg) + +- You can use the remote host menu to navigate to the web consoles of each of your pet app servers. Try selecting one of your pet servers now. The RHEL Web Console system overview page will show the operating system version installed. For example, this pet app server is running RHEL8: + + ![upward-moray running Red Hat Enterprise Linux 8.7 (Ootpa)](images/rhel8_os.svg) + + Here is an example of one running RHEL7: + + ![Operating System Red Hat Enterprise Linux Server 7.9 (Maipo)](images/rhel7_os.svg) + +- When you navigate to different hosts in the RHEL Web Console, look out for the "limited access mode" warning: + + ![Web console is running in limited access mode](images/limited_access.svg) + + If you see this, use the button to switch to administrative access mode before proceeding. A confirmation will appear like this: + + ![You now have administrative access](images/administrative_access.svg) + +- Take some time to explore the navigation menus available with the RHEL Web Console of your different pet app servers. Once you feel comfortable navigating around the console and switching between hosts, move on to the next step where we will look at our first pre-upgrade report. + +### Step 3 - Review Leapp Pre-upgrade Report of RHEL8 Host + +Now we are ready to use the RHEL Web Console to see the Leapp pre-upgrade reports. Let's start by looking at one of the RHEL8 hosts and then we'll look at one of the RHEL7 hosts in the next step. + +While you might be interested in learning about upgrading only RHEL7 or RHEL8, we recommend following the exercise steps for both. This workshop presents the skills you need with the RHEL7 and RHEL8 examples covering different topics you must know irrespective of the OS version being upgraded. + +We are now here in the automation approach workflow: + +![Automation approach workflow diagram with review report step highlighted](images/ripu-workflow-hl-review.svg) + +- Navigate to the RHEL Web Console remote host menu and click on the hostname of one of your RHEL8 pet app servers. Remember as we learned in the previous step, you can confirm the RHEL version on the system overview page. Also make sure you enabled administrative access as explained in the previous step. + +- Having verified you are looking at one of the RHEL8 pet app servers, use the main menu to navigate to Tools > Upgrade Report. This will display the Leapp pre-upgrade report that was generated for the selected host. 
For example, the report might look like this: + + ![Example pre-upgrade report of RHEL8 host](images/rhel8_report.svg) + + > **Note** + > + > The contents of your report may differ from the example above because of updates made to the Leapp framework and other RHEL packages released over time since this workshop was written. If you discover any differences that materially break the flow of the exercises in the workshop, kindly let us know by raising an issue [here](https://github.com/ansible/workshops/issues/new). + +- When the pre-upgrade report is generated, the Leapp framework collects system data and assesses upgradeability based on a large collection of checks. When any of these checks uncovers a potential risk, it is recorded as a finding in the report. These findings are listed in order from highest risk to lowest. In the report above, we see there are three high risk findings. Let's review each of these. + +- The first finding we see listed has the title "Leapp could not identify where GRUB core is located." You can see additional details for any finding by clicking on it in the list. For example, click on the first finding and you will see these details: + + ![Details view of grub core finding](images/grub_core_finding.svg) + + This finding is being reported because the EC2 instances deployed for the workshop do not have a separate /boot partition. We'll ignore this one for now, but make a mental note as we may revisit this with a Challenge Lab in a later exercise. + +- The next finding is titled "Remote root logins globally allowed using password." Click on it to see the details: + + ![Details view of remote root logins finding](images/remote_root_logins_finding.svg) + + This finding is meant to raise awareness of a change to the default root login settings introduced with RHEL9. We can safely ignore this finding because surely everybody already follows best practices by never logging in directly as the root user. + +- That brings us to the final high risk finding. This one is a little embarrassing because it's actually a known bug in the Leapp framework. + + ![Details view usage of deprecated model bug finding](images/leapp_bug_finding.svg) + + Luckily, it is completely benign and we can safely ignore it. This bug will be fixed with an update to the Leapp framework expected to be released soon. + +- The good news is that none of the findings with our RHEL8 host were the most severe "inhibitor" classification. When any inhibitor findings are reported, the RHEL upgrade is blocked and can't proceed without first taking action to correct the cause of the inhibitor risk finding. + +- There are a number of filtering options you can use to limit the findings that are displayed according to risk level, audience, etc. Click on the "Filters" button to experiment with this feature. For example, if you click the "Is inhibitor?" filter checkbox, you will see no findings displayed because there were no inhibitors. + +- Let's now move on to the pre-upgrade report for one of our RHEL7 hosts. Spoiler alert: we will have to deal with some inhibitor findings with this one! + +### Step 4 - Review Leapp Pre-upgrade Report of RHEL7 Host + +In the previous step, we reviewed the pre-upgrade report for one of our RHEL8 hosts. Now let's take a look at the report from one of our RHEL7 hosts. + +- Navigate to the RHEL Web Console remote host menu and click on the hostname of one of your RHEL7 pet app servers. Verify the host you have chosen is RHEL7. 
Then use the main menu to navigate to Tools > Upgrade Report. This will bring up the Leapp pre-upgrade report for the selected host. For example, the report might look like this: + + ![Example pre-upgrade report of RHEL7 host](images/rhel7_report.svg) + + > **Note** + > + > The contents of your report may differ from the example above because of updates made to the Leapp framework and other RHEL packages released over time since this workshop was written. If you discover any differences that materially break the flow of the exercises in this workshop, kindly let us know by raising an issue [here](https://github.com/ansible/workshops/issues/new). + +- In the report for our RHEL7 pet app server above, we see there are six high risk findings and two of those are inhibitor findings. Let's start by reviewing the high risk findings that are not inhibitors. + +- The "GRUB core will be updated during upgrade" finding is no different than the finding with the same title we learned about in the RHEL8 pre-upgrade report, so we'll ignore this for now. + +- The high risk finding "Usage of deprecated Model" is again because of the Leapp framework bug we talked about before. It's annoying but benign and we can ignore it. + +- Now let's look at the new findings we are seeing only on our RHEL7 pre-upgrade report. At the top of the list we see the "Packages available in excluded repositories will not be installed" finding. Clicking on the finding to bring up the detailed view, we see this: + + ![Details view of packages available in excluded repositories will not be installed](images/excluded_repos_finding.svg) + + This finding is warning that packages python3-pyxattr and rpcgen will not be upgraded because "they are available only in target system repositories that are intentionally excluded from the list of repositories used during the upgrade," but then refers to an informational finding titled "Excluded target system repositories" for more information. Scroll down and click on that finding to show its details: + + ![Details view of excluded target system repositories information finding](images/enablerepo_info_finding.svg) + + Here we see the remediation hint suggests to run the `leapp` utility with the `--enablerepo` option. But wait, that's assuming we are manually running the `leapp` command. Don't worry, in an upcoming exercise, we'll explore how this option can be given by setting a variable when submitting the upgrade playbook job. Stay tuned! + +- The next high risk entry on the list is the "Difference in Python versions and support in RHEL8" finding: + + ![Details view of Difference in Python versions and support in RHEL8 finding](images/python_finding.svg) + + This finding could be a concern if we have any apps on our pet server that are using the system-provided Python interpreter. Let's assume we don't have any of those in which case we can blissfully ignore this finding. + +- That leaves us with our two inhibitor findings. The first is the "Possible problems with remote login using root account" finding. You know the drill; click on the finding to review the details: + + ![Details view of possible problems with remote login using root account inhibitor finding](images/root_account_inhibitor.svg) + + Remember that with inhibitor findings, if we don't take action to resolve the inhibitor, the Leapp framework will block the RHEL in-place upgrade from going forward. + +- The other inhibitor is the "Missing required answers in the answer file" finding. 
Here are the details for this one: + + ![Details view of missing required answers in the answer file](images/missing_answers_inhibitor.svg) + + Here again, we will need to take action to remediate this finding. Don't panic! In the next exercise, we will explore different options for automating the required remediation actions and recommendations. + +### Challenge Lab: What About Ignoring So Many High Findings? + +You may be wondering why are we only worrying about the inhibitor findings. What about all the other high risk findings showing up in red on the report? Red means danger! Why would we be going forward with attempting an upgrade without first resolving all the findings on the report? It's a fair question. + +> **Tip** +> +> Think back to the four key features that we introduced at the beginning of the workshop. + +Is there a specific feature that helps with reducing risk? + +> **Warning** +> +> **Solution below\!** + +Of course, the answer is our automated snapshot/rollback capability. + +- If any of the high risk findings listed in the pre-upgrade report ultimately leads to the upgrade failing or results in application compatibility impact, we can quickly get back to where we started by rolling back the snapshot. Before rolling back, we can debug the root cause and use the experience to understand the best way to eliminate the risk of that failure or impact happening in the future. + +- There is a concept explained quite well in the famous article [Fail Fast](http://www.martinfowler.com/ieeeSoftware/failFast.pdf) published in *IEEE Software*. The article dates back to 2004, so this is hardly a new concept. Unfortunately, there is a stigma associated with failure that can lead to excessively risk-averse behavior. The most important benefit of having automated snapshots is being able to quickly revert failures. That allows us to safely adopt a fail fast and fail smart mantra. + +- Of course, there are many best practices we can follow to reduce risk. Obviously, test for application impacts by trying upgrades in your lower environments first. Any issues that can be worked out with Dev and Test servers will help you be prepared to avoid those issues in production. + +- The high risk findings reported by the Leapp pre-upgrade report are there to make us aware of potential failure modes, but experience has shown that they are not a problem in many cases. Don't become petrified when you see those red findings on the report. Upgrade early and often! + +## Conclusion + +In this exercise, we learned about the different options for managing Leapp pre-upgrade reports. We used the RHEL Web Console to look at the reports we generated in the previous exercise and reviewed a number of the reported findings. In the challenge lab, we explored the importance of snapshots and learned to embrace failure. + +In the next exercise, we are going to look at how to automate the remediation actions required to resolve our inhibitor findings. 
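+
+As a footnote to the JSON report option from Step 1, here is a rough sketch of how `leapp-report.json` could be collected and filtered with Ansible instead of (or before) shipping it to a log management tool. This is not part of the workshop playbooks, and it assumes the report keeps its usual layout of an `entries` list whose inhibitor items carry an `inhibitor` flag; that layout may vary between Leapp versions.
+
+```yaml
+---
+# Sketch only: read /var/log/leapp/leapp-report.json from each host and
+# print the titles of any findings flagged as inhibitors.
+- name: Summarize Leapp inhibitor findings
+  hosts: all
+  become: true
+  tasks:
+    - name: Read the JSON pre-upgrade report
+      ansible.builtin.slurp:
+        src: /var/log/leapp/leapp-report.json
+      register: leapp_report_raw
+
+    - name: List inhibitor finding titles
+      ansible.builtin.debug:
+        msg: >-
+          {{ (leapp_report_raw.content | b64decode | from_json).entries
+             | selectattr('flags', 'defined')
+             | selectattr('flags', 'contains', 'inhibitor')
+             | map(attribute='title') | list }}
+```
+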
+ +--- + +**Navigation** + +[Previous Exercise](../1.2-preupg/README.md) - [Next Exercise](../1.4-remediate/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/1.3-report/images/administrative_access.svg b/exercises/ansible_ripu/1.3-report/images/administrative_access.svg new file mode 100644 index 000000000..f38097461 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/administrative_access.svg @@ -0,0 +1,339 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/enablerepo_info_finding.svg b/exercises/ansible_ripu/1.3-report/images/enablerepo_info_finding.svg new file mode 100644 index 000000000..b1edb1c22 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/enablerepo_info_finding.svg @@ -0,0 +1,3182 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/excluded_repos_finding.svg b/exercises/ansible_ripu/1.3-report/images/excluded_repos_finding.svg new file mode 100644 index 000000000..6281ee889 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/excluded_repos_finding.svg @@ -0,0 +1,2232 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/grub_core_finding.svg b/exercises/ansible_ripu/1.3-report/images/grub_core_finding.svg new file mode 100644 index 000000000..fcfafffc1 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/grub_core_finding.svg @@ -0,0 +1,1521 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/leapp_bug_finding.svg b/exercises/ansible_ripu/1.3-report/images/leapp_bug_finding.svg new file mode 100644 index 000000000..67c4a121f --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/leapp_bug_finding.svg @@ -0,0 +1,1744 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/limited_access.svg b/exercises/ansible_ripu/1.3-report/images/limited_access.svg new file mode 100644 index 000000000..55290ef80 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/limited_access.svg @@ -0,0 +1,422 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/missing_answers_inhibitor.svg b/exercises/ansible_ripu/1.3-report/images/missing_answers_inhibitor.svg new file mode 100644 index 000000000..b0b3c9929 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/missing_answers_inhibitor.svg @@ -0,0 +1,2124 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/python_finding.svg b/exercises/ansible_ripu/1.3-report/images/python_finding.svg new file mode 100644 index 000000000..bf2f790e0 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/python_finding.svg @@ -0,0 +1,2183 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/remote_host_menu_with_pets.svg b/exercises/ansible_ripu/1.3-report/images/remote_host_menu_with_pets.svg new file mode 100644 index 000000000..ba8d35c84 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/remote_host_menu_with_pets.svg @@ -0,0 +1,833 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/remote_root_logins_finding.svg b/exercises/ansible_ripu/1.3-report/images/remote_root_logins_finding.svg new file mode 100644 index 000000000..9cde352b5 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/remote_root_logins_finding.svg @@ -0,0 +1,2318 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/rhel7_os.svg b/exercises/ansible_ripu/1.3-report/images/rhel7_os.svg new file mode 100644 index 000000000..cca590656 --- /dev/null +++ 
b/exercises/ansible_ripu/1.3-report/images/rhel7_os.svg @@ -0,0 +1,1231 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/rhel7_report.svg b/exercises/ansible_ripu/1.3-report/images/rhel7_report.svg new file mode 100644 index 000000000..baa3b8e99 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/rhel7_report.svg @@ -0,0 +1,4909 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/rhel8_os.svg b/exercises/ansible_ripu/1.3-report/images/rhel8_os.svg new file mode 100644 index 000000000..5b85e68ba --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/rhel8_os.svg @@ -0,0 +1,925 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/rhel8_report.svg b/exercises/ansible_ripu/1.3-report/images/rhel8_report.svg new file mode 100644 index 000000000..da293fd93 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/rhel8_report.svg @@ -0,0 +1,4889 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.3-report/images/ripu-workflow-hl-review.svg b/exercises/ansible_ripu/1.3-report/images/ripu-workflow-hl-review.svg new file mode 100644 index 000000000..8e874a9f5 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/ripu-workflow-hl-review.svg @@ -0,0 +1,769 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/1.3-report/images/root_account_inhibitor.svg b/exercises/ansible_ripu/1.3-report/images/root_account_inhibitor.svg new file mode 100644 index 000000000..c5c1f1eb4 --- /dev/null +++ b/exercises/ansible_ripu/1.3-report/images/root_account_inhibitor.svg @@ -0,0 +1,2323 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.4-remediate/README.md b/exercises/ansible_ripu/1.4-remediate/README.md new file mode 100644 index 000000000..c326518b8 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/README.md @@ -0,0 +1,153 @@ +# Workshop Exercise - Perform Recommended Remediation + +## Table of Contents + +- [Workshop Exercise - Perform Recommended Remediation](#workshop-exercise---perform-recommended-remediation) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - Explore Options for Resolving Inhibitors](#step-1---explore-options-for-resolving-inhibitors) + - [Step 2 - Managing the Leapp Answer File](#step-2---managing-the-leapp-answer-file) + - [Step 3 - Resolving Inhibitors Using a Remediation Playbook](#step-3---resolving-inhibitors-using-a-remediation-playbook) + - [Conclusion](#conclusion) + +## Objectives + +* Consider different options for resolving inhibitor risk findings +* Learn how to use the `leapp_answerfile` variable of the `analysis` role +* Use a remediation playbook to proactively prepare for pre-upgrade + +## Guide + +### Step 1 - Explore Options for Resolving Inhibitors + +In the previous exercise, we reviewed the Leapp pre-upgrade reports that were generated for our RHEL7 and RHEL8 pet application servers. With the RHEL8 hosts, there were no inhibitor risk findings reported, so those are good to go and ready to try upgrading. However, there were a couple inhibitors reported for the RHEL7 hosts. We must take action to resolve them before those hosts can be upgraded. 
+ +We are now here in our automation approach workflow: + +![Automation approach workflow diagram with apply recommended remediations step highlighted](images/ripu-workflow-hl-remediate.svg) + +- Let's start by dissecting one of our inhibitor findings: + + ![Details view of missing required answers in the answer file](images/missing_answers_dissected.svg) + + ![1.](images/circle_1.svg) Each finding has a unique title. + + ![2.](images/circle_2.svg) A risk factor is assigned to each finding, but as we discussed in the previous exercise, this may be more nuanced than can be indicated by a simple High, Medium, Low or Info rating. + + ![3.](images/circle_3.svg) The summary provides a detailed explanation of the risk and solution recommendation. + + ![4.](images/circle_4.svg) Under remediation, we are given a fairly prescriptive recommendation. + + ![5.](images/circle_5.svg) Sometimes, the remediation also includes an exact command like this one. + +- When a remediation command is given such as with the example above, there are a number of options we can choose from for how to execute the command. Obviously, we could go with the quick and dirty method of getting to a root shell prompt on the host to cut and paste the command or manually edit the answerfile. Of course, going that way is prone to human error and doesn't scale well. Another option would be to use the "Run Remediation" button shown above the command. Using this option, the RHEL Web Console executes the command for us. While doing this is less prone to human error, it still doesn't scale well as it's only going to run on this single host. + +- In the next steps, we'll look at how we can use the scale of Ansible Automation Platform (AAP) to perform remediations in bulk across a large RHEL estate. + +### Step 2 - Managing the Leapp Answer File + +The Leapp framework uses an answer file as a means of accepting user input choices. This is explained in greater detail in the [Asking user questions](https://leapp.readthedocs.io/en/latest/dialogs.html) section of the Leapp developer documentation. The inhibitor finding we dissected in the previous step is looking for us to make a decision or, more specifically, asking us to acknowledge we are aware that Leapp will disable the pam_pkcs11 PAM module during the RHEL upgrade. + +- In [Exercise 1.2 - Run Pre-upgrade Jobs](../1.2-preupg/README.md), we launched a playbook that runs the pre-upgrade report using the `analysis` role from the `infra.leapp` Ansible collection. Look at the [documentation for this role](https://github.com/redhat-cop/infra.leapp/blob/main/roles/analysis/README.md). Do you see where it supports a `leapp_answerfile` input variable. We can set the variable to automatically populate the Leapp answer file. + +- Let's try running the pre-upgrade job again with this variable defined. Launch the "AUTO / 01 Analysis" job template the same as you did under [Exercise 1.2, Step 2](../1.2-preupg/README.md#step-2---use-aap-to-launch-an-analysis-playbook-job), except this time, we will add this setting when we get to the Variables prompt: + + ```json + "leapp_answerfile": "[remove_pam_pkcs11_module_check]\nconfirm = True\n", + ``` + For example: + + ![Setting the `leapp_answerfile` input variable](images/analysis_leapp_answerfile.svg) + + After making the variable setting as shown above, click the "Next" button. This will lead to the job template survey prompt. Previously, we used the "ALL_rhel" option to run the pre-upgrade on all our pet servers. 
However, our `leapp_answerfile` setting is specific to our RHEL7 hosts, so choose the "rhel7" option this time: + + ![Choose "rhel7" at the survey prompt](images/analysis_survey_rhel7_only.svg) + + Click the "Next" button to proceed to the preview prompt. If you are satisfied with the job preview, use the "Launch" button to start the job. + +- As before, the AAP Web UI will navigate automatically to the job output page after you start the job. The job will take a few minutes to finish and then you should see the "PLAY RECAP" at the end of the job output. + +- Now go back to your RHEL Web Console browser tab and navigate to the pre-upgrade report of one of the RHEL7 hosts. + + > **Note** + > + > You may need to refresh the browser using Ctrl-R to see the newly generated report. + + You should see that the "Missing required answers in the answer file" inhibitor finding is no longer being reported. + + For example: + + ![Pre-upgrade report of RHEL7 host without answer file inhibitor](images/rhel7_answer_fixed.svg) + + But we still have the "Possible problems with remote login using root account" inhibitor which we need to fix. Let's look at that next. + +### Step 3 - Resolving Inhibitors Using a Remediation Playbook + +In the previous step, we were able to resolve an inhibitor finding by simply setting the `leapp_answerfile` input variable supported by the `infra.leapp` Ansible collection `analysis` role. While that's a convenient way to resolve an answerfile inhibitor, our next inhibitor can't be resolved that way. + +- Here is our other inhibitor finding: + + ![Details view of missing required answers in the answer file](../1.3-report/images/root_account_inhibitor.svg) + + Like the previous inhibitor finding, this one also provides a detailed summary and a a fairly prescriptive recommended remediation. However, it does not recommend an exact remediation command. Instead, the remediation recommends making edits to the `/etc/ssh/sshd_config` file. + +- Of course, we're not going to just login to a root shell and `vi` the configuration file, are we? Right, let's make a playbook to automate the required remediations. Here's a task that should do the trick: + + ```yaml + - name: Configure sshd + ansible.builtin.lineinfile: + path: "/etc/ssh/sshd_config" + regex: "^(#)?{{ item.key }}" + line: "{{ item.key }} {{ item.value }}" + state: present + loop: + - {key: "PermitRootLogin", value: "prohibit-password"} + - {key: "PasswordAuthentication", value: "no"} + notify: + - Restart sshd + ``` + + While we're at it, let's also add a task to take care of the answer file inhibitor using the `leapp answer` command. For example: + + ```yaml + - name: Remove pam_pkcs11 module + ansible.builtin.shell: | + set -o pipefail + leapp answer --section remove_pam_pkcs11_module_check.confirm=True + args: + executable: /bin/bash + ``` + +- You will find the tasks above in the playbook [`remediate_rhel7.yml`](https://github.com/redhat-partner-tech/leapp-project/blob/main/remediate_rhel7.yml#L21-L38). There are a few more remediation task examples in this playbook as well. The "OS / Remediate" job template is already set up to execute this playbook, so let's use it to remediate our RHEL7 hosts. + +- Return to your AAP Web UI browser tab. Navigate to Resources > Templates on the AAP Web UI and open the "OS / Remediate" job template. Click the "Launch" button to get started. + +- This will bring you to the job template survey prompt. 
Again, choose the "rhel7" option at the "Select inventory group" prompt because our remediation playbook is specific to the pre-upgrade findings of our RHEL7 hosts. Then click the "Next" button. If you are satisfied with the job preview, use the "Launch" button to submit the job. This playbook includes only a small number of tasks and should run pretty quickly. + +- When the "OS / Remediate" job is finished, launch the "AUTO / 01 Analysis" job template one more time again taking care to choose the "rhel7" option at the "Select inventory group" prompt. When the job completes, go back to the RHEL Web Console of your RHEL7 host and refresh the report. You should now see there are no inhibitors: + + ![Pre-upgrade report of RHEL7 host with no more inhibitors](images/rhel7_no_inhibitors.svg) + + With no inhibitors indicated on our RHEL7 and RHEL8 pet servers, we are ready to try the RHEL upgrade. + +## Conclusion + +In this exercise, we looked at the different ways we can resolve inhibitor risk findings. We learned how to use the `leapp_answerfile` variable of the `analysis` role to manage the Leapp answer file. Finally, we used an example remediation playbook to demonstrate how we could address pre-upgrade inhibitor findings at scale across our RHEL estate. + +Now we are ready to try upgrading our RHEL pet app servers, but before we get to that, there are two more optional exercises in this section of the workshop: + +- [Exercise 1.5 - Custom Pre-upgrade Checks](../1.5-custom-modules/README.md) +- [Exercise 1.6 - Deploy a Pet Application](../1.6-my-pet-app/README.md) + +These exercises are not required to successfully complete the workshop, but we recommend doing them if time allows. If you can't wait and want skip ahead to upgrading your RHEL hosts, strap in for this exciting exercise: + +- [Exercise 2.1 - Run the RHEL Upgrade Jobs](../2.1-upgrade/README.md) + +--- + +**Navigation** + +[Previous Exercise](../1.3-report/README.md) - [Next Exercise](../1.5-custom-modules/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/1.4-remediate/images/analysis_leapp_answerfile.svg b/exercises/ansible_ripu/1.4-remediate/images/analysis_leapp_answerfile.svg new file mode 100644 index 000000000..7c2f038c0 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/analysis_leapp_answerfile.svg @@ -0,0 +1,3105 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/analysis_survey_rhel7_only.svg b/exercises/ansible_ripu/1.4-remediate/images/analysis_survey_rhel7_only.svg new file mode 100644 index 000000000..af8c01220 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/analysis_survey_rhel7_only.svg @@ -0,0 +1,2379 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/circle_1.svg b/exercises/ansible_ripu/1.4-remediate/images/circle_1.svg new file mode 100644 index 000000000..248d75562 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/circle_1.svg @@ -0,0 +1,60 @@ + + + + + + + + + + 1 + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/circle_2.svg b/exercises/ansible_ripu/1.4-remediate/images/circle_2.svg new file mode 100644 index 000000000..f34f7fb35 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/circle_2.svg @@ -0,0 +1,60 @@ + + + + + + + + + + 2 + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/circle_3.svg b/exercises/ansible_ripu/1.4-remediate/images/circle_3.svg new file mode 100644 index 000000000..d0f31b7b2 --- /dev/null +++ 
b/exercises/ansible_ripu/1.4-remediate/images/circle_3.svg @@ -0,0 +1,60 @@ + + + + + + + + + + 3 + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/circle_4.svg b/exercises/ansible_ripu/1.4-remediate/images/circle_4.svg new file mode 100644 index 000000000..a0177ee6a --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/circle_4.svg @@ -0,0 +1,60 @@ + + + + + + + + + + 4 + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/circle_5.svg b/exercises/ansible_ripu/1.4-remediate/images/circle_5.svg new file mode 100644 index 000000000..807f47b71 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/circle_5.svg @@ -0,0 +1,60 @@ + + + + + + + + + + 5 + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/missing_answers_dissected.svg b/exercises/ansible_ripu/1.4-remediate/images/missing_answers_dissected.svg new file mode 100644 index 000000000..05156ae25 --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/missing_answers_dissected.svg @@ -0,0 +1,147 @@ + + + + + + + + + + + + 1 + + + + 2 + + + + 3 + + + + 4 + + + + 5 + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/rhel7_answer_fixed.svg b/exercises/ansible_ripu/1.4-remediate/images/rhel7_answer_fixed.svg new file mode 100644 index 000000000..4dae9890a --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/rhel7_answer_fixed.svg @@ -0,0 +1,4766 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/rhel7_no_inhibitors.svg b/exercises/ansible_ripu/1.4-remediate/images/rhel7_no_inhibitors.svg new file mode 100644 index 000000000..7eae7ee1b --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/rhel7_no_inhibitors.svg @@ -0,0 +1,4776 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.4-remediate/images/ripu-workflow-hl-remediate.svg b/exercises/ansible_ripu/1.4-remediate/images/ripu-workflow-hl-remediate.svg new file mode 100644 index 000000000..273ad524c --- /dev/null +++ b/exercises/ansible_ripu/1.4-remediate/images/ripu-workflow-hl-remediate.svg @@ -0,0 +1,769 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/1.5-custom-modules/README.md b/exercises/ansible_ripu/1.5-custom-modules/README.md new file mode 100644 index 000000000..e8794b4f3 --- /dev/null +++ b/exercises/ansible_ripu/1.5-custom-modules/README.md @@ -0,0 +1,162 @@ +# Workshop Exercise - Custom Modules + +## Table of Contents + +- [Workshop Exercise - Custom Modules](#workshop-exercise---custom-modules) + - [Table of Contents](#table-of-contents) + - [Optional Exercise](#optional-exercise) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - What are Custom Modules?](#step-1---what-are-custom-modules) + - [Step 2 - Install a Leapp Custom Actor](#step-2---install-a-leapp-custom-actor) + - [Step 3 - Generate a New Pre-upgrade Report](#step-3---generate-a-new-pre-upgrade-report) + - [Step 4 - Learn More About Developing Leapp Custom Actors](#step-4---learn-more-about-developing-leapp-custom-actors) + - [Conclusion](#conclusion) + +## Optional Exercise + +This is an optional exercise. It is not required to successfully complete the workshop, but it's recommended if time allows. 
Review the objectives listed in the next section to decide if you want to do this exercise or if you would rather skip ahead to the next exercises: + +* [Exercise 1.6 - (Optional) Deploy a Pet Application](../1.6-my-pet-app/README.md) +* [Exercise 2.1 - Run the RHEL Upgrade Jobs](../2.1-upgrade/README.md) + +## Objectives + +* Learn what custom modules are +* Install a Leapp custom actor to implement additional pre-upgrade checks +* See where to go to learn more about making your own custom actors + +## Guide + +### Step 1 - What are Custom Modules? + +"Custom modules" is a generic term that refers to any custom capabilities that may be layered on top of the RHEL in-place upgrade automation approach to meet special requirements unique to your enterprise. + +- For example, maybe your organization has established a standard set of policies that must be followed before doing any system maintenance. In that case, you could implement a Leapp custom actor that will raise a pre-upgrade inhibitor if any conditions are detected that do not comply with your policies. + +- Another use case for custom modules is to deal with 3rd-party tools and agents that you include in your standard RHEL server build. For example, if you use Chef for configuration convergence, you could implement additional tasks in your upgrade playbook to take care of updating the Chef client package and making any required changes to the Chef node attributes and run list that would be required under the new OS version. + +- We use the generic term "custom modules" because there is no single blueprint for how to design and implement the logic and automation for different organizations' diverse requirements. Custom module requirements may call for a Leapp custom actor, custom Ansible automation, or even an integrated design using both. + +- Custom Leapp actors are best for implementing any additional pre-upgrade checks you may need because the results of these checks will be seamlessly included in the pre-upgrade report generated by the Leapp framework. Custom Leapp actors should also be used for any automated tasks that need to be run during the interim system (initrd) phase of the Leapp upgrade. + +- Developing Leapp custom actors is not as easy as just adding tasks to a playbook, so most custom automation requirements are best achieved using Ansible. Tasks can be included in your upgrade playbook before and after the task that imports the `upgrade` role from the `infra.leapp` collection to actually perform the RHEL OS upgrade. + +### Step 2 - Install a Leapp Custom Actor + +There is a collection of example custom actors at the GitHub repo [oamg/leapp-supplements](https://github.com/oamg/leapp-supplements). We will use one of these to demonstrate adding a custom check to our pre-upgrade reports. + +- The example custom actor we are going to install implements checks for compliance with an imaginary organization's "reboot hygiene" policy. The actor will block the upgrade by reporting an inhibitor risk if any of the following conditions are detected: + + - The host uptime is greater than the maximum defined by the policy. + + - The running kernel version does not match the default kernel version configured in the bootloader. + + - The /boot directory has any files that have been modified since the last reboot. + +- Go to your VS Code browser tab and open a terminal session.
Refer back to [Exercise 1.1, Step 2](https://github.com/swapdisk/workshops/blob/devel/exercises/ansible_ripu/1.1-setup/README.md#step-2---open-a-terminal-session) if you need a reminder of how we did that. + +- Login to one of your pet app servers using the `ssh` command. For example: + + ``` + ssh tidy-bengal + ``` + +- Now we will install the RPM package that provides our custom actor. Run the following command on your pet app server: + + ``` + sudo yum -y install leapp-upgrade-\*-supplements + ``` + + > **Note** + > + > We are installing the package manually just for the purpose of demonstrating this custom actor. If we were ready to roll out custom actors at enterprise scale, we would include the package installation at the beginning of our analysis playbook. + +- This is an example of the output you should expect to see if the package is installed successfully: + + ``` + Resolving Dependencies + --> Running transaction check + ---> Package leapp-upgrade-el7toel8-supplements.noarch 0:1.0.0-47.demo.el7 will be installed + --> Finished Dependency Resolution + + Dependencies Resolved + + ========================================================================================== + Package Arch Version Repository Size + ========================================================================================== + Installing: + leapp-upgrade-el7toel8-supplements noarch 1.0.0-47.demo.el7 leapp-supplements 12 k + + Transaction Summary + ========================================================================================== + Install 1 Package + + Total download size: 12 k + Installed size: 18 k + Downloading packages: + leapp-upgrade-el7toel8-supplements-1.0.0-47.demo.el7.noarch.rpm | 12 kB 00:00:00 + Running transaction check + Running transaction test + Transaction test succeeded + Running transaction + Installing : leapp-upgrade-el7toel8-supplements-1.0.0-47.demo.el7.noarch 1/1 + Verifying : leapp-upgrade-el7toel8-supplements-1.0.0-47.demo.el7.noarch 1/1 + + Installed: + leapp-upgrade-el7toel8-supplements.noarch 0:1.0.0-47.demo.el7 + + Complete! + ``` + +- To demonstrate the custom actor at work, let's create a condition that violates our policy so that an inhibitor finding will be reported. Use this command: + + ``` + sudo touch /boot/policy-violation + ``` + + With this command, we just created a file under /boot with a timestamp later than the last reboot. This host is now out of compliance with our reboot hygiene policy! + +### Step 3 - Generate a New Pre-upgrade Report + +We are now ready to try running a pre-upgrade report including the checks from our custom actor. + +- Return to your AAP Web UI browser tab. Navigate to Resources > Templates and open the "AUTO / 01 Analysis" job template. Launch the job choosing the "ALL_rhel" option at the "Select inventory group" prompt. + +- When the job completes, go back to the RHEL Web Console and use the remote host menu to navigate to the pet app server where you installed the custom actor package. Refresh the pre-upgrade report. You should now see there is a new inhibitor finding. For example: + + ![Pre-upgrade report showing inhibitor finding from custom actor](images/reboot_hygiene.svg) + +- Click on the finding to open the detail view. Here we see the summary with an explanation of the finding and the remediation hint which politely says please reboot: + + ![Finding details reported by reboot hygiene custom actor](images/reboot_hygiene_finding.svg) + +- Reboot the host to resolve the inhibitor finding. 
For example: + + ``` + sudo reboot + ``` + +- Now generate another pre-upgrade report after rebooting. Verify that this inhibitor finding has disappeared with the new report. + +### Step 4 - Learn More About Developing Leapp Custom Actors + +The gritty details of developing Leapp custom actors are beyond the scope of this workshop. Here are some resources you can check out to learn more on your own: + + - [Developer Documentation for Leapp](https://leapp.readthedocs.io/en/latest/): this documentation covers the internal workflow architecture of the Leapp framework and how to develop and test your own custom actors. + + - [Leapp Dashboard](https://oamg.github.io/leapp-dashboard/#/): dig around here to make sure the custom actor functionality you are considering doesn't already exist in the mainstream Leapp framework. + + - [oamg/leapp-supplements](https://github.com/oamg/leapp-supplements): GitHub repo where you can find example custom actors and contribute your own. It also has the `Makefile` for custom actor RPM packaging. + +## Conclusion + +In this exercise, we learned that custom modules can be Leapp custom actors or simply custom tasks added to your upgrade playbook. We demonstrated installing an RPM package that provides an example custom actor with additional pre-upgrade checks and generated a new pre-upgrade report to see it in action. + +--- + +**Navigation** + +[Previous Exercise](../1.4-remediate/README.md) - [Next Exercise](../1.6-my-pet-app/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/1.5-custom-modules/images/reboot_hygiene.svg b/exercises/ansible_ripu/1.5-custom-modules/images/reboot_hygiene.svg new file mode 100644 index 000000000..23cf9de13 --- /dev/null +++ b/exercises/ansible_ripu/1.5-custom-modules/images/reboot_hygiene.svg @@ -0,0 +1,4629 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.5-custom-modules/images/reboot_hygiene_finding.svg b/exercises/ansible_ripu/1.5-custom-modules/images/reboot_hygiene_finding.svg new file mode 100644 index 000000000..f1ff51f6d --- /dev/null +++ b/exercises/ansible_ripu/1.5-custom-modules/images/reboot_hygiene_finding.svg @@ -0,0 +1,1546 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.6-my-pet-app/README.md b/exercises/ansible_ripu/1.6-my-pet-app/README.md new file mode 100644 index 000000000..44f22e705 --- /dev/null +++ b/exercises/ansible_ripu/1.6-my-pet-app/README.md @@ -0,0 +1,233 @@ +# Workshop Exercise - Deploy a Pet App + +## Table of Contents + +- [Workshop Exercise - Deploy a Pet App](#workshop-exercise---deploy-a-pet-app) + - [Table of Contents](#table-of-contents) + - [Optional Exercise](#optional-exercise) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - The Traditional Application Lifecycle](#step-1---the-traditional-application-lifecycle) + - [Step 2 - Installing Our Beloved Pet Application](#step-2---installing-our-beloved-pet-application) + - [Step 3 - Test the Pet Application](#step-3---test-the-pet-application) + - [Step 4 - Configure the Application to Start on Reboot](#step-4---configure-the-application-to-start-on-reboot) + - [Step 5 - Run Another Pre-upgrade Report](#step-5---run-another-pre-upgrade-report) + - [Conclusion](#conclusion) + +## Optional Exercise + +This is an optional exercise. It is not required to successfully complete the workshop, but we recommended trying it if time allows. 
Review the objectives listed in the next section to decide if you want to do this exercise or if you would rather skip ahead to the next exercise: + +* [Exercise 2.1 - Run OS Upgrade Jobs](../2.1-upgrade/README.md) + +## Objectives + +* Discuss how applications are deployed and maintained in traditional server environments +* Install our example pet application or bring your own +* Consider how to test if your application is functioning as expected + +## Guide + +### Step 1 - The Traditional Application Lifecycle + +Let's take a step back and think about why we want to do an in-place upgrade. Wouldn't it be best practice to deploy a new server or VM instance with the new RHEL version and then do a fresh install of our application from there? + +- Yes, but... + + - What if the app team doesn't have automation to deploy their apps and they instead manually install and configure everything? + + - What if, in the years since they installed their app, they have been making changes to their app environment to solve issues or cope with changing business requirements? + + - What if they have lost track of all that accumulated drift and technical debt such that it would be very difficult for them to start fresh? + +- Unfortunately, this is the position many app teams find themselves in. Their traditional app server has been tenderly cared for like a beloved pet for years. The idea of throwing it out and redoing everything from scratch is unthinkable. + +- If they can just move to the new RHEL version without having to touch their application, that is a very compelling alternative. That is why they want to do an in-place upgrade, so they can disappear off the corporate platform lifecycle compliance report without having to suffer the headache of manually installing and reconfiguring everything all over again. + +- In this optional exercise, we want to install an application so that we can assess if the RHEL in-place upgrade causes any impact. We want to see if the app still functions as expected under the new RHEL version after the upgrade. + +### Step 2 - Installing Our Beloved Pet Application + +In this step, we are going to install an example application. We are going to install it the old-fashioned way: manually, by following a traditional written procedure of confusing and potentially error-prone command line steps. After all, if our app deployment was automated end-to-end, we wouldn't need to upgrade in-place. + +You may want to install a different application, for example, an actual application from your enterprise environment that you would like to test for potential impacts. Feel free to skip the procedure below and make your own adventure. Just take care to test your app both before and after the upgrade. + +- Our example application will be the [Spring Pet Clinic Sample Application](https://github.com/spring-projects/spring-petclinic) written in Java. It is a Spring Boot application that gets built using Maven. It will connect to a MySQL database which gets populated at startup with sample data. + + > **Note** + > + > If you want to take a shortcut as an alternative to manually running all the commands in the procedure below, you may use this single command to run a bash script that will run the same commands to install and start the example application: + > + > ``` + > curl -s https://people.redhat.com/bmader/petapp.sh | bash + > ``` + > + > If the script is successful, you may skip ahead to [Step 3 - Test the Pet Application](#step-3---test-the-pet-application).
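If you would rather let Ansible run even that shortcut for you, a play like the one below could execute the same script across a whole inventory group. This is only a minimal sketch and is not one of the workshop job templates; the `rhel7` group name is just an illustration and any inventory group of pet app servers would do:

```yaml
---
# Minimal sketch (not part of the workshop job templates): run the
# petapp.sh shortcut script on every host in an inventory group.
- name: Install the Spring Pet Clinic sample app on the pet app servers
  hosts: rhel7              # assumption: substitute your own inventory group
  become: false             # the script refuses to run as root
  tasks:
    - name: Run the petapp.sh install script as the login user
      ansible.builtin.shell: |
        set -o pipefail
        curl -s https://people.redhat.com/bmader/petapp.sh | bash
      args:
        executable: /bin/bash
      register: petapp_install
      changed_when: petapp_install.rc == 0
```

The rest of this exercise assumes you are doing the install by hand, so feel free to ignore this sketch and carry on with the manual steps below.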
+ +- The first step in our app install procedure is to install a Java JDK. We'll use a 3rd-party one just for fun! + + > **Warning** + > + > All commands should be run as the ec2-user, not as root. Commands that require root will use `sudo`. + + Login to your pet app server and run these commands: + + ``` + distver=$(sed -r 's/([^:]*:){4}//;s/(.).*/\1/' /etc/system-release-cpe) + sudo yum-config-manager --add-repo=https://packages.adoptium.net/artifactory/rpm/rhel/$distver/x86_64 + sudo yum-config-manager --save --setopt=\*adoptium\*.gpgkey=https://packages.adoptium.net/artifactory/api/gpg/key/public + sudo yum install mariadb mariadb-server temurin-17-jdk + ``` + + Answer `y` to any prompts from the last command. + +- Verify that the temurin-17-jdk package was installed: + + ``` + rpm -q temurin-17-jdk + ``` + + If not, go back and figure out what went wrong. + +- Next, we will install the Spring Pet Clinic Sample Application under the home directory of the ec2-user. Run these commands: + + ``` + cd ~ + git clone https://github.com/spring-projects/spring-petclinic.git + ``` + + You should now see that the application files are installed in the `spring-petclinic` directory. + + +- We need to start the database server and create the database for our application. Use this command to enable and start the database: + + ``` + sudo systemctl enable --now mariadb + ``` + + Now use the `mysql` command line client to connect to the database server. For example: + + ``` + mysql --user root + ``` + + This should bring you to a `MariaDB [(none)]>` prompt. Enter the following SQL commands at this prompt: + + ``` + CREATE DATABASE IF NOT EXISTS petclinic; + ALTER DATABASE petclinic DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; + GRANT ALL PRIVILEGES ON petclinic.* TO 'petclinic'@'localhost' IDENTIFIED BY 'petclinic'; + FLUSH PRIVILEGES; + quit + ``` + +- Now we are ready to start the application web service. Use this command to run it in the background: + + ``` + echo 'cd $HOME/spring-petclinic && ./mvnw spring-boot:run -Dspring-boot.run.profiles=mysql >> $HOME/app.log 2>&1' | at now + ``` + +- The application will take a couple minute to come up the first time. Check the `app.log` file to follow the progress and verify the web service has started successfully: + + ``` + tailf ~/app.log + ``` + + When you see events listed at the bottom of the log output like this example, that means the application is started successfully and ready for testing: + + ``` + o.s.b.a.e.web.EndpointLinksResolver : Exposing 13 endpoint(s) beneath base path '/actuator' + o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path '' + o.s.s.petclinic.PetClinicApplication : Started PetClinicApplication in 6.945 seconds (process running for 7.496) + ``` + + Type Ctrl-C to quit the `tailf` command. + +### Step 3 - Test the Pet Application + +Now that we have installed our application and verified it is running, it's time to test how it works. + +- Use this command to determine the application's external URL: + + ``` + echo "http://$(curl -s ifconfig.me):8080" + ``` + +- Open a new web browser tab. Cut and paste the URL that was output by the command above. This should open the application web user interface. If the application is working correctly, you should see something like this: + + ![Pet Clinic application home page](images/petapp_home.svg) + +- Try the different function tabs at the top of the web user interface. 
For example, navigate to "FIND OWNERS" and search for Davis. Click on one of the owner records to see their details. + + Use the "Edit Owner" and "Add New Pet" buttons to make changes and add new records. + + The "ERROR" tab in the top-right corner is an error-handling test function. If you click it, the expected result is that a "Something happened..." message will be displayed and a runtime exception and stack trace will be logged in the `app.log` file. + +- Play with the application until you feel comfortable you understand its expected behavior. After the upgrade, we will test it again to verify it has not been impacted. + +### Step 4 - Configure the Application to Start on Reboot + +So far, our application has only been started manually. We need to configure the app so it will start up automatically when our server is rebooted. + +- Use this command to configure a reboot cron entry so the application will be started automatically after every reboot: + + ``` + echo '@reboot cd $HOME/spring-petclinic && ./mvnw spring-boot:run -Dspring-boot.run.profiles=mysql >> $HOME/app.log 2>&1' | crontab - + ``` + +- Now reboot the server to verify this works: + + ``` + sudo reboot + ``` + +- Try refreshing the web browser tab where you have the Pet Clinic web app open. While the server is rebooting, you may see a timeout or connection refused error. After the reboot is finished, the web app should be working again. + + > **Note** + > + > Because the external IP addresses of the EC2 instances provisioned for the workshop are dynamically assigned (i.e., using DHCP), it is possible that the web user interface URL might change after a reboot. If you are unable to access the app after the reboot, run this command again to determine the new URL for the application web user interface: + > + > ``` + > echo "http://$(curl -s ifconfig.me):8080" + > ``` + > + + FIXME: Shame on us for not using DNS! + +### Step 5 - Run Another Pre-upgrade Report + +Whenever changes are made to a server, it's a good idea to rerun the Leapp pre-upgrade report to make sure those changes have not introduced any new risk findings. + +- Launch the "AUTO / 01 Analysis" job template to generate a fresh pre-upgrade report. After the job finishes, review the report to see if there are any new findings. Refer to the steps in the previous exercises if you don't have them memorized by heart already. + +- Did you notice that this high risk finding has popped up now? + + ![Packages not signed by Red Hat found on the system high risk finding](images/packages_not_signed_by_rh.svg) + +- If we open the finding, we are presented with the following details: + + ![Packages not signed by Red Hat details view](images/packages_not_signed_details.svg) + +- "Packages not signed by Red Hat" is just a fancy way of referring to 3rd-party packages and/or packages built in-house by your app teams. In the case of this finding, the package that has been identified is `temurin-17-jdk`, the 3rd-party JDK runtime package we installed. The finding is warning that there is a risk of this package being removed during the upgrade if there are unresolvable dependencies. + + There is only one surefire way to know if the package will be removed or not. Let's run the upgrade and see what happens! + +## Conclusion + +In this exercise, we discussed the sorry state of traditional application maintenance, untracked drift and technical debt. We installed a 3rd-party Java runtime and then installed the Pet Clinic application on top of that.
We made certain that our app is functioning as expected, but we also discovered a new "high risk" finding on our pre-upgrade report. + +Congratulations on completing all the exercises in the first section of the workshop. It's time now to upgrade RHEL and see if there will be any application impact. + +--- + +**Navigation** + +[Previous Exercise](../1.5-custom-modules/README.md) - [Next Exercise](../2.1-upgrade/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/1.6-my-pet-app/images/build_success.svg b/exercises/ansible_ripu/1.6-my-pet-app/images/build_success.svg new file mode 100644 index 000000000..03dd84d90 --- /dev/null +++ b/exercises/ansible_ripu/1.6-my-pet-app/images/build_success.svg @@ -0,0 +1,880 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.6-my-pet-app/images/packages_not_signed_by_rh.svg b/exercises/ansible_ripu/1.6-my-pet-app/images/packages_not_signed_by_rh.svg new file mode 100644 index 000000000..e5adbab76 --- /dev/null +++ b/exercises/ansible_ripu/1.6-my-pet-app/images/packages_not_signed_by_rh.svg @@ -0,0 +1,726 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.6-my-pet-app/images/packages_not_signed_details.svg b/exercises/ansible_ripu/1.6-my-pet-app/images/packages_not_signed_details.svg new file mode 100644 index 000000000..6450b5e91 --- /dev/null +++ b/exercises/ansible_ripu/1.6-my-pet-app/images/packages_not_signed_details.svg @@ -0,0 +1,1066 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.6-my-pet-app/images/petapp_home.svg b/exercises/ansible_ripu/1.6-my-pet-app/images/petapp_home.svg new file mode 100644 index 000000000..bfcf7a92f --- /dev/null +++ b/exercises/ansible_ripu/1.6-my-pet-app/images/petapp_home.svg @@ -0,0 +1,6329 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/1.6-my-pet-app/petapp.sh b/exercises/ansible_ripu/1.6-my-pet-app/petapp.sh new file mode 100755 index 000000000..ad5331554 --- /dev/null +++ b/exercises/ansible_ripu/1.6-my-pet-app/petapp.sh @@ -0,0 +1,85 @@ +#!/bin/bash +# +# Script to install Spring Pet Clinic on a RHEL pet app server +# +# Usage: curl -s https://people.redhat.com/bmader/petapp.sh | bash +# + +# Don't run as root! +if [[ $USER == root ]]; then + echo 'This script must run by non-root user.' + exit 1 +fi + +# Install 3rd-party JDK runtime +distver=$(sed -r 's/([^:]*:){4}//;s/(.).*/\1/' /etc/system-release-cpe) +sudo yum-config-manager --add-repo=https://packages.adoptium.net/artifactory/rpm/rhel/$distver/x86_64 +sudo yum-config-manager --save --setopt=\*adoptium\*.gpgkey=https://packages.adoptium.net/artifactory/api/gpg/key/public +sudo yum -y install mariadb mariadb-server temurin-17-jdk + +# Verify JDK runtime installed +rpm -q temurin-17-jdk +if [[ $? -ne 0 ]]; then + echo 'Install of JDK failed! Try installing manually.' + exit 1 +fi + +# Download the app +cd $HOME +rm -rf spring-petclinic app.log .m2 .config/jgit +git clone https://github.com/spring-projects/spring-petclinic.git + +# Verify app download +if [[ ! -x spring-petclinic/mvnw ]]; then + echo 'App download failed! Review output above for clues.' + exit 1 +fi + +# Open firewall rules +if [[ -x /usr/bin/firewall-cmd ]]; then + sudo firewall-cmd --add-port=8080/tcp + sudo firewall-cmd --add-port=8080/tcp --permanent +fi + +# Set up the database +sudo systemctl enable --now mariadb +mysql --user root < /dev/null +echo 'Starting the app web service. Please wait patiently...' 
+echo 'cd $HOME/spring-petclinic && ./mvnw spring-boot:run -Dspring-boot.run.profiles=mysql >> $HOME/app.log 2>&1' | at now + +# Verify app started +retry=300 +touch app.log +until grep -q 'Started PetClinic' app.log; do + sleep 1 + if [[ $((retry--)) -le 0 ]]; then + echo 'Timeout waiting for app to start. Check ~/app.log for clues.' + exit 1 + fi +done + +# Declare success! +cat < Templates and then open the "AUTO / 02 Upgrade" job template. Here is what it looks like: + + ![AAP Web UI showing the upgrade job template details view](images/upgrade_template.svg) + +- Click the "Launch" button which will bring up a the prompts for submitting the job starting with the variables prompt: + + ![Upgrade job variables prompt on AAP Web UI](images/upgrade_vars_prompt.svg) + +- We don't need to change the variables settings, so just click the "Next" button to move on. + + ![Upgrade job survey prompt on AAP Web UI](images/upgrade_survey_prompt.svg) + +- Next we see the job template survey prompt asking us to select an inventory group. We are going to upgrade all our pet app hosts using a single job, so choose the "ALL_rhel" option and click the "Next" button. This will bring you to a preview of the selected job options and variable settings. + + ![Upgrade job preview on AAP Web UI](images/upgrade_preview.svg) + +- If you are satisfied with the job preview, use the "Launch" button to start the job. + +### Step 2 - Learn More About Leapp + +After launching the upgrade job, the AAP Web UI will navigate automatically to the workflow job output page of the job we just started. This job will take up to 20 minutes to finish, so let's take this time to learn a little more about how the Leapp framework upgrades your OS to next RHEL major version. + +- Keep in mind that the Leapp framework is responsible only for upgrading the RHEL OS packages. Additional tasks required for upgrading your standard agents, tools, middleware, etc., need to be included in the upgrade playbooks you develop to deal with the specific requirements of your organization's environment. + +- The Leapp framework performs the RHEL in-place upgrade by following a sequence of phases. These phases are represented in the following diagram: + + ![Leapp upgrade flow diagram](images/inplace-upgrade-workflow-gbg.svg) + +- The steps of the RHEL in-place upgrade are implemented in modules known as Leapp actors. The Leapp framework is message-driven. The execution of actors is dependent on the data produced by other actors running before them. Actors running in the early phases scan the system to produce messages that add findings to the pre-upgrade report as well as messages that later actors use to make decisions or apply changes during the upgrade. + +- Each phase includes three defined stages: before, main, and after. Before and after stages are used to further refine when an actor will be run in relation to any other actors in the phase. Actors are tagged to define the phase and stage during which they are to run. + +- There are three groups of phases: Old System, Interim System, and New System. Phases under the Old System group run under the existing RHEL installed version. The Interim System phases starts after the InitRamStart phase reboots the host to an upgrade initramfs environment under which the network and other services are not started. It is at this time that all RHEL packages can be upgraded. Once all the packages are upgraded, another reboot brings the host up under the new RHEL major version and the FirstBoot phase starts. 
This final phase runs a few post-upgrade actors that require network access and then the upgrade is done. + +- Being aware of these phases helps if you need to troubleshoot an issue during the Leapp upgrade and is especially important if you are planning to develop any Leapp custom actors. You can learn more about the Leapp framework architecture and internal design by reading the upstream [Leapp Developer Documentation](https://leapp.readthedocs.io/en/latest/index.html). + +## Conclusion + +In this exercise, we launched a workflow job template to create snapshots and start the upgrades of our pet app servers. We learned more about the Leapp framework to better understand what is happening as the RHEL OS is being upgraded. + +In the next exercise, we'll learn more about how snapshots work. + +--- + +**Navigation** + +[Previous Exercise](../1.6-my-pet-app/README.md) - [Next Exercise](../2.2-snapshots/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/2.1-upgrade/images/inplace-upgrade-workflow-gbg.svg b/exercises/ansible_ripu/2.1-upgrade/images/inplace-upgrade-workflow-gbg.svg new file mode 100644 index 000000000..59072e2b4 --- /dev/null +++ b/exercises/ansible_ripu/2.1-upgrade/images/inplace-upgrade-workflow-gbg.svg @@ -0,0 +1,2585 @@ + + + + + + + + + + + + + image/svg+xml + + + + + + + + + Main Phases + + Leapp Upgrade Flow + + + Pre-FactsCollection + + + + Facts Collection + + + + Post-FactsCollection + + + + Pre-Checks + + + + Checks + + + + Post-Checks + + + + + + + + + + + + + + + + + + + + + + + + + + + Start + + + + Pre-Report + + + + Report + + + + Post-Report + + + + Pre-Download + + + + Download + + + + Post-Download + + + + Pre-InterimPreparation + + + + InterimPreparation + + + + Post-InterimPreparation + + + + Pre-InitramReboot + + + + Initram Start + + + + Post-InitramStart + + + + Pre-LateTests + + + + Late Tests + + + + Post-LateTests + + + + PrePreparation + + + + Preparation + + + + PostPreparation + + + + Pre-RPMUpgrade + + + + RPM Upgrade + + + + Post-RPMUpgrade + + + + PreApplications + + + + Applications + + + + PostApplications + + + + Pre-Third PartyApplications + + + + Third PartyApplications + + + + Post-Third PartyApplications + + + + PreFinalization + + + + Finalization + + + + PostFinalization + + + + PreReboot + + + + Reboot + + + + PostReboot + + + + Pre-FirstBoot + + + + First Boot + + + + Post-FirstBoot + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Report + + User Interaction + + + + + + + + + + + + + + + + + + + New System + Interim System (Initrd) + Old System + + + + + + + + + Pre-Target TransactionFactsCollection + + + + Target TransactionFacts Collection + + + + Post-Target TransactionFacts Collection + + + + + + + + + + + + + + + + + Pre-Target TransactionCheck + + + Target TransactionCheck + + + Post-Target TransactionCheck + + + + + + + + + + + + + + + + diff --git a/exercises/ansible_ripu/2.1-upgrade/images/ripu-workflow-hl-upgrade.svg b/exercises/ansible_ripu/2.1-upgrade/images/ripu-workflow-hl-upgrade.svg new file mode 100644 index 000000000..954e14f7c --- /dev/null +++ b/exercises/ansible_ripu/2.1-upgrade/images/ripu-workflow-hl-upgrade.svg @@ -0,0 +1,780 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit 
PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/2.1-upgrade/images/upgrade_preview.svg b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_preview.svg new file mode 100644 index 000000000..408b716f5 --- /dev/null +++ b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_preview.svg @@ -0,0 +1,1882 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.1-upgrade/images/upgrade_survey_prompt.svg b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_survey_prompt.svg new file mode 100644 index 000000000..b96efa2ad --- /dev/null +++ b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_survey_prompt.svg @@ -0,0 +1,1011 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.1-upgrade/images/upgrade_template.svg b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_template.svg new file mode 100644 index 000000000..b5386b0f6 --- /dev/null +++ b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_template.svg @@ -0,0 +1,3332 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.1-upgrade/images/upgrade_vars_prompt.svg b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_vars_prompt.svg new file mode 100644 index 000000000..fb566968b --- /dev/null +++ b/exercises/ansible_ripu/2.1-upgrade/images/upgrade_vars_prompt.svg @@ -0,0 +1,1122 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.2-snapshots/README.md b/exercises/ansible_ripu/2.2-snapshots/README.md new file mode 100644 index 000000000..a49b54738 --- /dev/null +++ b/exercises/ansible_ripu/2.2-snapshots/README.md @@ -0,0 +1,172 @@ +# Workshop Exercise - Let's Talk About Snapshots + +## Table of Contents + +- [Workshop Exercise - Let's Talk About Snapshots](#workshop-exercise---lets-talk-about-snapshots) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - What are Snapshots and What are They Not](#step-1---what-are-snapshots-and-what-are-they-not) + - [Step 2 - Assessing Different Snapshot Solutions](#step-2---assessing-different-snapshot-solutions) + - [LVM](#lvm) + - [VMware](#vmware) + - [Amazon EBS](#amazon-ebs) + - [Break Mirror](#break-mirror) + - [ReaR](#rear) + - [Step 3 - Snapshot Scope](#step-3---snapshot-scope) + - [Step 4 - Choosing the Best Snapshot Solution](#step-4---choosing-the-best-snapshot-solution) + - [Conclusion](#conclusion) + +## Objectives + +* Understand the difference between backups and snapshots +* Learn about some of the different ways of doing snapshots +* Be prepared for the challenges and barriers you may encounter automating snapshots +* Consider the appropriate snapshot scope for your organization + +## Guide + +In the previous exercise, we launched the automation to start the RHEL in-place upgrades of our pet application servers. The first step of the upgrade workflow template is to create a snapshot for each RHEL instance being upgraded. If something goes wrong with an upgrade, the snapshot makes it possible to quickly undo the upgrade. + +Automating snapshots can be one of the most difficult features of the RHEL in-place upgrade solution approach. In this exercise, we will explore some of the challenges that enterprises face and look at strategies for overcoming them. + +Let's start by defining exactly what we mean when we talk about snapshots. 
+ +### Step 1 - What are Snapshots and What are They Not + +Most organizations with a mature traditional computing environment will have standards and tools implemented for doing backups. Typically, backups will be performed on a periodic schedule. More critical or more dynamic data might be backed up more frequently than mostly static data. There is often a strategy where full backups are performed only occasionally and incremental backups are used to save changed files more often. + +The reason for doing backups is to be able to recover data that has been lost for any reason. If data is corrupted because of an operations issue or software defect or accidentally deleted, backups make it easy to turn the clock back and restore the lost data. + +But when an entire server is lost, using backups to recover is more difficult because a new operating system must first be installed before anything can be restored from the backup. The data can be spread out across a full backup as well as multiple incremental backups, further increasing the time for a full server recovery. Most organizations only use their backup solution to restore individual files or directories, but they are not as prepared to recover everything on a server. Even if they are, such a recovery will take a long time. + +Snapshots are different in that they do not back up and restore individual files. Instead, snapshots operate at the storage device level, instantly saving the contents of an entire logical volume or virtual disk. Unlike backups, snapshots do not make a copy of the data being backed up, but rather mark a point in time after which any modified data is copied going forward. For this reason, the underlying technique used for snapshots is often referred to as "copy-on-write" or COW. + +While COW snapshots are not a substitute for traditional full and incremental backups, they do offer a number of advantages. The most important advantage is that creating and rolling back snapshots happens almost instantaneously as compared to hours or longer for traditional backup and restore. It is this ability to quickly take an entire server back in time that makes snapshots ideal for reducing the risk of performing RHEL in-place upgrades. + +### Step 2 - Assessing Different Snapshot Solutions + +There are a number of different types of snapshot solutions you may choose from. Each has its own benefits and drawbacks as summarized in the table below: + +| Snapshot type | Works with | Benefits | Drawbacks | +| ------------- | ---------- | -------- | --------- | +| LVM | Any RHEL host with free space in the OS volume group | Simple to automate; no dependency on external infrastructure | Snapshot sizing must be tested and free space monitored | +| VMware | VMware VMs | Independent of the guest OS; protects even an unbootable system | Requires vSphere API access; vendor discourages long-lived snapshots | +| Amazon EBS | AWS EC2 instances | Independent of the guest OS; can snapshot the OS volume only | Automation must map inventory hosts to EC2 instances and volumes | +| Break Mirror | Bare metal servers with hardware RAID mirroring | Near-instant rollback without extra snapshot storage | Significant development and testing effort; Redfish APIs vary by vendor | +| ReaR | Any RHEL host | Included with RHEL; full-server restore from a bootable ISO | Recovery is fast, but not instantaneous like a snapshot rollback | + +The following sections explain the pros and cons in detail. + +#### LVM + +The Logical Volume Manager (LVM) is a set of tools included in RHEL that provide a way to create and manage virtual block devices known as logical volumes. LVM logical volumes are typically used as the block devices from which RHEL OS filesystems are mounted. The LVM tools support creating and rolling back logical volume snapshots. Automating these actions from an Ansible playbook is relatively simple. + +Logical volumes are contained in a storage pool known as a volume group. The storage available in a volume group comes from one or more physical volumes, that is, block devices underlying actual disks or disk partitions. Typically, the logical volumes where the RHEL OS is installed will be in a "rootvg" volume group.
If best practices are followed, logical volumes for applications and app data will be isolated in a separate volume group, "appvg" for example. + +To create logical volume snapshots, there must be free space in the volume group. That is, the total size of the logical volumes in the volume group must be less than the total size of the volume group. The `vgs` command can be used query volume group free space. For example: + +``` +# vgs + VG #PV #LV #SN Attr VSize VFree + rootvg 1 3 0 wz--n- 950.06g 422.06g +``` + +In the example above, the rootvg volume group total size is about 950 Gb and there is about 422 Gb of free space in the volume group. There is plenty of free space to allow for creating snapshot volumes in this volume group. + +If there is not enough free space in the volume group, there are a few ways we can make space available: + +- Adding another physical volume to the volume group (i.e., `pvcreate` and `vgextend`). For a VM, you would first configure an additional virtual disk. +- Temporarily remove a logical volume you don't need. For example, on bare metal servers, there is often a large /var/crash empty filesystem. Removing this filesystem from `/etc/fstab` and then using `lvremove` to remove the logical volume from which it was mounted will free up space in the volume group. +- Reducing the size of one or more logical volumes. This is tricky because first the filesystem in the logical volume needs to be shrunk. XFS filesystems do not support shrinking. EXT filesystems do support shrinking, but not while the filesystem is mounted. This option can be difficult and should only be considered as a last resort and trusted to a very experienced Linux admin. + +After a snapshot is created, COW data will start to utilize the free space of the snapshot logical volume as blocks are written to the origin logical volume. Unless the snapshot is create with the same size as the origin, there is a chance that the snapshot could fill up and become invalid. Testing should be performed during the development of the LVM snapshot automation to determine snapshot sizings with enough cushion to prevent this. The `snapshot_autoextend_percent` and `snapshot_autoextend_threshold` settings in lvm.conf can also be used to reduce the risk of snapshots running out of space. + +Unless you have the luxury of creating snapshots with the same size as their origin volumes, LVM snapshot sizing needs to be thoroughly tested and free space usage carefully monitored. However, if that challenge can be met, LVM snapshots offer a reliable snapshot solution without the headache of depending on external infrastructure such as VMware. + +#### VMware + +A VMware snapshot preserves the state and data of a VM at a specific point in time. Because VMware snapshots operate at the hypervisor level, they are completely independent of the guest OS. This makes them foolproof to anything that can go wrong during a RHEL upgrade. Even if an upgrade fails so badly that the OS can't even be booted up again, reverting the VMware snapshot will still save the day. For these reasons, VMware snapshots appear to be a very compelling snapshot option. + +VMware snapshots can be manually created and reverted using the vSphere management UI. To create or revert a VMware snapshot automatically from an Ansible playbook, access permissions to the required vSphere API calls must be authorized for the AAP control node. + +In our experience, having this access granted can be extremely challenging. 
The team that controls the VMware environment in most organizations is deeply invested in the "ClickOps" model of doing everything manually using the vSphere management UI. They may also be hesitant to trust that automation developed outside of their team can be trusted to perform the operations they would do manually to create a VMware snapshot, including checking for sufficient free space in the VMFS data store where the snapshot will be created. + +The VMware team may resist supporting snapshots because of limited storage space. While standard VMDK files are fixed in size, COW snapshots will grow over time and require careful monitoring with data stores in VMware environments often running tight on capacity. + +Another justification for pushing back on supporting automated snapshots will be the VMware vendor recommendation that snapshots should never be used for more than 72 hours (see KB article [Best practices for using VMware snapshots in the vSphere environment](https://kb.vmware.com/s/article/1025279)). Unfortunately, app teams usually need more than 3 days of soak time before they are comfortable that no impact to their apps has resulted from a RHEL upgrade. + +VMware snapshots work great when they can be automated. If you are considering this option, engage early with the team that controls the VMware environment for your organization and be prepared for potential resistance. + +#### Amazon EBS + +Amazon Elastic Block Store (Amazon EBS) provides the block storage volumes used for the virtual disks attached to AWS EC2 instances. When a snapshot is created for an EBS volume, the COW data is written to Amazon S3 object storage. + +> **Note** +> +> The snapshot and rollback automation capability implemented for our workshop lab environment uses EBS snapshots. + +While EBS snapshots operate independently from the guest OS running on the EC2 instance, the similarity to VMware snapshots ends there. An EBS snapshot saves the data of the source EBS volume, but does not save the state or memory of the EC2 instance to which the volume is attached. Also unlike with VMware, EBS snapshots can be created for an OS volume only while leaving any separate application volumes as is. + +Automating EBS snapshot creation and rollback is fairly straightforward assuming your playbooks can access the required AWS APIs. The tricky bit of the automation is identifying the EC2 instance and attached EBS volume that corresponds to the target host in the Ansible inventory managed by AAP. For the snapshot automation we implemented for our workshop lab environment, we solved this by setting tags on our EC2 instances. + +#### Break Mirror + +This method is an alternative to LVM that can be used with bare metal servers where the root disk is on a hardware RAID mirror set. Technically speaking, it is not a snapshot, but it still provides a near instantaneous rollback capability. + +Instead of creating a snapshot just before starting the upgrade, the automation reconfigures the RAID controller to break the mirror set of the root disk so then it's just two JBOD disks. One of the JBOD disks is used going forward with the upgrade while the other is left untouched. To perform a rollback, the mirror set is reconstructed from the untouched JBOD. + +Most bare metal servers support out-of-band management and those manufactured in the last decade will support APIs based on the [Redfish](https://www.dmtf.org/standards/redfish) standard. 
These APIs can be used by automation to break and reconstruct the mirror set, but be prepared for a significant development and testing effort because the API implementations are not always the same across different vendors and server models. + +#### ReaR + +ReaR (Relax and Recover) is a backup and recovery tool that is included with RHEL. ReaR doesn't use snapshots, but it does make it very easy to perform a full backup and restore of your RHEL server. When taking a full backup, ReaR creates a bootable ISO image with the current state of the server. To use a ReaR backup to revert an in-place upgrade, we simply boot the server from the ISO image and then choose the "Automatic Recover" option from the menu. + +While ReaR backup and recovery is not instantaneous like rolling back a snapshot, it is remarkably fast compared to recovery tools that require you to perform a fresh OS install and then manually recover at a file level. + +Read the article [ReaR: Backup and recover your Linux server with confidence](https://www.redhat.com/sysadmin/rear-backup-and-recover) to learn more. + +### Step 3 - Snapshot Scope + +The best practice for allocating the local storage of a RHEL servers is to configure volumes that separate the OS from the apps and app data. For example, the OS filesystems would be under a "rootvg" volume group while the apps and app data would be in an "appvg" volume group. This separation helps isolate the storage usage requirements of these two groups so they can be manged based on their individual requirements and are less likely to impact each other. For example, the backup profile for the OS is likely different than for the apps and app data. + +This practice helps to enforce a key tenet of the RHEL in-place upgrade approach: that is that the OS upgrade should leave the applications untouched with the expectation that system library forward compatibility and middleware runtime abstraction reduces the risk of the RHEL upgrade impacting app functionality. + +With these concepts in mind, let's consider if we want to include the apps and app data in what gets rolled back if we need to revert the RHEL upgrade: + +| Snapshot scope | Benefits | Drawbacks | +| -------------- | -------- | --------- | +| OS only ||| +| OS and apps/data ||| + +When snapshots only include the upgraded OS volumes, the best practice of isolating OS changes from app changes is followed. In this case, it is important to resist the temptation to make some heroic app changes in an attempt to avoid rolling back in the face of application impact after a RHEL upgrade. For the sake of safety and soundness, gather the evidence required to help understand what caused any app impact, but then do a rollback. Don't make any app changes that could be difficult to untangle after rolling back the OS. + +Unfortunately, a VMware snapshot saves the full state of a VM instance including all virtual disks irrespective of whether they contain OS or app data. This can prove challenging for a couple reasons. First, more storage space will be required for the snapshots and it is more difficult to anticipate how much snapshot growth will result because of app data activity. The other problem is that rolling back app data may result in the app state becoming out of sync with external systems leading to unpredictable issues. When rolling back app data for any reason, be aware of the potential headaches that may result. 
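To make the OS-only scope more concrete, here is a minimal sketch of what pre-upgrade snapshot tasks might look like if you were using LVM snapshots limited to the OS volume group. The volume group and logical volume names (`rootvg`, `root`, `var`) and the snapshot size are assumptions for illustration only and would need to be adjusted and tested against your own standard build:

```yaml
# Minimal sketch: create pre-upgrade LVM snapshots for the OS logical
# volumes only, leaving any appvg volumes (and the app data on them)
# untouched so a rollback never rewinds the application.
- name: Check free space available in rootvg
  ansible.builtin.command: vgs --noheadings --units g -o vg_free rootvg
  register: rootvg_free
  changed_when: false
  # registered for illustration; a real role would assert there is
  # enough free space before creating any snapshots

- name: Create a snapshot of each OS logical volume
  community.general.lvol:
    vg: rootvg
    lv: "{{ item }}"
    snapshot: "{{ item }}_preupgrade"
    size: 10g               # assumption: size with enough COW cushion per your testing
  loop:
    - root
    - var
```

Rolling back would then be a matter of merging the snapshots back into their origin volumes (for example, with `lvconvert --merge`) and rebooting, while committing to the upgrade simply removes the snapshot volumes.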
+ +### Step 4 - Choosing the Best Snapshot Solution + +There are a number of factors you should consider when deciding which method of snapshot/rollback will work best in your environment. + +- What is your mix of bare metal servers versus VMware or cloud instances? +- Where can free space most readily be made available? +- Can you get unfettered access to your VMware inventory and vSphere APIs? +- What is the appropriate snapshot scope for your organization? +- Which snapshot solution can you most easily make fully automated? + +Consider a belt-and-suspenders approach, that is, offer support for more than one method. Maybe it makes sense to recommend one method for bare metal and another for VMs. + +Whatever your decision, remember that an effective snapshot/rollback capability integrated with your end-to-end automation is the most important feature of any RHEL in-place upgrade solution. + +## Conclusion + +In this exercise, we learned about the pros and cons of a number of different methods of achieving an automated snapshot/rollback capability. We also considered the risks of rolling back app data that isn't isolated from OS changes. With this knowledge, you are ready to make more informed decisions when designing your snapshot/rollback automation approach. + +In the next exercise, we'll go back to look at how the RHEL in-place upgrades are progressing on our pet application servers. + +--- + +**Navigation** + +[Previous Exercise](../2.1-upgrade/README.md) - [Next Exercise](../2.3-check-upg/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/2.3-check-upg/README.md b/exercises/ansible_ripu/2.3-check-upg/README.md new file mode 100644 index 000000000..3301183f1 --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/README.md @@ -0,0 +1,99 @@ +# Workshop Exercise - Check if the Upgrades Worked + +## Table of Contents + +- [Workshop Exercise - Check if the Upgrades Worked](#workshop-exercise---check-if-the-upgrades-worked) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - Review the Upgrade Playbook Job Log](#step-1---review-the-upgrade-playbook-job-log) + - [Step 2 - Verify the Hosts are Upgraded to Next RHEL Version](#step-2---verify-the-hosts-are-upgraded-to-next-rhel-version) + - [Conclusion](#conclusion) + +## Objectives + +* Review the upgrade playbook job log +* Verify our pet application servers are running the newer RHEL version + +## Guide + +In the previous exercises, we reviewed pre-upgrade reports and performed some recommended remediations. If you tried the optional exercises, you also learned about custom pre-upgrade checks and installed a sample pet application. After all of that, you finally launched the Ansible playbook jobs to run the RHEL in-place upgrades on your servers. + +It's time to verify the results of the upgrades and let our application teams assess if their pet apps are still good. We are here in our RHEL in-place upgrade automation workflow: + +![Automation approach workflow diagram with app validation steps highlighted](images/ripu-workflow-hl-validate.svg) + +Let's get started! + +### Step 1 - Review the Upgrade Playbook Job Log + +The first thing we want to do is see if the job running the upgrade playbooks has finished successfully. + +- Return to the AAP Web UI tab in your web browser. Navigate to Views > Jobs and then open the "OS / Upgrade" playbook run entry to see the log output from the upgrades.
+ + > **Note** + > + > You will also see an entry for the "AUTO / 02 Upgrade" workflow job. Workflow jobs launch a number of playbook runs. To see the playbook log output, we need to open the playbook run entry, not the workflow job entry. + + For example: + + ![AAP Web UI listing upgrade job entries](images/upgrade_jobs.svg) + +- If the playbook run finished without any failed tasks, you should see "Successful" displayed with a green checkmark. + + > **Note** + > + > If you see "Running" with spinning arrows, the playbook is still running. Wait for the playbook run to finish before moving on with this exercise. + + Scroll down to the end of the log output to see the "PLAY RECAP" indicating the success or failure status for the playbook run executed on each host. Here is what you should expect to see: + + ![AAP Web UI showing successful upgrade playbook run play recap](images/upgrade_play_recap.svg) + + If there are no failed runs, the RHEL in-place upgrade is done on all of our pet app servers. + +### Step 2 - Verify the Hosts are Upgraded to Next RHEL Version + +Now let's make sure our pet app servers are actually upgraded to the next RHEL version. + +- In [Exercise 1.3: Step 2](../1.3-report/README.md#step-2---navigating-the-rhel-web-console), you used the RHEL Web Console to check the installed RHEL versions on your pet app servers. Let's repeat those steps to see the RHEL versions reported after our upgrades. + + Return to your RHEL Web Console browser tab and use the remote host menu to navigate to the web consoles of each of your pet app servers. The RHEL Web Console system overview page should now show the upgraded versions. + + > **Note** + > + > You may need to refresh the browser using Ctrl-R to see the newly reported RHEL version. + + For example, this pet app server that previously had RHEL8 is now reporting RHEL9: + + ![fluent-bee running Red Hat Enterprise Linux 9.2 (Plow)](images/rhel9_upgraded.svg) + + Here is an example of one that was previously RHEL7 and is now running RHEL8: + + ![vocal-hyena running Red Hat Enterprise Linux Server 8.8 (Ootpa)](images/rhel8_upgraded.svg) + +- You can also check the RHEL and kernel versions from the command line following the steps you used in [Exercise 1.1: Step 2](../1.1-setup/README.md#step-2---open-a-terminal-session). + + At the shell prompt of your pet app servers, use the `cat /etc/redhat-release` and `uname -r` commands. Here's an example showing a pet app server that was upgraded to RHEL9: + + ![command output showing RHEL9 is installed](images/rhel9_commands.svg) + +## Conclusion + +In this exercise, we observed that the upgrade playbook runs completed successfully. We then used the RHEL Web Console and the command line to verify the new RHEL versions were installed.
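If you would rather confirm every host in one pass from a terminal instead of visiting each console, an ad-hoc check along the lines of the sketch below would also work. This is only an aside: it assumes `ansible-core` and an inventory file with SSH access to the app servers are available on some control host, none of which is required for the workshop, and the inventory path and group name here are assumptions.

```bash
# Print the release string and running kernel for every host in the group in one pass
ansible -i inventory.ini ALL_rhel -m ansible.builtin.command -a 'cat /etc/redhat-release'
ansible -i inventory.ini ALL_rhel -m ansible.builtin.command -a 'uname -r'
```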
+ +If you deployed a sample pet application in the previous optional exercise, continue here to verify the pet application is still functioning as expected after the RHEL upgrades: + +- [Exercise 2.4 - How is the Pet App Doing?](../2.4-check-pet-app/README.md) + +Otherwise, you may skip ahead to the next section of the workshop where we will demonstrate rolling back the RHEL upgrade, starting with these exercises: + +- [Exercise 3.1 - (Optional) Trash the Instance](../3.1-rm-rf/README.md) +- [Exercise 3.2 - Run Rollback Job](../3.2-rollback/README.md) + +--- + +**Navigation** + +[Previous Exercise](../2.2-snapshots/README.md) - [Next Exercise](../2.4-check-pet-app/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/2.3-check-upg/images/rhel8_upgraded.svg b/exercises/ansible_ripu/2.3-check-upg/images/rhel8_upgraded.svg new file mode 100644 index 000000000..f6313fc58 --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/images/rhel8_upgraded.svg @@ -0,0 +1,822 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.3-check-upg/images/rhel9_commands.svg b/exercises/ansible_ripu/2.3-check-upg/images/rhel9_commands.svg new file mode 100644 index 000000000..1f3efce83 --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/images/rhel9_commands.svg @@ -0,0 +1,136 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.3-check-upg/images/rhel9_upgraded.svg b/exercises/ansible_ripu/2.3-check-upg/images/rhel9_upgraded.svg new file mode 100644 index 000000000..7356f3d41 --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/images/rhel9_upgraded.svg @@ -0,0 +1,767 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.3-check-upg/images/ripu-workflow-hl-validate.svg b/exercises/ansible_ripu/2.3-check-upg/images/ripu-workflow-hl-validate.svg new file mode 100644 index 000000000..5c7cf1997 --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/images/ripu-workflow-hl-validate.svg @@ -0,0 +1,780 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/2.3-check-upg/images/upgrade_jobs.svg b/exercises/ansible_ripu/2.3-check-upg/images/upgrade_jobs.svg new file mode 100644 index 000000000..f64046b2c --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/images/upgrade_jobs.svg @@ -0,0 +1,4400 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.3-check-upg/images/upgrade_play_recap.svg b/exercises/ansible_ripu/2.3-check-upg/images/upgrade_play_recap.svg new file mode 100644 index 000000000..407530baa --- /dev/null +++ b/exercises/ansible_ripu/2.3-check-upg/images/upgrade_play_recap.svg @@ -0,0 +1,4925 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/2.4-check-pet-app/README.md b/exercises/ansible_ripu/2.4-check-pet-app/README.md new file mode 100644 index 000000000..8993ec4a2 --- /dev/null +++ b/exercises/ansible_ripu/2.4-check-pet-app/README.md @@ -0,0 +1,60 @@ +# Workshop Exercise - How is the Pet App Doing?
+ +## Table of Contents + +- [Workshop Exercise - How is the Pet App Doing?](#workshop-exercise---how-is-the-pet-app-doing) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - Retest our Pet Application](#step-1---retest-our-pet-application) + - [Step 2 - Add More Records to the Database](#step-2---add-more-records-to-the-database) + - [Conclusion](#conclusion) + +## Objectives + +* Confirm our pet app is still functioning as expected after the upgrade +* Add new records to the app database to see what happens when rolling back + +## Guide + +In [Exercise 1.6](../1.6-my-pet-app/README.md), we installed a sample pet application and tested its functionality. Now that we have upgraded the RHEL version of our app server, let's retest to see if there has been any impact. + +### Step 1 - Retest our Pet Application + +It's time to repeat the testing you did for [Step 3](../1.6-my-pet-app/README.md#step-3---test-the-pet-application) in the previous exercise. + +- You should usually be able to access the application web user interface at the same address you used before. If you still have the app open in one of your browser tabs, try refreshing the page. A scripted alternative is sketched at the end of this exercise. + + > **Note** + > + > Because the external IP addresses of the EC2 instances provisioned for the workshop are dynamically assigned (i.e., using DHCP), it is possible that the web user interface URL may change after a reboot. If that happens, run this command at the shell prompt of the app server to determine the new URL for the application web user interface: + > + > ``` + > echo "http://$(curl -s ifconfig.me):8080" + > ``` + +- Did you previously use the "Edit Owner" or "Add New Pet" buttons to change any data or add new records? If so, check to see if that data is still there and displayed correctly. + +- If you observe any changes in your application behavior or the application isn't working at all, troubleshoot the issue to try to narrow down the root cause. Make note of any issues so you can retest after rolling back the upgrade. + +### Step 2 - Add More Records to the Database + +In [Exercise 2.2](../2.2-snapshots/README.md), we considered the potential pitfalls of including app data in the scope of our snapshot. Imagine what would happen if your app at first appeared fine after the upgrade, but an issue was later discovered after the app had been returned to production use. + +- Add a new record to the database. For example, add a new pet record with the name "Post Upgrade" to make it easy to distinguish. Remember this record to see what happens after we revert the OS upgrade in the next section of the workshop. + +- What will be the business impact if data updates are rolled back with the upgrade? That is exactly the problem we will demonstrate next because the pet app servers deployed in this workshop do not have separate volumes to isolate the OS from the app data. + +## Conclusion + +In this exercise, we observed that the RHEL in-place upgrade left our application untouched and we found that it still works as expected after the upgrade. Then we added some new app data to demonstrate what will happen after rolling back the upgrade. + +This concludes the RHEL Upgrade section of the workshop. In the next and final section, we will be rolling back the RHEL upgrade, taking us right back to where we started.
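As mentioned in Step 1, here is a quick scripted smoke test you could run from the app server's shell instead of clicking around the UI. It reuses the URL construction from Step 1; the expectation of an HTTP 200 response is an assumption about this particular sample app.

```bash
# Hypothetical smoke test: confirm the pet app still answers with HTTP 200 after the upgrade
APP_URL="http://$(curl -s ifconfig.me):8080"
STATUS=$(curl -s -o /dev/null -w '%{http_code}' "$APP_URL")
echo "Pet app at ${APP_URL} returned HTTP ${STATUS}"
[ "$STATUS" = "200" ] && echo "Looks healthy" || echo "Needs investigation"
```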
+ +--- + +**Navigation** + +[Previous Exercise](../2.3-check-upg/README.md) - [Next Exercise](../3.1-rm-rf/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/3.1-rm-rf/README.md b/exercises/ansible_ripu/3.1-rm-rf/README.md new file mode 100644 index 000000000..59c7e54d0 --- /dev/null +++ b/exercises/ansible_ripu/3.1-rm-rf/README.md @@ -0,0 +1,124 @@ +# Workshop Exercise - Trash the Instance + +## Table of Contents + +- [Workshop Exercise - Trash the Instance](#workshop-exercise---trash-the-instance) + - [Table of Contents](#table-of-contents) + - [Optional Exercise](#optional-exercise) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - Select a Pet App Server](#step-1---select-a-pet-app-server) + - [Step 2 - Choose your Poison](#step-2---choose-your-poison) + - [Delete everything](#delete-everything) + - [Uninstall glibc](#uninstall-glibc) + - [Break the application](#break-the-application) + - [Wipe the boot record](#wipe-the-boot-record) + - [Fill up your disk](#fill-up-your-disk) + - [Set off a fork bomb](#set-off-a-fork-bomb) + - [Conclusion](#conclusion) + +## Optional Exercise + +This is an optional exercise. It is not required to successfully complete the workshop, but it will help demonstrate the effectiveness of rolling back a RHEL upgrade. Review the objectives listed in the next section to decide if you want to do this exercise or if you would rather skip ahead to the next exercise: + +* [Exercise 3.2 - Run Rollback Job](../3.2-rollback/README.md) + +## Objectives + +* Simulate a failed OS upgrade or application impact +* Demonstrate the scope of rolling back a snapshot + +## Guide + +Have you ever wanted to try doing `rm -rf /*` on a RHEL host just to see what happens? Or maybe you have accidentally done an equally destructive recursive command and already know the consequences. In this exercise, we are going to intentionally mess up one of our pet app servers to demonstrate how rolling back can save the day. + +Let's get started! + +### Step 1 - Select a Pet App Server + +In the next exercise, we will be rolling back the RHEL upgrade on one of our servers. + +- Choose an app server. It can be one of the RHEL7 instances that is now on RHEL8, or one of the RHEL8 instances that was upgraded to RHEL9. + +- Follow the steps you used in [Exercise 1.1: Step 2](../1.1-setup/README.md#step-2---open-a-terminal-session) to open a terminal session on the app server you have chosen to roll back. + +- At the shell prompt, use the `sudo -i` command to switch to the root user. For example: + + ``` + [ec2-user@cute-bedbug ~]$ sudo -i + [root@cute-bedbug ~]# + ``` + + Verify you see a root prompt like the example above. + +### Step 2 - Choose your Poison + +The `rm -rf /*` command appears frequently in the urban folklore about Unix disasters. The command recursively and forcibly tries to delete every directory and file on a system. When it is run with root privileges, this command will quickly break everything on your pet app server and render it unable to reboot ever again. However, there are much less spectacular ways to mess things up. + +Mess up your app server by choosing one of the following suggestions, or dream up your own. + +#### Delete everything + +- As mentioned already, `rm -rf /*` can be fun to try. Expect to see lots of warnings and error messages. Even with root privileges, there will be "permission denied" errors because of read-only objects under pseudo-filesystems like `/proc` and `/sys`. Don't worry, irreparable damage is still being done.
+ + You might be surprised that you will get back to a shell prompt after this. While all files have been deleted from the disk, already running processes like your shell will continue to be able to access any deleted files to which they still have an open file descriptor. Built-in shell commands may even still work, but most commands will result in a "command not found" error. + + If you want to reboot the instance to prove that it will not come back up, you will not be able to use the `reboot` command; however, `echo b > /proc/sysrq-trigger` might work. + +#### Uninstall glibc + +- The command `rpm -e --nodeps glibc` will uninstall the glibc package, removing the standard C library upon which all other libraries depend. The damage done by this command is just as bad as the previous example, but without all the drama. This package also provides the dynamic linker/loader, so now commands will fail with errors like this: + + ``` + [root@cute-bedbug ~]# reboot + -bash: /sbin/reboot: /lib64/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory + ``` + + If you want to do a `reboot` command, use `echo b > /proc/sysrq-trigger` instead. + +#### Break the application + +- In [Exercise 1.6: Step 5](../1.6-my-pet-app/README.md#step-5---run-another-pre-upgrade-report), we observed a pre-upgrade finding warning of a potential risk that our `temurin-17-jdk` 3rd-party JDK runtime package might be removed during the upgrade if it had unresolvable dependencies. Of course, we know this did not happen because our pet app is still working perfectly. + + But what if this package did get removed? Our pet app requires the JDK runtime to function. Without it, our application will be broken. We can simulate this by manually removing the package like this: + + ``` + dnf -y remove temurin-17-jdk + ``` + + Now if you `reboot` the app server, the pet app will not come back up and the following error will be seen at the end of the `~/app.log` file: + + ``` + ... + which: no javac in (/home/ec2-user/.local/bin:/home/ec2-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin) + Error: JAVA_HOME is not defined correctly. + We cannot execute + ``` + + This is a realistic example of application impact that can be reversed by rolling back the upgrade. + +#### Wipe the boot record + +- The `dd if=/dev/zero of=/dev/<root disk> count=1` command (substitute your root disk device, for example `xvda` or `nvme0n1` as listed by `lsblk`) will clobber the master boot record of your instance. It's rather insidious because you will see that everything continues to function perfectly after running this command, but after you do a `reboot` command, the instance will not come back up again. + +#### Fill up your disk + +- Try the `while fallocate -l9M $((i++)); do true; done; yes > $((i++))` command. While there are many ways you can consume all the free space in a filesystem, this command gets it done in just a couple of seconds. Use a `df -h /` command to verify your root filesystem is at 100%. + +#### Set off a fork bomb + +- The shell command `:(){ :|:& };:` will trigger a [fork bomb](https://en.wikipedia.org/wiki/Fork_bomb). When this is done with root privileges, system resources will be quickly exhausted, resulting in the server entering a "hung" state. Use the fork bomb if you want to demonstrate rolling back a server that has become unresponsive. + +## Conclusion + +Congratulations, you have trashed one of your app servers. Wasn't that fun? + +In the next exercise, you will untrash it by rolling back.
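One aside before moving on: if the fork bomb one-liner in Step 2 reads like line noise, it may help to see the same logic written with a readable function name. This is equivalent in effect, so only run it on the lab instance you intend to trash.

```bash
# Equivalent to :(){ :|:& };: -- a function that pipes itself into itself and backgrounds the result
forkbomb() { forkbomb | forkbomb & }
forkbomb
```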
+ +--- + +**Navigation** + +[Previous Exercise](../2.4-check-pet-app/README.md) - [Next Exercise](../3.2-rollback/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/3.2-rollback/README.md b/exercises/ansible_ripu/3.2-rollback/README.md new file mode 100644 index 000000000..56359b3dd --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/README.md @@ -0,0 +1,81 @@ +# Workshop Exercise - Run Rollback Job + +## Table of Contents + +- [Workshop Exercise - Run Rollback Job](#workshop-exercise---run-rollback-job) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - Launch the Rollback Workflow Job Template](#step-1---launch-the-rollback-workflow-job-template) + - [Step 2 - Observe the Rollback Job Output](#step-2---observe-the-rollback-job-output) + - [Step 3 - Check the RHEL Version](#step-3---check-the-rhel-version) + - [Conclusion](#conclusion) + +## Objectives + +* Demonstrate using an Ansible playbook for rolling back a RHEL upgrade +* Verify the RHEL major version is reverted back + +## Guide + +In this exercise, we will demonstrate rolling back one of our pet app servers, just as we would if the RHEL upgrade had failed or if we had found the upgrade caused unexpected impact to the application. + +We are now here in our exploration of the RHEL in-place upgrade automation workflow: + +![Automation approach workflow diagram with rollback playbook highlighted](images/ripu-workflow-hl-rollback.svg) + +After rolling back, the pet app server will be restored to the state it was in just before entering the upgrade phase of the workflow. + +### Step 1 - Launch the Rollback Workflow Job Template + +In this step, we will be rolling back the RHEL in-place upgrade on one of our pet application servers. + +- Return to the AAP Web UI tab in your web browser. Navigate to Resources > Templates and then open the "AUTO / 03 Rollback" job template. Here is what it looks like: + + ![AAP Web UI showing the rollback job template details view](images/rollback_template.svg) + +- Click the "Launch" button, which will bring up the survey prompt. We only want to do a rollback of one server. To do this, choose the "ALL_rhel" option under "Select inventory group" and then enter the hostname of your chosen pet app server under the "Enter server name" prompt. For example: + + ![AAP Web UI showing the rollback job survey prompt](images/rollback_survey.svg) + + Click the "Next" button to proceed. + +- Next you will see the job preview prompt, for example: + + ![AAP Web UI showing the rollback job preview prompt](images/rollback_preview.svg) + + If everything looks good, use the "Launch" button to start the playbook job. + +### Step 2 - Observe the Rollback Job Output + +After launching the rollback playbook job, the AAP Web UI will navigate automatically to the job output page. + +- The automated rollback takes only a few minutes to run. You can monitor the log output as the playbook run progresses. + +- When the job has finished running, scroll to the bottom of the job output. If it finished successfully, you should see "failed=0" status in the job summary like this example: + + ![Rollback job "PLAY RECAP" as seen at the end of the job output](images/rollback_job_recap.svg) + + Notice in the example above that rolling back was done in just under 2 minutes.
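The same launch can also be scripted, which is how you would wire the rollback into a larger pipeline. Below is a rough sketch using the `awx` CLI. The template name matches this workshop, but the controller URL, credentials, and the survey variable names (`inventory_group`, `server_name`) are assumptions you would need to confirm against your own survey definition; if "AUTO / 03 Rollback" is defined as a workflow job template, use `awx workflow_job_templates launch` instead.

```bash
# Hypothetical scripted launch of the rollback template with the awx CLI
export CONTROLLER_HOST=https://your-aap-controller.example.com
export CONTROLLER_USERNAME=admin
export CONTROLLER_PASSWORD='changeme'

awx job_templates launch "AUTO / 03 Rollback" \
  --extra_vars '{"inventory_group": "ALL_rhel", "server_name": "cute-bedbug"}' \
  --monitor
```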
+ +### Step 3 - Check the RHEL Version + +Repeat the steps you followed with [Exercise 2.3: Step 2](../2.3-check-upg/README.md#step-2---verify-the-hosts-are-upgraded-to-next-rhel-version), this time to verify that the RHEL version is reverted back. + +- For example, if the pet app host you rolled back had been upgraded from RHEL7 to RHEL8, you should now see it is back to RHEL7: + + ![command output showing the host is back to RHEL7 installed](images/commands_after_rollback.svg) + +## Conclusion + +In this exercise, we used automation to quickly reverse the RHEL in-place upgrade and restore the app server back to its original state. + +In the next exercise, we'll dig deeper to validate that all changes and impacts caused by the upgrade are now undone. + +--- + +**Navigation** + +[Previous Exercise](../3.1-rm-rf/README.md) - [Next Exercise](../3.3-check-undo/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/3.2-rollback/images/commands_after_rollback.svg b/exercises/ansible_ripu/3.2-rollback/images/commands_after_rollback.svg new file mode 100644 index 000000000..c66f28706 --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/images/commands_after_rollback.svg @@ -0,0 +1,242 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/3.2-rollback/images/ripu-workflow-hl-rollback.svg b/exercises/ansible_ripu/3.2-rollback/images/ripu-workflow-hl-rollback.svg new file mode 100644 index 000000000..1c3f1267b --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/images/ripu-workflow-hl-rollback.svg @@ -0,0 +1,786 @@ + + + +ReviewReportLooksOK?YESNOAnalysis PhaseUpgrade PhaseCommit PhaseLooksOK?App TeamValidationsCan Fix?RollbackYESNONOYESDeleteSnapshotDoneStartApplyRecommendedRemediationRunPre-upgradeAnalysisRunIn-placeUpgradeCreateSnapshot diff --git a/exercises/ansible_ripu/3.2-rollback/images/rollback_job_recap.svg b/exercises/ansible_ripu/3.2-rollback/images/rollback_job_recap.svg new file mode 100644 index 000000000..092edfe34 --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/images/rollback_job_recap.svg @@ -0,0 +1,5628 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/3.2-rollback/images/rollback_preview.svg b/exercises/ansible_ripu/3.2-rollback/images/rollback_preview.svg new file mode 100644 index 000000000..eb191ff5a --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/images/rollback_preview.svg @@ -0,0 +1,1994 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/3.2-rollback/images/rollback_survey.svg b/exercises/ansible_ripu/3.2-rollback/images/rollback_survey.svg new file mode 100644 index 000000000..3d421906e --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/images/rollback_survey.svg @@ -0,0 +1,929 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/3.2-rollback/images/rollback_template.svg b/exercises/ansible_ripu/3.2-rollback/images/rollback_template.svg new file mode 100644 index 000000000..b3a047f8b --- /dev/null +++ b/exercises/ansible_ripu/3.2-rollback/images/rollback_template.svg @@ -0,0 +1,4050 @@ + + + + + + + + + diff --git a/exercises/ansible_ripu/3.3-check-undo/README.md b/exercises/ansible_ripu/3.3-check-undo/README.md new file mode 100644 index 000000000..636075ef6 --- /dev/null +++ b/exercises/ansible_ripu/3.3-check-undo/README.md @@ -0,0 +1,77 @@ +# Workshop Exercise - Check if Upgrade Undone + +## Table of Contents + +- [Workshop Exercise - Check if Upgrade Undone](#workshop-exercise---check-if-upgrade-undone) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 
1 - Our State After Rolling Back](#step-1---our-state-after-rolling-back) + - [Step 2 - What's Next?](#step-2---whats-next) + - [Conclusion](#conclusion) + +## Objectives + +* Validate that all changes and impacts caused by the upgrade are undone after rolling back +* Revisit the topic of snapshot scope +* Consider next steps after rolling back + +## Guide + +In the previous exercise, we rolled back one of our pet app servers. Now we will take a closer look at where that has left us and consider where to go from here. + +### Step 1 - Our State After Rolling Back + +In this step, we will repeat the observations we made on our host after the upgrade with the expectation that everything is back as it was before the upgrade. + +- In the previous exercise, we checked that the RHEL version was reverted. You can verify this at a shell prompt using commands like `cat /etc/redhat-release` and `uname -r` to output the OS release information and kernel version. You can also refresh the RHEL Web Console to confirm the RHEL version shown on the system overview page. + +- If you inflicted some damage to the pet app server with the optional [Trash the Instance](../3.1-rm-rf/README.md) exercise, you will now find no evidence of that. + + For example, did you delete everything with the `rm -rf /*` command or remove the standard C library with the `rpm -e --nodeps glibc` command? Your efforts to nuke the OS have been nullified! + + Did you break your pet application by removing the JDK runtime package? If you check now with the `rpm -q temurin-17-jdk` command, you will see the package is installed. + + If you filled up your disk, check now with the `df -h /` command. You will see there is now plenty of free space reported. + +- If you installed the Spring Pet Clinic Sample Application from the optional [Deploy a Pet App](../1.6-my-pet-app/README.md) exercise, refresh the web user interface in your browser to verify it is running correctly after rolling back. + + > **Note** + > + > Because the external IP addresses of the EC2 instances provisioned for the workshop are dynamically assigned (i.e., using DHCP), it is possible that the web user interface URL may change after a reboot. If that happens, run this command at the shell prompt of the app server to determine the new URL for the application web user interface: + > + > ``` + > echo "http://$(curl -s ifconfig.me):8080" + > ``` + + Look for any app data you added or modified after the upgrade and you will find that all those changes are lost. What does this tell us about the snapshot scope implemented by our rollback playbook? + + The business will not be happy about losing their changes, so make sure they are aware of the risk to app data if you choose snapshots that include everything in their scope. + +### Step 2 - What's Next? + +Whatever your reason for rolling back, you now need to decide what to do next. + +- Maybe you rolled back because there was a failure with the Leapp upgrade itself or with some custom automation meant to deal with your standard tools and agents. If so, assess what happened and use the experience to improve your automation to prevent similar issues going forward. + + Developing robust automation is an iterative process. Remember it's good to fail fast! Failed upgrades early in the development of your automation should be expected and will inform you about what needs to be fixed. Having an automated rollback capability helps accelerate the development of robust playbooks.
+ +- What if you rolled back because application impact was discovered after the upgrade? In this case, it's important for the app team to investigate the cause of the impact before the rollback is done, so that what was learned can be applied to avoid the issue next time. + + Understanding the cause of application impacts is also an iterative process. App teams must take the time required to fully assess any application issues when upgrading their dev and test servers, rolling back as often as needed. Use the lower environments this way so everyone is fully prepared before moving to production. + +- After rolling back, the last pre-upgrade report generated before the upgrade will still be available. If no significant changes are made after rolling back, it is not necessary to generate a fresh pre-upgrade report before trying another upgrade. + +## Conclusion + +In this exercise, we reviewed the state of our pet app server after rolling back and we considered next steps for some different scenarios. + +In the next and final exercise of the workshop, we'll review what we learned and explore some additional things you may want to play with in the workshop lab environment. + +--- + +**Navigation** + +[Previous Exercise](../3.2-rollback/README.md) - [Next Exercise](../3.4-conclusion/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/3.4-conclusion/README.md b/exercises/ansible_ripu/3.4-conclusion/README.md new file mode 100644 index 000000000..eedd066f8 --- /dev/null +++ b/exercises/ansible_ripu/3.4-conclusion/README.md @@ -0,0 +1,90 @@ +# Workshop Exercise - Rinse and Repeat + +## Table of Contents + +- [Workshop Exercise - Rinse and Repeat](#workshop-exercise---rinse-and-repeat) + - [Table of Contents](#table-of-contents) + - [Objectives](#objectives) + - [Guide](#guide) + - [Step 1 - What You Learned](#step-1---what-you-learned) + - [Step 2 - Activities for Extra Credit](#step-2---activities-for-extra-credit) + - [Step 3 - Look at the Code](#step-3---look-at-the-code) + - [redhat-cop/infra.leapp](#redhat-copinfraleapp) + - [oamg/leapp-supplements](#oamgleapp-supplements) + - [redhat-partner-tech/leapp-project](#redhat-partner-techleapp-project) + - [swapdisk/snapshot](#swapdisksnapshot) + - [Thank You!](#thank-you) + +## Objectives + +* Review what we have learned in this workshop +* Consider ideas for further exploration +* Look at the code and get involved upstream + +## Guide + +Congratulations! You have reached the end of the RHEL In-place Upgrade Automation Workshop. You are now armed with the knowledge needed to start developing an automation solution to help your organization manage RHEL upgrades at scale. + +Let's review what we learned and think about what's next. + +### Step 1 - What You Learned + +With this workshop, you gained hands-on experience while learning about a prescriptive approach to automating RHEL in-place upgrades. + +- You upgraded only a handful of RHEL cloud instances while progressing through the workshop exercises, but with the power of an enterprise deployment of AAP, this approach can be rolled out at scale across a large fleet of RHEL hosts. + +- You learned why automated snapshot/rollback is one of the most important capabilities required to successfully deliver RHEL in-place upgrade automation. Snapshots not only eliminate the risk and anxiety experienced by an app team facing a RHEL upgrade, but they also help accelerate the development of robust upgrade automation playbooks.
+ +- You also learned about the custom automation that must be developed to deal with the complex requirements of a large enterprise environment. We demonstrated a few examples of this, including using Leapp custom actors for reporting special pre-upgrade checks as well as running Ansible playbooks to handle common remediations and third-party tools and agents. + +- But the most important lesson we learned is "You can do this!" + +### Step 2 - Activities for Extra Credit + +Hopefully, this workshop has opened your eyes to what is possible, but we have just scratched the surface. + +- Is it possible to upgrade from RHEL7 to RHEL9? While the Leapp framework doesn't support a "double upgrade" directly, it is possible to take a host that was upgraded from RHEL7 to RHEL8 and then upgrade it from there to RHEL9. You can try this with one of the pet app instances in the workshop lab. + + There are a couple of things to be aware of if you want to try it. You will first need to run the "AUTO / 04 Commit" playbook job template. This job will delete the snapshot created for your RHEL7 to RHEL8 upgrade, so be sure you are happy with everything before you do this. While rolling back to RHEL7 will then be impossible, you will be able to roll back to RHEL8 if needed after upgrading to RHEL9. + + Another consideration with going from RHEL7 to RHEL9 is the increased risk of application impacts. While RHEL system library forward binary compatibility is very solid from one RHEL major version to the next, "N+2" compatibility is not guaranteed. Of course, the only way to know for sure is to try it! + +- If you skipped over any of the optional exercises, it's not too late to go back and try them now: + - [Exercise 1.5 - Custom Pre-upgrade Checks](../1.5-custom-modules/README.md) + - [Exercise 1.6 - Deploy a Pet App](../1.6-my-pet-app/README.md) + - [Exercise 2.4 - How is the Pet App Doing?](../2.4-check-pet-app/README.md) + - [Exercise 3.1 - Trash the Instance](../3.1-rm-rf/README.md) + +- The workshop lab environment is now yours to play with. Dream up your own ideas for additional learning and experimentation. Remember you can upgrade and roll back as often as you like. Rinse and repeat! + +### Step 3 - Look at the Code + +All of the Ansible roles and playbooks used in this workshop are maintained in upstream repositories that can be found on GitHub. Take some time to review the code and get engaged with the communities supporting these resources. + +#### [redhat-cop/infra.leapp](https://github.com/redhat-cop/infra.leapp) + +- The `infra.leapp` collection provides the Ansible role that generates the pre-upgrade reports and another that is used to perform the RHEL upgrades. This collection uses the Leapp framework for upgrades from RHEL7 and later, but also supports upgrading from RHEL6 using the older Red Hat Upgrade Tool. The collection is published on Ansible Galaxy [here](https://galaxy.ansible.com/infra/leapp) and also available from Ansible Automation Hub validated content [here](https://console.redhat.com/ansible/automation-hub/repo/validated/infra/leapp/). If you are planning to do RHEL in-place upgrades for your organization, these roles will help you quickly roll out proof-of-concept automation and start upgrading. + +#### [oamg/leapp-supplements](https://github.com/oamg/leapp-supplements) + +- Leapp Supplements is a repository of example Leapp custom actors. The CheckRebootHygiene actor that was demonstrated in the optional [Custom Pre-upgrade Checks](../1.5-custom-modules/README.md) exercise is maintained here.
There is also a Makefile and RPM spec file that can be used to build packages for installing your Leapp custom actors. + +#### [redhat-partner-tech/leapp-project](https://github.com/redhat-partner-tech/leapp-project) + +- This is where you will find all of the AAP job templates and Ansible playbooks included in the workshop. You can also explore the infrastructure as code (IaC) magic that is used to provision the workshop lab environment. + +#### [swapdisk/snapshot](https://github.com/swapdisk/snapshot) + +- Here you will find work in progress on a new Ansible role for managing snapshot sets using LVM. If you are interested in automating LVM snapshots as explained in the [Let's Talk About Snapshots](../2.2-snapshots/README.md#lvm) exercise, connect with the authors of this project to get in on the action. + +## Thank You! + +If you enjoyed this workshop, please take a moment to give it a 5-star rating or write a review. If you have any ideas for improvements or new features, don't hesitate to raise an issue [here](https://github.com/ansible/workshops/issues/new/choose) tagging @swapdisk and @heatmiser. All ideas and feedback are welcome! + +--- + +**Navigation** + +[Previous Exercise](../3.3-check-undo/README.md) + +[Home](../README.md) diff --git a/exercises/ansible_ripu/README.md b/exercises/ansible_ripu/README.md new file mode 100644 index 000000000..6e1ad6b43 --- /dev/null +++ b/exercises/ansible_ripu/README.md @@ -0,0 +1,77 @@ +# RHEL In-place Upgrade Automation Workshop + +This workshop will introduce a comprehensive approach to automate in-place upgrades for Red Hat Enterprise Linux (RHEL). The solution uses Ansible Automation Platform (AAP) to execute upgrades at enterprise scale across a large estate of RHEL hosts. The workshop demonstrates how to use an example of this approach to perform upgrades from RHEL7 to RHEL8 and from RHEL8 to RHEL9. You will also learn about how this solution can be customized to meet the special requirements of your enterprise environment. + +There are four key features that the solution approach recommends to deliver success at scale: + +![Automate Everything, Snapshot/rollback, Custom Modules, Reporting Dashboard](images/ripu_key_features.svg) + +As you progress through this workshop, you will learn more about the importance of these features and the different options for how you might implement them in your enterprise. For this workshop, we assume you have at least some experience using Ansible Automation Platform and working with Ansible playbooks and roles. If you're new to Ansible, consider first completing the workshop [Ansible for Red Hat Enterprise Linux](https://aap2.demoredhat.com/exercises/ansible_rhel). + +## Table of Contents + +- [RHEL In-place Upgrade Automation Workshop](#rhel-in-place-upgrade-automation-workshop) + - [Table of Contents](#table-of-contents) + - [Presentations](#presentations) + - [Time Planning](#time-planning) + - [Lab Diagram](#lab-diagram) + - [Workshop Exercises](#workshop-exercises) + - [Section 1 - Pre-upgrade Analysis](#section-1---pre-upgrade-analysis) + - [Section 2 - RHEL OS Upgrade](#section-2---rhel-os-upgrade) + - [Section 3 - Rolling Back](#section-3---rolling-back) + - [Supplemental Exercises](#supplemental-exercises) + - [Workshop Navigation](#workshop-navigation) + +## Presentations + +The exercises are self explanatory and guide the participants through all the phases of an automated RHEL in-place upgrade. All concepts are explained as they are introduced. 
+ +There is an optional presentation deck available with additional information on the benefits of the approach demonstrated in this workshop: +[RHEL In-place Upgrade Automation](../../decks/ansible_ripu.pdf) + +## Time Planning + +The time required to complete the workshop depends on the number of participants and how familiar they are with Linux and Ansible. The exercises themselves should take a minimum of 4 hours. The introduction in the optional presentation adds another hour. Some optional exercises can be skipped, but they are recommended if time allows. There are also supplemental exercises at the end of the workshop to allow for open-ended experimentation and exploring customizations that may apply to your specific environment and requirements. The lab environment provisioned could even be used for a multi-day deep dive workshop, but that is beyond the scope of this guide. + +## Lab Diagram + +The lab environment provisioned for the workshop includes a number of RHEL cloud instances. One instance is dedicated to hosting AAP and is used to run playbook and workflow jobs. The jobs are executed against the remaining hosts, which will be upgraded in-place to the next RHEL major version. The automation uses Amazon EBS to manage the snapshot/rollback capability. + +![RHEL In-place Upgrade Automation Workshop lab diagram](images/ripu_lab_diagram.svg) + +## Workshop Exercises + +The workshop is composed of three sections, each of which includes a number of exercises. Each exercise builds upon the steps performed and concepts learned in the previous exercises, so it is important to do them in the prescribed order. + +### Section 1 - Pre-upgrade Analysis + +* [Exercise 1.1 - Workshop Lab Environment](1.1-setup/README.md) +* [Exercise 1.2 - Run Pre-upgrade Jobs](1.2-preupg/README.md) +* [Exercise 1.3 - Review Pre-upgrade Reports](1.3-report/README.md) +* [Exercise 1.4 - Perform Recommended Remediation](1.4-remediate/README.md) +* [Exercise 1.5 - (Optional) Custom Pre-upgrade Checks](1.5-custom-modules/README.md) +* [Exercise 1.6 - (Optional) Deploy a Pet App](1.6-my-pet-app/README.md) + +### Section 2 - RHEL OS Upgrade + +* [Exercise 2.1 - Run OS Upgrade Jobs](2.1-upgrade/README.md) +* [Exercise 2.2 - Let's Talk About Snapshots](2.2-snapshots/README.md) +* [Exercise 2.3 - Check if the Upgrades Worked](2.3-check-upg/README.md) +* [Exercise 2.4 - (Optional) How is the Pet App Doing?](2.4-check-pet-app/README.md) + +### Section 3 - Rolling Back + +* [Exercise 3.1 - (Optional) Trash the Instance](3.1-rm-rf/README.md) +* [Exercise 3.2 - Run Rollback Job](3.2-rollback/README.md) +* [Exercise 3.3 - Check if Upgrade Undone](3.3-check-undo/README.md) +* [Exercise 3.4 - Rinse and Repeat](3.4-conclusion/README.md) + +## Workshop Navigation + +You will find links to the previous and next exercises at the bottom of each exercise page. Click the link below to get started.
+ +--- + +**Navigation** + +[Next Exercise](1.1-setup/README.md) diff --git a/exercises/ansible_ripu/images/ripu_key_features.svg b/exercises/ansible_ripu/images/ripu_key_features.svg new file mode 100644 index 000000000..9c9dd58fd --- /dev/null +++ b/exercises/ansible_ripu/images/ripu_key_features.svg @@ -0,0 +1,312 @@ + +image/svg+xml diff --git a/exercises/ansible_ripu/images/ripu_lab_diagram.svg b/exercises/ansible_ripu/images/ripu_lab_diagram.svg new file mode 100644 index 000000000..4451f8372 --- /dev/null +++ b/exercises/ansible_ripu/images/ripu_lab_diagram.svg @@ -0,0 +1,477 @@ + +image/svg+xmlControl nodeHosts to be upgradedEBS snapshotsRHEL7Pet application serversRHEL8Pet application serversAnsible Automation Platform diff --git a/roles/manage_ec2_instances/templates/skylight_windows_userdata.j2 b/roles/manage_ec2_instances/templates/skylight_windows_userdata.j2 index 41f4dce2b..19552c7cc 100755 --- a/roles/manage_ec2_instances/templates/skylight_windows_userdata.j2 +++ b/roles/manage_ec2_instances/templates/skylight_windows_userdata.j2 @@ -17,7 +17,7 @@ reg delete "HKLM\SOFTWARE\Policies\Microsoft\Windows\WinRM\Service" /v DisableRu Set-MpPreference -DisableRealtimeMonitoring $true # Enable WinRM -Invoke-WebRequest -Uri https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1 -OutFile C:\ConfigureRemotingForAnsible.ps1 +Invoke-WebRequest -Uri https://raw.githubusercontent.com/ansible/ansible/stable-2.15/examples/scripts/ConfigureRemotingForAnsible.ps1 -OutFile C:\ConfigureRemotingForAnsible.ps1 C:\ConfigureRemotingForAnsible.ps1 -ForceNewSSLCert -EnableCredSSP Rename-Computer -NewName {{ vm_name }} -Force -Restart diff --git a/roles/manage_ec2_instances/templates/windows_userdata.txt.j2 b/roles/manage_ec2_instances/templates/windows_userdata.txt.j2 index a09b29794..8c0218d70 100644 --- a/roles/manage_ec2_instances/templates/windows_userdata.txt.j2 +++ b/roles/manage_ec2_instances/templates/windows_userdata.txt.j2 @@ -20,7 +20,7 @@ Set-MpPreference -DisableRealtimeMonitoring $true [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12 # Enable WinRM -Invoke-WebRequest -Uri https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1 -OutFile C:\ConfigureRemotingForAnsible.ps1 +Invoke-WebRequest -Uri https://raw.githubusercontent.com/ansible/ansible/stable-2.15/examples/scripts/ConfigureRemotingForAnsible.ps1 -OutFile C:\ConfigureRemotingForAnsible.ps1 C:\ConfigureRemotingForAnsible.ps1 -ForceNewSSLCert -EnableCredSSP # Set Administrator Password