Velero Restore Hooks Product Requirements Document #2679
Closed
Velero Restore Hooks - PRD (Product Requirements Document)
Relates to: #2116
Change tracking
This is a live document. You can reach me on the following channels for more information or with any questions:
Relates to Git Issues:
Background
Velero supports restore operations, but there are gaps in the process. Some gaps require users to manually carry out steps to start, clean up, and end the restore process. Other gaps can degrade the performance of applications running in a pod while a restore operation is carried out.
On a restore, Velero currently does not include hooks to execute a pre- or post-restore script. As a result, users are required to perform additional actions following a Velero restore operation. Some gaps that currently exist in the Velero restore process are:
Strategic Fit
Adding a restore hook action would allow Velero to unpack backed-up data automatically by executing commands in containers during a restore. This would also improve restore operations on a container and mitigate any negative performance impact on apps running in the container during a restore.
Purpose / Goal
The purpose of this feature is to improve the extensibility and user experience of pre- and post-restore operations for Velero users.
Goals for this feature include:
Non-goals
Feature Description
This feature will automate the restore operations/processes in Velero and will provide restore hook actions that allow users to execute restore commands on a container. Restore hooks include pre-hook actions and post-hook actions.
This feature will mirror the actions of a Velero backup by allowing Velero to check for any restore hooks specified for a pod.
Assumptions
Use Cases
The following use cases must be included as part of the Velero restore hooks MVP (minimum viable product).
USE CASE 1
**Title:** Create restore pre-hook
**Description:** As a user, I would like to run a Velero pre-hook to perform restore operations on a container at the start of a restore operation.
**Functional Requirements:** The restore pre-hook should allow the user to run a command in the container where the pre-hook is to be executed. Similar to the backup hooks, this hook should default to running in the first container in the pod.
**Note:** If the user does not want the hook to default to the first container in the pod, the user should be able to specify the container in which to run the restore hook.
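The container-selection rule in this use case can be sketched as follows. This is a minimal illustration only, not Velero's implementation: the `Pod` and `RestoreHook` types, the error values, and `ResolveHookContainer` are hypothetical names introduced here, and the technical design may choose a different shape.

```go
package main

import "errors"

// Pod is a hypothetical, minimal stand-in for a Kubernetes pod.
// Only the ordered list of container names matters for this sketch.
type Pod struct {
	Name       string
	Containers []string
}

// RestoreHook is a hypothetical pre-restore hook definition.
// Container is optional; when it is empty the hook targets the first container.
type RestoreHook struct {
	Container string
	Command   []string
}

var (
	ErrNoContainers      = errors.New("pod has no containers")
	ErrContainerNotFound = errors.New("container not found in pod")
)

// ResolveHookContainer returns the name of the container in which the hook
// should run: the user-specified container if one is set, otherwise the
// first container in the pod.
func ResolveHookContainer(pod Pod, hook RestoreHook) (string, error) {
	if len(pod.Containers) == 0 {
		return "", ErrNoContainers
	}
	if hook.Container == "" {
		return pod.Containers[0], nil
	}
	for _, name := range pod.Containers {
		if name == hook.Container {
			return name, nil
		}
	}
	return "", ErrContainerNotFound
}
```

Returning an error for an unknown container name, rather than silently falling back to the first container, is an assumption of this sketch; the PRD does not state which behavior is wanted.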
USE CASE 2
**Title:** Automate setting backup storage location to read-only on restore start.
**Description:** As a user, at the start of a restore operation for a specified backup name, I would like to automatically set the backup storage location to ‘read-only’ mode before the ‘velero restore create --from-backup’ command executes.
USE CASE 3
**Title:** Default to the most recent backup snapshot on restore, as an optional setting.
**Description:** As a user, once a restore operation has started, I would like Velero to create the restore using the most recent Velero backup snapshot by default.
USE CASE 4
**Title:** Annotate specific backup snapshot use on restore with snapshots.
**Description:** As a user, I would like the option to choose a specific backup snapshot for Velero to use when creating a restore, instead of the default most recent backup snapshot.
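Use cases 3 and 4 together describe a simple selection rule: use the most recent backup unless the user names one. A minimal sketch of that rule is shown below. The `Backup` type, its `CompletedUnix` field, and `SelectBackup` are hypothetical names for illustration and do not reflect Velero's actual API.

```go
package main

import "errors"

// Backup is a hypothetical, minimal view of a completed backup.
// CompletedUnix is the completion time in seconds since the Unix epoch.
type Backup struct {
	Name          string
	CompletedUnix int64
}

var (
	ErrNoBackups      = errors.New("no backups available")
	ErrBackupNotFound = errors.New("requested backup not found")
)

// SelectBackup picks the backup a restore should use.
// An empty requested name means "use the most recent backup" (use case 3);
// a non-empty name selects that backup explicitly (use case 4).
func SelectBackup(backups []Backup, requested string) (Backup, error) {
	if len(backups) == 0 {
		return Backup{}, ErrNoBackups
	}
	if requested != "" {
		for _, b := range backups {
			if b.Name == requested {
				return b, nil
			}
		}
		return Backup{}, ErrBackupNotFound
	}
	// No explicit request: scan for the latest completion time.
	latest := backups[0]
	for _, b := range backups[1:] {
		if b.CompletedUnix > latest.CompletedUnix {
			latest = b
		}
	}
	return latest, nil
}
```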
USE CASE 5
**Title:** Restore all resources by default.
**Description:** As a user, I would like to include all resources in namespaces contained in a backup by default in my restore spec.
If Velero is asked to restore something that already exists in a pod, the restore will not report success but will still work (Ashish needs to verify).
USE CASE 6
**Title:** Exclude resources from restore.
**Description:** As a user, in my restore spec, I would like to annotate specific namespaces to exclude from a restore.
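Use cases 5 and 6 combine into one filter: every namespace in the backup is restored unless the user excludes it. The sketch below illustrates that behavior under the assumption that namespaces are identified by name; `NamespacesToRestore` is a hypothetical helper, not an existing Velero function.

```go
package main

// NamespacesToRestore returns the namespaces from the backup that should be
// restored: all of them by default, minus any the user excluded.
// The order of inBackup is preserved.
func NamespacesToRestore(inBackup, excluded []string) []string {
	// Build a set of excluded names for constant-time lookups.
	skip := make(map[string]struct{}, len(excluded))
	for _, ns := range excluded {
		skip[ns] = struct{}{}
	}
	selected := make([]string, 0, len(inBackup))
	for _, ns := range inBackup {
		if _, isExcluded := skip[ns]; !isExcluded {
			selected = append(selected, ns)
		}
	}
	return selected
}
```

With no exclusions the function returns every namespace in the backup, which matches the "restore all resources by default" requirement.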
USE CASE 7
**Title:** Restore of a stateful application (Unquiescing data from a quiesced backup)
**Description:** As a user, I would like to unquiesce data during a restore to avoid having to shut down the database and disrupt the application's end-user experience.
USE CASE 8
**Title:** Display/surface restore status
**Description:** As a user, I would like to see the status of my restore surfaced from the pod volume restore status.
USE CASE 9
**Title:** Retry restore upon restore failure/error/timeout
**Description:** As a user, if I see that a restore has failed, I would like Velero to retry the restore operation using the restore specification.
Retry should happen to support the following failure/error scenarios for a restore:
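Independent of which scenarios end up on that list, the retry behavior itself could look like the bounded loop sketched here. This is an illustration under stated assumptions: `RetryRestore`, `ErrInvalidAttempts`, and the attempt limit are hypothetical, and the sketch deliberately omits backoff delays and any distinction between retryable and non-retryable errors, which the technical design would need to settle.

```go
package main

import "errors"

// ErrInvalidAttempts is returned when the caller asks for fewer than one attempt.
var ErrInvalidAttempts = errors.New("maxAttempts must be at least 1")

// RetryRestore calls restore up to maxAttempts times and stops at the first
// success. It returns the number of attempts made and the last error seen
// (nil on success). The attempt number passed to restore starts at 1.
func RetryRestore(maxAttempts int, restore func(attempt int) error) (int, error) {
	if maxAttempts < 1 {
		return 0, ErrInvalidAttempts
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = restore(attempt); lastErr == nil {
			return attempt, nil
		}
	}
	return maxAttempts, lastErr
}
```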
USE CASE 10
**Title:** Return backup storage location to read-write mode
**Description:** As a user, once the restore is complete, I would like Velero to automatically revert the backup storage location to read-write mode.
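Use cases 2 and 10 form a pair: switch the backup storage location to read-only before the restore runs, then return it to read-write afterward, including when the restore fails. A minimal sketch of that pairing follows. The `BackupStorageLocation` and `AccessMode` types here are simplified, hypothetical stand-ins, and `RestoreWithReadOnlyLocation` is an illustrative name rather than an existing Velero function.

```go
package main

import "errors"

// AccessMode is a simplified stand-in for a storage location's access mode.
type AccessMode string

const (
	ReadWrite AccessMode = "ReadWrite"
	ReadOnly  AccessMode = "ReadOnly"
)

// BackupStorageLocation is a hypothetical, in-memory model of a location.
type BackupStorageLocation struct {
	Name       string
	AccessMode AccessMode
}

// ErrNilLocation is returned when no storage location is supplied.
var ErrNilLocation = errors.New("backup storage location is nil")

// RestoreWithReadOnlyLocation sets the location to read-only, runs the
// restore, and then sets the location to read-write again, as use case 10
// describes. The deferred reset runs whether or not the restore succeeds.
func RestoreWithReadOnlyLocation(loc *BackupStorageLocation, restore func() error) error {
	if loc == nil {
		return ErrNilLocation
	}
	loc.AccessMode = ReadOnly
	defer func() { loc.AccessMode = ReadWrite }()
	return restore()
}
```

The sketch always ends in read-write mode because that is what the use case asks for. Whether a location that was already read-only before the restore should stay read-only is a question this document leaves open.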
USE CASE 11
**Title:** Delete all restore objects by default.
**Description:** As a user, I would like to delete all CRs associated with the restore as part of clean-up operations following the restore.
User Experience
The following is representative of what the user experience could look like for Velero restore pre-hooks and post-hooks.
Note: These examples are representative and are not to be considered for use in pre- and post-restore hook operations until the technical design is complete.
Restore Pre-Hooks
Container Command
Command Execute
Includes commands for:
Create
Set backup storage location to read-only
Set backup storage location to read-write
Error handling
Timeout
Requirements
P0 = must not ship without
P1 = should not ship without
P2 = nice to have
P0. Running restore hook at the start of a restore operation.
P0. Automate setting backup storage location to read-only on restore start.
P0. Default to the most recent backup snapshot on restore, as an optional setting.
P0. Include all resources in namespaces contained in a backup by default in my restore operation.
P0. Restore of a stateful application (Unquiescing data from a quiesced backup.)
P0. Retry restore upon restore failure/error/timeout.
P0. Return backup storage location to read-write mode.
P0. Delete all restore objects by default.
P1. Annotate specific backup snapshot use on restore with snapshots.
P1. Exclude resources from restore.
P1. Display/surface restore status.
Out of scope
Verifying the integrity of a backup, resource, or other artifact will not be included in the scope of this effort.
Questions
For questions, please contact michaelmi@vmware.com, bstephanie@vmware.com