procedures: rewrite and expand the "Backup and disaster recovery" sections of the admin guide #2121

Merged: 242 commits, Jan 25, 2022

@max-cx max-cx commented Sep 22, 2021

What does this pull request change?

https://www.eclipse.org/che/docs/che-7/administration-guide/backup-and-disaster-recovery/

What issues does this pull request fix or reference?

3268

Specify the version of the product this pull request applies to

7.36

Pull Request checklist

The author and the reviewers validate the content of this pull request with the following checklist, in addition to the automated tests.

  • Any procedure:
    • Successfully tested.
  • Any page or link rename:
  • Builds on Eclipse Che hosted by Red Hat.
  • the Validate language on files added or modified step reports no vale warnings.

max-cx and others added 30 commits July 16, 2021 23:57
…-a-file.adoc

Co-authored-by: Rolfe Dlugy-Hegwer <rolfedh@users.noreply.github.com>
…-an-variable.adoc

Co-authored-by: Rolfe Dlugy-Hegwer <rolfedh@users.noreply.github.com>

{prod-short} can use the following backup servers that are compatible with the integrated Restic:

SFTP:: See the documentation for the SFTP server solution you plan to use (link:https://www.openssh.com/[OpenSSH] or a derived commercial product) and the link:https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html#sftp[Restic Docs on SFTP].
Contributor

nitpick: I'd like to put it at the end of the list, please, or even better, reverse the order to have rest, s3, sftp.

Contributor Author

@max-cx max-cx Dec 23, 2021

@mmorhun
I understand that I'm not a typical admin. That said, on another page, we recommend a specific REST server with which I ran into the following issues when I was trying to set it up:

  1. Neither restic nor rest-server shows any error message if something in the setup is missing or not as expected. So getting restic and rest-server to work together on a remote server turned into frustrating guesswork for me.
  2. The README on https://github.com/restic/rest-server is brief and unclear, and I couldn't find any third-party explanation online of how to use it correctly (other than by trial and error). I believe they need to improve their docs.
  3. I was setting up the external server from scratch, so I ended up having to research and install some dependencies for the rest server that I didn't have to worry about for SFTP.

So I killed a lot of time trying to make the rest server work on a remote server, and then in contrast I quickly and easily set up an SFTP server. BTW, SFTP also involves an open source solution.

The page on REST servers is still full-size in this PR. I only moved it to third place, for the benefit of users who have no preference and just pick the first option (like I did). Users who prefer the rest server still have everything we expect to give them in the docs.

Contributor

@max-cx thank you for explaining the reasons. In my case, the REST server was the easiest solution: it just worked, whereas for SFTP I had to deal with SSH keys and proper settings for the SSH server (not a problem, but it took more time than for the REST server).
@tolusha what do you think about this?


include::snip_warning-backup-snapshots-are-bound-to-specific-cluster.adoc[]

IMPORTANT: The `CheClusterRestore` custom object is unusable for recovering a {prod-short} instance of an earlier version of {prod-short}. Use `{prod-cli}` to recover a {prod-short} instance of an earlier version of {prod-short}; see xref:restoring-a-che-instance-from-a-backup-by-using-prod-cli_{context}[].
Contributor

nitpick: this is more my fault, but it is still possible if the admin manually deletes the current version, deploys the required version of Che, and then applies the procedure.

Contributor Author

@mmorhun I can fix it. Could you clarify what you want me to change or add in this IMPORTANT admonition?

Contributor

I don't think we should highlight that, but just let users know. The preferred way is to use chectl, but if a user doesn't want to use it for some reason, we can describe the long way.

Contributor Author

@mmorhun, no worries, I've changed it to:

IMPORTANT: Do not use the `CheClusterRestore` custom object 
to recover a {prod-short} instance of an earlier version of {prod-short}! 
Use only `{prod-cli}` to recover a {prod-short} instance of 
an earlier version of {prod-short}; 
see xref:restoring-a-che-instance-from-a-backup-by-using-prod-cli_{context}[]!

useInternalBackupServer: false <2>
----
<1> Name of the `CheBackupServerConfiguration` custom object defining what backup server to use.
<2> Configures the custom resource to back up to a backup server.
Contributor

I'd rephrase this, because it backs up to a backup server anyway. The choice is between the internal backup server managed by Che and a server (internal or external) managed by the admin.
nitpick: it configures the operator via this CR, not the CR itself

Contributor Author

@mmorhun, I've changed it to:

<2> Configures the Operator via this custom resource 
to use the {prod-short}-managed internal backup server 
or an administrator-managed external backup server 
(SFTP, Amazon S3 or S3 API compatible storage, or REST).

Contributor

@mmorhun mmorhun left a comment

LGTM, but please address my remarks

@max-cx max-cx changed the title from: procedures: substantially edit "Backup and disaster recovery" sections of the admin guide; to: procedures: completely rewrite and expand in length the "Backup and disaster recovery" sections of the admin guide (Jan 5, 2022)
@max-cx max-cx changed the title from: procedures: completely rewrite and expand in length the "Backup and disaster recovery" sections of the admin guide; to: procedures: rewrite and expand the "Backup and disaster recovery" sections of the admin guide (Jan 5, 2022)
Contributor Author

max-cx commented Jan 5, 2022

@themr0c

Mykola completed his engineering review.
Brian, Rolfe, Preeti, and Oss completed their language reviews.
I updated the PR in response to all of the suggestions and nitpicks.

This PR can be (finally!) merged after the following Vale updates that I requested are completed:
redhat-documentation/vale-at-red-hat#145

@themr0c themr0c merged commit 0fc2d92 into eclipse-che:master Jan 25, 2022
themr0c added a commit that referenced this pull request Jan 25, 2022
…tions of the admin guide (#2121)

Co-authored-by: Rolfe Dlugy-Hegwer <rolfedh@users.noreply.github.com>
Co-authored-by: Robert Krátký <rkratky@redhat.com>
Co-authored-by: Michal Maléř <mmaler@redhat.com>
Co-authored-by: Fabrice Flore-Thébault <ffloreth@redhat.com>