X_Elasticsearch 5.6 backup instructions

Jun Li edited this page Jun 9, 2021 · 1 revision

These are the instructions to create a snapshot/backup and restore data for the legal API.

See the official cloud.gov Elasticsearch documentation to check for changes and for more context.

Management Commands

Configure backup repository

This only needs to be run once, and only if the repository has never been configured.

  • Configure repository: cf run-task api --command "python manage.py configure_backup_repository" -m 4G --name es-backup-1

Create a backup

  • Create a backup: cf run-task api --command "python manage.py create_elasticsearch_backup" -m 4G --name es-backup-1
  • Create a backup with a special name: cf run-task api --command "python manage.py create_elasticsearch_backup -s my_test_backup" -m 4G --name es-backup-2
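A date-prefixed name makes backups easier to sort and identify later. The sketch below builds such a name; the `backup_name` helper and the `YYYYMMDD_` prefix convention are assumptions for illustration, not part of the management command itself.

```shell
# Hypothetical helper: build a date-prefixed backup name such as 20210609_my_test_backup.
# The date prefix is an assumed convention, not something the management command requires.
backup_name() {
  printf '%s_%s\n' "$(date +%Y%m%d)" "$1"
}
```

The result can then be passed to the -s option, for example: cf run-task api --command "python manage.py create_elasticsearch_backup -s $(backup_name my_test_backup)" -m 4G --name es-backup-2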

Restore from backup

  • Restore the latest backup: cf run-task api --command "python manage.py restore_elasticsearch_backup" -m 4G --name es-backup-1
  • Restore a specific backup: cf run-task api --command "python manage.py restore_elasticsearch_backup -s 20180725_archived_mur_reload_in_progress" -m 4G --name es-backup-2

Wait a couple of minutes for the restore task to finish.

Manual Instructions

Set up instructions

Set up a service key, or set the variables locally. You will need the following variables:

export es_hostname=""
export es_port=""
export es_username=""
export es_password=""

export s3_region=""
export s3_bucket=""
export s3_access_key=""
export s3_secret_key=""

You can get the Elasticsearch (es_) and S3 bucket (s3_) values by running cf env api. Use the api backup bucket in that environment.
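Before opening the tunnel, it can help to confirm that every variable is actually set, since an empty value only shows up later as a confusing curl or authentication error. A minimal sketch, where `check_backup_env` is a hypothetical helper name:

```shell
# Hypothetical helper: report any required variable that is unset or empty.
# Returns 0 when all eight variables have a value, 1 otherwise.
check_backup_env() {
  check_missing=0
  for check_name in es_hostname es_port es_username es_password \
                    s3_region s3_bucket s3_access_key s3_secret_key; do
    # Look up the variable named by $check_name (POSIX-compatible indirection).
    eval "check_value=\${$check_name:-}"
    if [ -z "$check_value" ]; then
      echo "missing: $check_name" >&2
      check_missing=1
    fi
  done
  return "$check_missing"
}
```

Running check_backup_env after the exports prints one "missing:" line per empty variable and returns a non-zero status if any are found.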

Make sure you are not running Elasticsearch locally. The tunnel below binds local port 9200, so a local instance will cause an "address already in use" error.

After confirming you are in the right environment, open a connection to Elasticsearch via ssh by running:

cf ssh api -L "9200:${es_hostname}:${es_port}"

Keep this ssh session open in a tab.

One-time setup

If this is the first time doing this or things were not set up with service keys, you will want to create a repository for elasticsearch. This will allow elasticsearch to connect to the bucket.

In another tab, run the following command, replacing "my" with legal or eregs depending on which Elasticsearch instance you are backing up:

curl -X PUT -u "${es_username}:${es_password}" "localhost:9200/_snapshot/my_s3_repository" -d @<(cat <<EOF
{
  "type": "s3",
  "settings": {
    "bucket": "${s3_bucket}",
    "region": "${s3_region}",
    "access_key": "${s3_access_key}",
    "secret_key": "${s3_secret_key}"
  }
}
EOF
)
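Because the repository name depends on the instance, a small guard can prevent typos from creating a stray repository. The sketch below maps an instance name to its repository name; `repository_name` is a hypothetical helper, and the commented example assumes Elasticsearch's repository `_verify` endpoint is reachable through the tunnel.

```shell
# Hypothetical helper: map an instance name to its snapshot repository name.
# Only "legal" and "eregs" are accepted, matching the two instances named above.
repository_name() {
  case "$1" in
    legal|eregs) printf '%s_s3_repository\n' "$1" ;;
    *) echo "unknown instance: $1 (expected legal or eregs)" >&2; return 1 ;;
  esac
}

# Example (needs the ssh tunnel and the es_* variables from the setup steps):
#   repo=$(repository_name legal)
#   curl -X POST -u "${es_username}:${es_password}" "localhost:9200/_snapshot/${repo}/_verify"
```

The verify request asks Elasticsearch to confirm that it can reach the bucket, which is a quick check that the repository was registered with working credentials.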

You can see all snapshots in legal by running:

curl -X GET -u "${es_username}:${es_password}" "localhost:9200/_snapshot/legal_s3_repository/_all" | python -m json.tool | less
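The full listing is verbose. When only the snapshot names are needed, the response can be filtered with standard tools. This is a rough sketch that assumes each entry in the response carries a "snapshot" field holding its name; `snapshot_names` is a hypothetical helper, and a real JSON parser would be more robust.

```shell
# Hypothetical helper: read a snapshot listing (JSON) on stdin, print one snapshot name per line.
# Matches "snapshot": "<name>" pairs; the top-level "snapshots" key is not matched.
snapshot_names() {
  grep -o '"snapshot" *: *"[^"]*"' | sed 's/.*: *"\(.*\)"/\1/'
}

# Example (needs the ssh tunnel and the es_* variables from the setup steps):
#   curl -s -X GET -u "${es_username}:${es_password}" \
#     "localhost:9200/_snapshot/legal_s3_repository/_all" | snapshot_names
```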

Create a backup

After you have set up your environment and the ssh session is running, open a new tab and create the snapshot to use as a backup:

curl -X PUT -u "${es_username}:${es_password}" "localhost:9200/_snapshot/my_s3_repository/my_s3_snapshot"
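The PUT request returns before the snapshot necessarily finishes, so it is worth confirming the snapshot's state before relying on it. A minimal sketch, assuming the response to a GET on the snapshot includes a "state" field (for example SUCCESS, IN_PROGRESS, or FAILED); `snapshot_state` is a hypothetical helper.

```shell
# Hypothetical helper: read a single-snapshot response (JSON) on stdin, print its state.
# Only the first "state" field is reported.
snapshot_state() {
  grep -o '"state" *: *"[^"]*"' | head -n 1 | sed 's/.*: *"\(.*\)"/\1/'
}

# Example (needs the ssh tunnel and the es_* variables from the setup steps):
#   curl -s -X GET -u "${es_username}:${es_password}" \
#     "localhost:9200/_snapshot/my_s3_repository/my_s3_snapshot" | snapshot_state
```

Treat any state other than SUCCESS as a reason to inspect the full response before restoring from that snapshot.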

Restore from a backup

Delete existing index

The existing index can be deleted using:

python manage.py delete_docs_index

Restore from backup

You can restore from backup after you set up your environment, have a backup, and have the ssh session going. In a new tab, run:

curl -X POST -u "${es_username}:${es_password}" "localhost:9200/_snapshot/my_s3_repository/my_s3_snapshot/_restore" -d '{"indices": "docs"}'

You can check the status of the snapshot for errors with:

curl -X GET -u "${es_username}:${es_password}"  "localhost:9200/_snapshot/_status"

Next, check the result from inside the ssh session: look up the Elasticsearch uri with env, then run:

curl <uri>/docs/_count

You will get an "all shards failed" error while the index is loading; the request should succeed once loading finishes.
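Rather than re-running the count by hand until the error clears, the check can be wrapped in a small retry loop. A minimal sketch, where `retry_until_ok` is a hypothetical helper and the attempt count and delay are arbitrary choices:

```shell
# Hypothetical helper: run a command up to N times, pausing between failed attempts.
# Usage: retry_until_ok <attempts> <delay_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry_until_ok() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Pause only when another attempt will follow.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# Example: retry the count every 10 seconds, up to 30 times.
# curl -f makes HTTP error responses count as failures.
#   retry_until_ok 30 10 curl -sf "<uri>/docs/_count"
```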

You will also need to reload current MURs and AOs in the likely case that any have been published or updated since the backup:

python manage.py refresh_current_legal_docs_zero_downtime

Once that looks good, check the data on the website.

More documentation on Elasticsearch snapshots: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html