Existing Data Migration Quick Start Guide
This document outlines how to deploy the Migration Assistant and execute an existing data migration using Reindex-from-Snapshot (RFS). Note that this does not include steps for deploying and capturing live traffic, which is necessary for a zero-downtime migration. Please refer to the "Phases of a Migration" section in the wiki navigation bar for a complete end-to-end migration process, including metadata migration, live capture, Reindex-from-Snapshot, and replay.
- Verify your migration path is supported. Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported.
- Source cluster must be deployed with the S3 plugin.
- Target cluster must be deployed.
- A snapshot will be taken and stored in S3 in this guide, and the following assumptions are made about this snapshot:
  - The `_source` flag is enabled on all indices to be migrated.
  - The snapshot includes the global cluster state (`include_global_state` is `true`).
  - Shard sizes up to approximately 80GB are supported. Larger shards will not be able to migrate. If this is a blocker, please consult the migrations team.
- Migration Assistant will be installed in the same region and have access to both the source snapshot and target cluster.
- Log into the target AWS account where you want to deploy the Migration Assistant.
- From the browser where you are logged into your target AWS account, right-click here ↗ to open the CloudFormation (Cfn) template in a new browser tab.
- Follow the CloudFormation stack wizard:
  - Stack Name: `MigrationBootstrap`
  - Stage Name: `dev`
  - Hit Next on each step, acknowledge on the fourth screen, and hit Submit.
- Verify that the bootstrap stack exists and is set to `CREATE_COMPLETE`. This process takes around 10 minutes.
- After deployment, find the EC2 instance ID for the `bootstrap-dev-instance`.
- Create an IAM policy using the snippet below, replacing `<aws-region>`, `<aws-account>`, `<stage>`, and `<ec2-instance-id>`:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:StartSession",
      "Resource": [
        "arn:aws:ec2:<aws-region>:<aws-account>:instance/<ec2-instance-id>",
        "arn:aws:ssm:<aws-region>:<aws-account>:document/SSM-<stage>-BootstrapShell"
      ]
    }
  ]
}
- Name the policy, e.g., `SSM-OSMigrationBootstrapAccess`, and create the policy.
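The policy step above can also be scripted. The following is a minimal sketch rather than part of the original procedure: `render_ssm_policy` is a hypothetical helper that fills in the four placeholders of the policy document. The commented commands assume the stack name `MigrationBootstrap` from the wizard step and that the instance carries a `Name` tag of `bootstrap-dev-instance`; adjust both if your deployment differs.

```shell
# Hypothetical helper: print the SSM access policy with placeholders filled in.
# Arguments: <aws-region> <aws-account> <stage> <ec2-instance-id>
render_ssm_policy() {
  local region="$1" account="$2" stage="$3" instance_id="$4"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:StartSession",
      "Resource": [
        "arn:aws:ec2:${region}:${account}:instance/${instance_id}",
        "arn:aws:ssm:${region}:${account}:document/SSM-${stage}-BootstrapShell"
      ]
    }
  ]
}
EOF
}

# Example usage (commented out because these commands call AWS):
# aws cloudformation describe-stacks --stack-name MigrationBootstrap \
#   --query 'Stacks[0].StackStatus' --output text
# aws ec2 describe-instances \
#   --filters 'Name=tag:Name,Values=bootstrap-dev-instance' \
#   --query 'Reservations[].Instances[].InstanceId' --output text
# render_ssm_policy us-east-1 123456789012 dev i-0123456789abcdef0 > policy.json
# aws iam create-policy --policy-name SSM-OSMigrationBootstrapAccess \
#   --policy-document file://policy.json
```

The region, account ID, and instance ID in the usage lines are illustrative values only.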
- AWS CLI and AWS Session Manager Plugin installed.
- AWS credentials configured (`aws configure`).
- Load AWS credentials into your terminal.
- Log in to the instance using the command below, replacing `<instance-id>` and `<aws-region>`:
aws ssm start-session --document-name SSM-dev-BootstrapShell --target <instance-id> --region <aws-region> [--profile <profile-name>]
- Once logged in, run the following command from the shell of the bootstrap instance (within the /opensearch-migrations directory):
./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration
- After a successful build, remember the path for infrastructure deployment in the next step.
- Add the target cluster password to AWS Secrets Manager as an unstructured string. Be sure to copy the secret ARN for use during deployment.
- From the same shell on the bootstrap instance, modify the `cdk.context.json` file located in the `/opensearch-migrations/deployment/cdk/opensearch-service-migration` directory:
{
  "migration-assistant": {
    "vpcId": "<TARGET CLUSTER VPC ID>",
    "targetCluster": {
      "endpoint": "<TARGET CLUSTER ENDPOINT>",
      "auth": {
        "type": "basic",
        "username": "<TARGET CLUSTER USERNAME>",
        "passwordFromSecretArn": "<TARGET CLUSTER PASSWORD SECRET>"
      }
    },
    "sourceCluster": {
      "endpoint": "<SOURCE CLUSTER ENDPOINT>",
      "auth": {
        "type": "basic",
        "username": "<SOURCE CLUSTER USERNAME>",
        "passwordFromSecretArn": "<SOURCE CLUSTER PASSWORD SECRET>"
      }
    },
    "reindexFromSnapshotExtraArgs": "<RFS PARAMETERS (see below)>",
    "stage": "dev",
    "otelCollectorEnabled": true,
    "migrationConsoleServiceEnabled": true,
    "reindexFromSnapshotServiceEnabled": true,
    "migrationAssistanceEnabled": true
  }
}
The source and target cluster authorization can be configured as `none`, `basic` with a username and password, or `sigv4`. There are examples of each available here.
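Before deploying, it can help to confirm that no template placeholder was left behind in `cdk.context.json`. The sketch below is an illustration, not part of the Migration Assistant tooling: `find_unfilled_placeholders` is a hypothetical helper that assumes placeholders keep the quoted `"<...>"` form used in the template above.

```shell
# Hypothetical helper: print every line of the given file that still holds a
# quoted "<...>" placeholder value. Returns non-zero if any are found.
find_unfilled_placeholders() {
  local file="$1"
  if grep -n '"<[^">]*>"' "$file"; then
    return 1
  fi
  return 0
}

# Example usage:
# find_unfilled_placeholders cdk.context.json || echo "Fill in the lines listed above."
```

A plain `grep` is used here so the check runs without extra tools on the bootstrap instance.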
- Bootstrap the account with the following command:
cdk bootstrap --c contextId=migration-assistant --require-approval never
- Deploy the stacks:
cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5
- Verify that all CloudFormation stacks were installed successfully.
- If you're creating a snapshot using migration tooling, these parameters are auto-configured. If you're using an existing snapshot, modify `reindexFromSnapshotExtraArgs` with the following values:
--s3-repo-uri s3://<bucket-name>/<repo> --s3-region <region> --snapshot-name <name>
Note: you will also need to grant the migrationconsole and reindexFromSnapshot task roles permission to access the bucket.
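To make the mapping from placeholders to the final string concrete, here is a small sketch. `rfs_extra_args` is a hypothetical helper, and the bucket, repository, region, and snapshot names in the usage line are made-up examples.

```shell
# Hypothetical helper: assemble the reindexFromSnapshotExtraArgs value.
# Arguments: <bucket-name> <repo> <region> <snapshot-name>
rfs_extra_args() {
  local bucket="$1" repo="$2" region="$3" snapshot="$4"
  printf -- '--s3-repo-uri s3://%s/%s --s3-region %s --snapshot-name %s\n' \
    "$bucket" "$repo" "$region" "$snapshot"
}

# Example usage:
# rfs_extra_args my-snapshot-bucket my-repo us-east-1 my-snapshot
```

The printed string is what would be pasted as the value of `reindexFromSnapshotExtraArgs` in `cdk.context.json`.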
- Bootstrap the account:
cdk bootstrap --c contextId=migration-assistant --require-approval never --concurrency 5
- Deploy the stacks when `cdk.context.json` is fully configured:
cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 3
- Migration Assistant Network stack
- Reindex From Snapshot stack
- Migration Console stack
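One way to check the stacks from the command line is to filter the stack list for anything that did not finish cleanly. This is a sketch under assumptions: `incomplete_stacks` is a hypothetical helper that expects tab-separated name and status pairs, which is what the AWS CLI prints for the query shown in the usage comment.

```shell
# Hypothetical helper: read "<stack-name><TAB><status>" lines on stdin and
# print any stack whose status is not CREATE_COMPLETE or UPDATE_COMPLETE.
incomplete_stacks() {
  awk -F '\t' '$2 != "CREATE_COMPLETE" && $2 != "UPDATE_COMPLETE" { print $1 ": " $2 }'
}

# Example usage (commented out because it calls AWS):
# aws cloudformation describe-stacks \
#   --query 'Stacks[].[StackName,StackStatus]' --output text | incomplete_stacks
```

Empty output means every stack returned by the query reported a completed status.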
Run the following command to access the migration console:
./accessContainer.sh migration-console dev <region>
Note: `accessContainer.sh` is located in `/opensearch-migrations/deployment/cdk/opensearch-service-migration/` on the bootstrap instance.
Learn more: Accessing the Migration Console
To verify the connection to the clusters, run:
console clusters connection-check
- Source Cluster: Successfully connected!
- Target Cluster: Successfully connected!
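If the check is run from a script, the result can be tested rather than read by eye. The sketch below assumes the command prints the two success messages shown above verbatim; `connections_ok` is a hypothetical helper, and the real output may contain additional text.

```shell
# Hypothetical helper: succeed only if stdin reports a successful connection
# for both the source and the target cluster.
connections_ok() {
  local output
  output="$(cat)"
  printf '%s\n' "$output" | grep -q 'Source Cluster: Successfully connected!' &&
    printf '%s\n' "$output" | grep -q 'Target Cluster: Successfully connected!'
}

# Example usage (from the migration console):
# console clusters connection-check | connections_ok && echo "Both clusters reachable"
```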
Learn more: Console commands reference
Run the following command to start creating a snapshot from the source cluster:
console snapshot create [...]
To check on the progress,
console snapshot status [...]
or, for more detail,
console snapshot status --deep-check [...]
Wait for the snapshot to complete before moving to the next step.
Learn more: Snapshot Creation Verification, Snapshot Creation
Run the following command to migrate metadata:
console metadata migrate [...]
Learn more: Metadata Migration
Start the backfill process:
console backfill start
Scale up the number of workers:
console backfill scale <NUM_WORKERS>
Check the status:
console backfill status
To stop the workers:
console backfill stop
Learn more: Backfill Execution
Use the following command for detailed monitoring:
console backfill status --deep-check
BackfillStatus.RUNNING
Running=9
Pending=1
Desired=10
Shards total: 62
Shards completed: 46
Shards incomplete: 16
Shards in progress: 11
Shards unclaimed: 5
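The shard counts in this output can be turned into a single progress figure. The following sketch assumes the `Shards total:` and `Shards completed:` lines appear as in the sample above; `backfill_percent_complete` is a hypothetical helper, not a console command.

```shell
# Hypothetical helper: read `console backfill status --deep-check` output on
# stdin and print the percentage of shards completed, rounded down.
backfill_percent_complete() {
  awk -F ': *' '
    /Shards total:/     { total = $2 }
    /Shards completed:/ { completed = $2 }
    END {
      if (total > 0) printf "%d%%\n", int(completed * 100 / total)
      else           print "unknown"
    }'
}

# Example usage (from the migration console):
# console backfill status --deep-check | backfill_percent_complete
```

For the sample output above (46 of 62 shards completed), this prints `74%`.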
Logs and metrics are available in CloudWatch in the OpenSearchMigrations log group.
Use the following query in CloudWatch Logs Insights to identify failed documents:
fields @message
| filter @message like "Bulk request succeeded, but some operations failed."
| sort @timestamp desc
| limit 10000
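The same query can be started from the AWS CLI instead of the console. This is a sketch under assumptions: the log group name defaults to `OpenSearchMigrations` as stated above (the actual name in your account may differ), the one-hour time window is an arbitrary choice, and `start_failed_docs_query` is a hypothetical helper.

```shell
# Hypothetical helper: start the failed-documents query over the last hour and
# print the resulting query ID. Optional argument: log group name.
start_failed_docs_query() {
  local log_group="${1:-OpenSearchMigrations}"
  local end_time start_time
  end_time="$(date +%s)"
  start_time="$((end_time - 3600))"
  aws logs start-query \
    --log-group-name "$log_group" \
    --start-time "$start_time" \
    --end-time "$end_time" \
    --query-string 'fields @message | filter @message like "Bulk request succeeded, but some operations failed." | sort @timestamp desc | limit 10000' \
    --query 'queryId' --output text
}

# Example usage (commented out because it calls AWS):
# query_id="$(start_failed_docs_query)"
# aws logs get-query-results --query-id "$query_id"
```

Logs Insights queries run asynchronously, so `get-query-results` may need to be repeated until the returned status is `Complete`.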
Learn more: Backfill Result Validation
Encountering a compatibility issue or missing feature?
- Search existing issues to see if it’s already reported. If it is, feel free to upvote and comment.
- Can’t find it? Create a new issue to let us know.