Aiven database migration tool. This tool is meant for easy migration of databases from some database service provider, such AWS RDS, or on premises data center, to Aiven Database as a Service. However, it's not limited for Aiven services and it might be useful as a generic database migration tool.
Usually database service providers, such as Aiven, AWS RDS, and alike, don't allow superuser/root access.
Instead the service's master/admin user is granted permissions for the most common DBA tasks, see e.g.
https://help.aiven.io/en/articles/489557-postgresql-superuser-access.
In addition, service provider's web console/API can be used for performing some DBA tasks requiring more privileges than
granted for the master/admin user. However, the missing superuser access makes some existing database migrating tools,
such as pg_dumpall
, not useful when migrating database to/from service provider.
Currently this tool supports only PostgreSQL but we aim to add support for other databases, such as MySQL.
Requires Python 3.9 or newer.
Running library module:
$ python3 -m aiven_db_migrate.migrate -h
Available commands: pg
Installing in virtualenv:
$ python3 -m venv venv
$ . venv/bin/activate
$ ## Run make to set the proper version
$ make
$ pip install .
This installs console scripts which have the same interface as the library module:
pg_migrate
: PostgreSQL migration
Requirements:
pg_dump
: from any PostgreSQL version between the source and target versionspsql
: any modern version should work
Run library module:
$ python3 -m aiven_db_migrate.migrate pg -h
or, if installed:
$ pg_migrate -h
Migrating is supported to the same or newer PostgreSQL version starting from PostgreSQL 10 to PostgreSQL 14. Migrating to older version is not supported.
By default it searches pg_dump
under /usr/
, when using PostgreSQL installs on different directory such as on Mac, use --pgbin
parameter to define PostgreSQL home directory. e,g,
--pgbin /Applications/Postgres.app/Contents/Versions/14/bin
Supports regular data dump (pg_dump
) and logical replication (PostgreSQL 10 or newer).
In case that logical replication is not available or privileges/requirements are missing migrating falls back to
data dump.
Migrating from AWS RDS to Aiven for PostgreSQL. Logical replication is enabled in source AWS RDS PostgreSQL
server and aiven-extras
extension is installed in target database.
$ pg_migrate -s "postgres://postgres:<password>@jappja-pg1.chfhzaircbpb.eu-west-1.rds.amazonaws.com:5432/defaultdb" -t "postgres://avnadmin:<password>@pg1-test-jappja-test.avns.net:26192/defaultdb?sslmode=require"
# Or:
$ SOURCE_SERVICE_URI="postgres://postgres:<password>@jappja-pg1.chfhzaircbpb.eu-west-1.rds.amazonaws.com:5432/defaultdb" TARGET_SERVICE_URI="postgres://avnadmin:<password>@pg1-test-jappja-test.avns.net:26192/defaultdb?sslmode=require" pg_migrate
...
Roles:
rolname: 'rdsadmin', rolpassword: None, status: 'failed', message: 'must be superuser to create superusers'
rolname: 'rds_password', rolpassword: None, status: 'created', message: 'role created'
rolname: 'rds_superuser', rolpassword: None, status: 'created', message: 'role created'
rolname: 'test_user1', rolpassword: 'placeholder_kfbqrvmdhgrpgpvy', status: 'created', message: 'role created'
rolname: 'rds_ad', rolpassword: None, status: 'created', message: 'role created'
rolname: 'rds_iam', rolpassword: None, status: 'created', message: 'role created'
rolname: 'rds_replication', rolpassword: None, status: 'created', message: 'role created'
rolname: 'rdsrepladmin', rolpassword: None, status: 'failed', message: 'must be superuser to create replication users'
rolname: 'postgres', rolpassword: None, status: 'exists', message: 'role already exists'
rolname: 'test_user2', rolpassword: None, status: 'created', message: 'role created'
Databases:
dbaname: 'rdsadmin', method: None, status: 'failed', message: 'FATAL: pg_hba.conf rejects connection for host "80.220.195.174", user "postgres", database "rdsadmin", SSL on\nFATAL: pg_hba.conf rejects connection for host "80.220.195.174", user "postgres", database "rdsadmin", SSL off\n'
dbaname: 'defaultdb', method: 'replication', status: 'running', message: 'migrated to existing database'
By default logical replication is left running and the created pub/sub objects need to be cleaned up once workloads have been
moved to the new server. Objects created by this tool are named like managed_db_migrate_<dbname>_<sub|pub|slot>
.
Starting from the target (using aiven-extras
extension), get first the subscription name:
defaultdb= > SELECT * FROM aiven_extras.pg_list_all_subscriptions();
and then drop it:
defaultdb= > SELECT * FROM aiven_extras.pg_drop_subscription('managed_db_migrate_defaultdb_sub');
Note that with aiven-extras
dropping subscription in target also drops replication slot in source (dblink
).
In the source get first the publication name:
defaultdb=> SELECT * FROM pg_publication;
and then drop it:
defaultdb=> DROP PUBLICATION managed_db_migrate_defaultdb_pub;
In case that aiven-extras
is not used clean up replication slot too:
defaultdb=> SELECT * FROM pg_replication_slots;
defaultdb=> SELECT * FROM pg_drop_replication_slot('managed_db_migrate_defaultdb_slot');
Using --max-replication-lag
waits until replication lag in bytes is less than/equal to given max replication lag. This
can be used together with --stop-replication
to clean up all created pub/sub objects when replication is done.
With --validate
only best effort validation is run. This checks e.g. PL/pgSQL languages, extensions etc. installed
in source are also installed/available in target.
Use --no-replicate-extension-tables
to skip extension tables. By default it attempts to replicate all extension tables during logical replication.
With --force-method
you can specify if you wish to use either replication or dump method. Otherwise the most suitable method is chosen automatically.
Using --dbs-max-total-size
together with --validate
you can check if the size of the source database in below some threshold.
Migrating from AWS RDS to Aiven for PostgreSQL. Logical replication is enabled in source AWS RDS PostgreSQL
server but aiven-extras
extension is not installed in target database so migrating falls back to data dump.
>>> from aiven.migrate import PGMigrate, PGMigrateResult
>>> pg_mig = PGMigrate(source_conn_info="postgres://postgres:<password>@jappja-pg1.chfhzaircbpb.eu-west-1.rds.amazonaws.com:5432/defaultdb", target_conn_info="postgres://avnadmin:<password>@pg2-test-jappja-test.avns.net:26192/defaultdb?sslmode=require")
>>> result: PGMigrateResult = pg_mig.migrate()
...
Logical replication failed with error: 'must be superuser to create subscriptions', fallback to dump
>>> result
PGMigrateResult(pg_databases={'rdsadmin': {'dbname': 'rdsadmin', 'message': 'FATAL: pg_hba.conf rejects connection for host "80.220.195.174", user "postgres", database "rdsadmin", SSL on\nFATAL: pg_hba.conf rejects connection for host "80.220.195.174", user "postgres", database "rdsadmin", SSL off\n', 'method': None, 'status': 'failed'}, 'defaultdb': {'dbname': 'defaultdb', 'message': 'migrated to existing database', 'method': 'dump', 'status': 'done'}}, pg_roles={'rdsadmin': {'message': 'must be superuser to create superusers', 'rolname': 'rdsadmin', 'rolpassword': None, 'status': 'failed'}, 'rds_password': {'message': 'role created', 'rolname': 'rds_password', 'rolpassword': None, 'status': 'created'}, 'rds_superuser': {'message': 'role created', 'rolname': 'rds_superuser', 'rolpassword': None, 'status': 'created'}, 'test_user1': {'message': 'role created', 'rolname': 'test_user1', 'rolpassword': 'placeholder_qkdryldfsrdaocio', 'status': 'created'}, 'rds_ad': {'message': 'role created', 'rolname': 'rds_ad', 'rolpassword': None, 'status': 'created'}, 'rds_iam': {'message': 'role created', 'rolname': 'rds_iam', 'rolpassword': None, 'status': 'created'}, 'rds_replication': {'message': 'role created', 'rolname': 'rds_replication', 'rolpassword': None, 'status': 'created'}, 'rdsrepladmin': {'message': 'must be superuser to create replication users', 'rolname': 'rdsrepladmin', 'rolpassword': None, 'status': 'failed'}, 'postgres': {'message': 'role already exists', 'rolname': 'postgres', 'rolpassword': None, 'status': 'exists'}, 'test_user2': {'message': 'role created', 'rolname': 'test_user2', 'rolpassword': None, 'status': 'created'}})
- requires PostgreSQL 10 or newer
wal_level
needs to belogical
- currently supports only FOR ALL TABLES publication in source
- aiven-extras extension installed in both source and target database, or
- superuser or superuser-like privileges, such as
rds_replication
role in AWS RDS, in both source and target - AWS RDS additional settings/info
- schemas are migrated without object ownership; the user used for migration is given all object ownership
- NOTE: schema changes break logical replication
- roles with
LOGIN
attribute are migrated using placeholder passwords:placeholder_<16 char random string>
- migrating superuser or replication roles requires superuser privileges
- requires whitelisting the extension in target with pgextwlist, or
- superuser or superuser-like privileges
- Aiven for PostgreSQL supported extensions
Install build depends (Fedora):
$ make build-dep-fedora
Style checks:
$ make validate-style
Fix style errors with:
$ make isort
$ make yapf
Static checks (flake8
, pylint
and mypy
):
$ make static-checks
Tests (pytest
):
$ make test
Running whole test set takes time since all supported migration paths are tested. During development it's usually enough to run tests only for a certain PostgreSQL version, e.g.:
$ PG_VERSION="12" make test
It's also possible to test migration from one PostgreSQL version to another, e.g.:
$ PG_SOURCE_VERSION="10" PG_TARGET_VERSION="12" make test
Test set can be targeted even further by invoking pytest
, e.g.:
$ PG_SOURCE_VERSION="10" PG_TARGET_VERSION="12" python3 -m pytest -s test/test_pg_migrate.py::Test_PGMigrate::test_migrate
- JSON output with CLI (for automation)
- Hard to make pg_dump silent for outputting JSON to stdout
- Output json to file instead?
- More options
- --dump-only, --repl-only
- --include-databases, --exclude-databases
- --include-tables, --exclude-tables
- --role-passwords (role/passwords file for creating roles with real passwords instead of placeholders)
- More tests
- Notably error/corner cases
- Schema changes break logical replication
- While logical replication is running dump schema periodically and check if has changed, e.g. by calculating hash of the schema dump
- How to continue if schema has changed? Stop replication, dump schema and restart replication?
- Proper README + API doc
- RPM build recipe for aiven-core/prune integration
- Test automation: Jenkins/Github Actions