py-mongo-sync

py-mongo-sync synchronizes data from a replica set to another MongoDB deployment, such as a standalone instance, a replica set, or a sharded cluster.

It is oplog-based and provides real-time data synchronization.

It's written in Python 2.7.

Support

MongoDB 2.4
MongoDB 2.6
MongoDB 3.0
MongoDB 3.2
MongoDB 3.4

Features

initial sync and oplog-based incremental sync
sync only the specified databases and collections
concurrent oplog replaying
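One common way to replay oplog entries concurrently while preserving per-document order is to route every entry to a worker chosen by hashing the document's `_id`. The sketch below illustrates that idea only; it is not taken from the mongosync source. It uses `zlib.crc32` from the standard library as a stand-in for a hash such as mmh3, and the name `partition_oplog`, the worker count, and the sample entries are hypothetical.

```python
import zlib


def partition_oplog(entries, num_workers):
    """Group oplog-like entries into per-worker queues by hashing ns + _id."""
    queues = [[] for _ in range(num_workers)]
    for entry in entries:
        # Updates carry the target _id in 'o2'; inserts and deletes carry it in 'o'.
        doc_id = entry.get('o2', entry['o'])['_id']
        key = ('%s:%s' % (entry['ns'], doc_id)).encode('utf-8')
        # Same namespace and _id always hash to the same queue, so order is kept.
        queues[zlib.crc32(key) % num_workers].append(entry)
    return queues


# Hypothetical entries shaped like oplog documents.
entries = [
    {'op': 'i', 'ns': 'app.users', 'o': {'_id': 1, 'name': 'a'}},
    {'op': 'u', 'ns': 'app.users', 'o2': {'_id': 1}, 'o': {'$set': {'name': 'b'}}},
    {'op': 'i', 'ns': 'app.users', 'o': {'_id': 2, 'name': 'c'}},
    {'op': 'd', 'ns': 'app.users', 'o': {'_id': 1}},
]
queues = partition_oplog(entries, 4)
```

Each queue can then be replayed by its own worker, since operations on the same document never end up in different queues.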

Requirements

See requirements.txt for details.

gevent
toml
mmh3
pymongo

Always use pymongo 3.5.1.

Later versions drop support for MongoDB 2.4. See the PyMongo 3.6 changelog at https://api.mongodb.com/python/3.6.0/changelog.html:

Version 3.6 adds support for MongoDB 3.6, drops support for CPython 3.3 (PyPy3 is still supported), and drops support for MongoDB versions older than 2.6. If connecting to a MongoDB 2.4 server or older, PyMongo now throws a ConfigurationError.

Notice

source MUST be a replica set
ignore system databases
- admin
- local
ignore system collections
- system.*
create users on the destination manually if necessary
authenticating as an administrator is recommended if the source has authentication enabled
geospatial indexes are not supported
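The ignore rules for system databases and collections can be expressed as a small namespace filter. This is a minimal sketch of those rules, not the tool's actual code; the names `SYSTEM_DATABASES` and `is_ignored` are hypothetical.

```python
SYSTEM_DATABASES = ('admin', 'local')


def is_ignored(ns):
    """Return True if a 'db.collection' namespace matches the ignore rules."""
    # Split on the first dot only: collection names may contain dots.
    db, _, coll = ns.partition('.')
    return db in SYSTEM_DATABASES or coll.startswith('system.')


for ns in ('admin.system.users', 'local.oplog.rs', 'app.system.indexes', 'app.users'):
    print('%s ignored=%s' % (ns, is_ignored(ns)))
```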

if the source is a sharded cluster

first, stop the balancer
then, start a separate sync process for each shard

Configurations

The configuration file uses the TOML format.

Refer to mongo_conf.toml.

src

Source config items.

src.hosts - hostportstr of a replica set member
src.username - username
src.password - password
src.authdb - authentication database

dst

Destination config items.

dst.mongo.hosts - hostportstr of a mongos or mongod instance
dst.mongo.authdb - authentication database
dst.mongo.username - username
dst.mongo.password - password

sync

Custom options for synchronization.

sync.dbs specifies the databases to sync. sync.dbs.colls specifies the collections to sync.

sync.dbs - databases to sync; all databases are synced if not specified
- sync.dbs.db - source database name
- sync.dbs.rename_db - destination database name; stays the same as the source if not specified
- sync.dbs.colls - collections to sync; all collections are synced if not specified

In each sync.dbs.colls element, coll specifies the collection to sync and fields specifies which fields of that collection to sync.
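The sync options can be read as a mapping from source databases to destination names and collection filters. The sketch below shows one way to interpret them, assuming the TOML file has already been parsed into a Python dict and that sync.dbs and sync.dbs.colls are lists of tables with the keys `db`, `rename_db`, `colls`, `coll`, and `fields`. That shape, the function name `build_namespace_map`, and the sample names are assumptions; check mongo_conf.toml for the authoritative layout.

```python
def build_namespace_map(conf):
    """Map each selected source database to its destination name and collections."""
    result = {}
    for db_item in conf.get('sync', {}).get('dbs', []):
        src_db = db_item['db']
        # rename_db falls back to the source name when it is omitted.
        dst_db = db_item.get('rename_db', src_db)
        colls = {}
        for coll_item in db_item.get('colls', []):
            # In this sketch an empty field list means "sync whole documents".
            colls[coll_item['coll']] = coll_item.get('fields', [])
        # In this sketch an empty colls dict means "sync all collections".
        result[src_db] = {'dst_db': dst_db, 'colls': colls}
    return result


# Hypothetical parsed configuration.
conf = {
    'sync': {
        'dbs': [
            {'db': 'app', 'rename_db': 'app_copy',
             'colls': [{'coll': 'users', 'fields': ['name', 'email']},
                       {'coll': 'orders'}]},
            {'db': 'logs'},
        ]
    }
}
mapping = build_namespace_map(conf)
```

An empty `sync.dbs` list yields an empty mapping, which a caller would treat as "sync all databases".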

log

log.filepath - log file path, write to stdout if empty or not set
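The fallback to stdout can be pictured with the standard `logging` module. This is only an illustration of the documented behavior, not the tool's implementation; `make_log_handler` is a hypothetical helper.

```python
import logging
import sys


def make_log_handler(filepath=None):
    """Return a file handler when a path is given, otherwise log to stdout."""
    if filepath:
        return logging.FileHandler(filepath)
    # An empty string and None both count as "not set".
    return logging.StreamHandler(sys.stdout)


handler = make_log_handler('')
logger = logging.getLogger('sync-sketch')
logger.addHandler(handler)
logger.warning('logging to stdout because log.filepath is empty')
```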

Usage

Command-line options have functional limitations. Using a config file is strongly recommended.

sync

usage: sync.py [-h] [-f [CONFIG]] [--src [SRC]] [--src-authdb [SRC_AUTHDB]]
               [--src-username [SRC_USERNAME]] [--src-password [SRC_PASSWORD]]
               [--dst [DST]] [--dst-authdb [DST_AUTHDB]]
               [--dst-username [DST_USERNAME]] [--dst-password [DST_PASSWORD]]
               [--start-optime [START_OPTIME]]
               [--optime-logfile [OPTIME_LOGFILE]] [--logfile [LOGFILE]]

Sync data from a replica-set to another MongoDB/Elasticsearch.

optional arguments:
  -h, --help            show this help message and exit
  -f [CONFIG], --config [CONFIG]
                        configuration file, note that command options will
                        override items in config file
  --src [SRC]           source should be hostportstr of a replica-set member
  --src-authdb [SRC_AUTHDB]
                        src authentication database, default is 'admin'
  --src-username [SRC_USERNAME]
                        src username
  --src-password [SRC_PASSWORD]
                        src password
  --dst [DST]           destination should be hostportstr of a mongos or
                        mongod instance
  --dst-authdb [DST_AUTHDB]
                        dst authentication database, default is 'admin', for
                        MongoDB
  --dst-username [DST_USERNAME]
                        dst username, for MongoDB
  --dst-password [DST_PASSWORD]
                        dst password, for MongoDB
  --start-optime [START_OPTIME]
                        timestamp in second, indicates oplog based increment
                        sync
  --optime-logfile [OPTIME_LOGFILE]
                        optime log file path, use this as start optime if
                        without '--start-optime'
  --logfile [LOGFILE]   log file path
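The `--start-optime` option takes a timestamp in seconds. Assuming this is a Unix timestamp (seconds since the epoch, as in oplog timestamps), the value for a given UTC time can be computed as sketched below. The helper name `to_start_optime` and the sample date are hypothetical.

```python
import calendar
from datetime import datetime


def to_start_optime(utc_dt):
    """Convert a naive datetime, interpreted as UTC, to whole epoch seconds."""
    # calendar.timegm treats the time tuple as UTC, unlike time.mktime.
    return calendar.timegm(utc_dt.timetuple())


start = to_start_optime(datetime(2018, 1, 1, 0, 0, 0))
print('--start-optime %d' % start)
```

Using `calendar.timegm` rather than `time.mktime` keeps the result independent of the local time zone of the machine running the script.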

TODO List

command options tuning
config file format tuning
sync sharding config (enableSharding & shardCollection)
