MobyDQ is a tool for data engineering teams to automate data quality checks on their data pipelines, capture data quality issues and trigger alerts when anomalies are detected, regardless of the data sources they use.
This tool was inspired by an internal project developed at Ubisoft Entertainment to measure and improve the data quality of its Enterprise Data Platform. This open source version has been reworked to improve and simplify its design and to remove technical dependencies on commercial software.
Skip the blah blah and run your data quality indicators by following the Getting Started page. The complete documentation is also available on GitHub Pages: https://ubisoft.github.io/mobydq.
Some screenshots of the web application to give you a taste of what it looks like.
Run MobyDQ in development mode with the following command:
$ cd mobydq
$ docker-compose -f docker-compose.yml -f docker-compose.dev.yml up db graphql app nginx
Run MobyDQ in production mode with the following command. The -d flag runs the containers in the background (detached mode).
$ cd mobydq
$ docker-compose up -d db graphql app nginx
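If you switch between the two modes often, the commands above could be wrapped in a small shell function. This is only a sketch: `mobydq_compose` is a hypothetical helper name, not something shipped with MobyDQ, and it only prints the command for the requested mode so you can review it before running it.

```shell
# Hypothetical helper (not part of MobyDQ): prints the docker-compose
# command for the requested mode instead of executing it.
mobydq_compose() {
    case "$1" in
        dev)
            # Development mode: base file plus the dev override file
            echo "docker-compose -f docker-compose.yml -f docker-compose.dev.yml up db graphql app nginx"
            ;;
        prod)
            # Production mode: base file only, containers detached (-d)
            echo "docker-compose up -d db graphql app nginx"
            ;;
        *)
            # Unknown mode: print usage to stderr and signal failure
            echo "usage: mobydq_compose dev|prod" >&2
            return 1
            ;;
    esac
}
```

From the mobydq directory, `mobydq_compose prod` prints the production command, and `sh -c "$(mobydq_compose prod)"` would run it.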
You can run tests using the following commands:
$ cd mobydq
# Start test database instances
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml up -d db graphql
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml up -d db-cloudera db-mysql db-mariadb db-postgresql db-sql-server
# Run tests
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml up test-db test-scripts
# Run linter
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml build test-scripts test-lint-python
$ docker run --rm mobydq-test-lint-python pylint scripts test
- To be documented