Commit

Using aggregation function on oracle and postgres (#8)
Using aggregation function on oracle and postgres to speed up the process

Signed-off-by: GregoireW <24318548+GregoireW@users.noreply.github.com>
Co-authored-by: Fawaz PARAISO <fawaz.paraiso@decathlon.com>
Co-authored-by: GregoireW <24318548+GregoireW@users.noreply.github.com>
3 people authored May 9, 2022
1 parent 153ae88 commit ba8f983
Showing 57 changed files with 1,650 additions and 4,994 deletions.
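
The commit message says the comparison now relies on aggregation functions on both Oracle and PostgreSQL. The code that does this is not visible in this excerpt, so what follows is only a minimal Python sketch of the general idea, under stated assumptions: each side folds a chunk of rows into a single digest, only the digests are compared, and rows are inspected individually only when a chunk's digests differ. The names `chunk_hash` and `compare_datasets`, the use of MD5 over `repr(row)`, and the in-memory lists standing in for database queries are all hypothetical and are not taken from scribedb.

```python
import hashlib
from itertools import zip_longest
from typing import List, Optional, Sequence, Tuple

Row = Tuple[object, ...]


def chunk_hash(rows: Sequence[Row]) -> str:
    """Fold a chunk of rows into one uppercase MD5 hex digest.

    Hypothetical stand-in for a server-side aggregate: a real tool would
    compute this inside the database so the rows never leave the server.
    """
    digest = hashlib.md5()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest().upper()


def compare_datasets(
    src: Sequence[Row], tgt: Sequence[Row], chunk_size: int
) -> List[Tuple[Optional[Row], Optional[Row]]]:
    """Return (src_row, tgt_row) pairs that differ.

    Rows are compared one by one only inside chunks whose digests differ.
    """
    differences: List[Tuple[Optional[Row], Optional[Row]]] = []
    for start in range(0, max(len(src), len(tgt)), chunk_size):
        src_chunk = src[start:start + chunk_size]
        tgt_chunk = tgt[start:start + chunk_size]
        if chunk_hash(src_chunk) == chunk_hash(tgt_chunk):
            continue  # a single digest comparison covers the whole chunk
        # Digests differ: fall back to a row-by-row check for this chunk only.
        for src_row, tgt_row in zip_longest(src_chunk, tgt_chunk):
            if src_row != tgt_row:
                differences.append((src_row, tgt_row))
    return differences


if __name__ == "__main__":
    src = [(i, i + 10, "abcdef") for i in range(6)]
    tgt = list(src)
    tgt[4] = (4, 14, "abc")  # a single differing row
    for src_row, tgt_row in compare_datasets(src, tgt, chunk_size=2):
        print("src:", src_row)
        print("tgt:", tgt_row)
```

Comparing one digest per chunk keeps the common case (identical data) cheap; the cost is a second, row-level pass over any chunk that mismatches.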
45 changes: 45 additions & 0 deletions .github/workflows/build.yaml
@@ -0,0 +1,45 @@
name: Build

on:
  pull_request: {}
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v3
        with:
          python-version: '3.9'

      - name: setup
        run: |
          pip install -r requirements.txt
          pip install -r requirements-dev.txt
      - name: test
        run: |
          coverage run -m unittest
          coverage xml
      - name: SonarCloud Scan
        uses: SonarSource/sonarcloud-github-action@master
        if: ${{ success() && github.event_name == 'push' && github.event.ref == 'refs/heads/main' }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}

      - name: build docker image
        run: docker build -t decathlon/scribedb:latest .

      - name: push
        if: ${{ success() && github.event_name == 'push' && github.event.ref == 'refs/heads/main' }}
        run: |
          echo "${{ secrets.DOCKERHUB_TOKEN }}" | docker login -u "${{ secrets.DOCKERHUB_USERNAME }}" --password-stdin
          docker push decathlon/scribedb:latest
22 changes: 22 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,22 @@
name: release version

on:
  release:
    types:
      - published

jobs:
  release:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3

      - name: build docker image
        run: docker build -t decathlon/scribedb:${{ github.event.release.tag_name }} .

      - name: push
        run: |
          echo "${{ secrets.DOCKERHUB_TOKEN }}" | docker login -u "${{ secrets.DOCKERHUB_USERNAME }}" --password-stdin
          docker push decathlon/scribedb:${{ github.event.release.tag_name }}
2 changes: 2 additions & 0 deletions .gitignore
@@ -4,3 +4,5 @@
*.pyc
*.DS_Store
env/
helm/
bin/
2 changes: 2 additions & 0 deletions .tool-versions
@@ -0,0 +1,2 @@
python 3.9.0
testcontainers 3.5.2
23 changes: 0 additions & 23 deletions .travis.yml

This file was deleted.

36 changes: 18 additions & 18 deletions Dockerfile
@@ -1,25 +1,25 @@
-FROM oraclelinux:7-slim
+FROM python:3.9-bullseye

-LABEL maintainer "oss@decathlon.com"
+LABEL org.opencontainers.image.authors="oss@decathlon.com"

-RUN curl -o /etc/yum.repos.d/public-yum-ol7.repo https://yum.oracle.com/public-yum-ol7.repo && \
-    yum -y install https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-oraclelinux96-9.6-3.noarch.rpm && \
-    yum-config-manager --enable ol7_oracle_instantclient && \
-    yum -y install oracle-instantclient18.3-basic oracle-instantclient18.3-devel oracle-instantclient18.3-sqlplus postgresql96 && \
-    echo /usr/lib/oracle/18.3/client64/lib > /etc/ld.so.conf.d/oracle-instantclient18.3.conf && \
-    ldconfig && \
-    yum install -y yum-utils && \
-    yum-config-manager --enable *EPEL && \
-    yum install -y python36 && \
-    yum install -y python36-pip && \
-    rm -rf /var/cache/yum
+RUN apt-get update && apt-get install -y --no-install-recommends alien libaio1 wget && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/185000/oracle-instantclient18.5-basic-18.5.0.0.0-3.x86_64.rpm && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/185000/oracle-instantclient18.5-devel-18.5.0.0.0-3.x86_64.rpm && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/185000/oracle-instantclient18.5-sqlplus-18.5.0.0.0-3.x86_64.rpm && \
+    alien -i oracle-instantclient18.5-basic-18.5.0.0.0-3.x86_64.rpm && \
+    alien -i oracle-instantclient18.5-devel-18.5.0.0.0-3.x86_64.rpm && \
+    alien -i oracle-instantclient18.5-sqlplus-18.5.0.0.0-3.x86_64.rpm && \
+    rm -f oracle-instantclient18.5-basic-18.5.0.0.0-3.x86_64.rpm && \
+    rm -f oracle-instantclient18.5-devel-18.5.0.0.0-3.x86_64.rpm && \
+    rm -f oracle-instantclient18.5-sqlplus-18.5.0.0.0-3.x86_64.rpm

-ENV PATH=$PATH:/usr/lib/oracle/18.3/client64/bin
-ENV LD_LIBRARY_PATH=usr/lib/oracle/18.3/client64/lib
+ENV LD_LIBRARY_PATH="/usr/lib/oracle/18.5/client64/lib:${LD_LIBRARY_PATH}"
+ENV PATH=$PATH:/usr/lib/oracle/18.5/client64/bin

 COPY requirements.txt .
-RUN pip3.6 install --no-cache-dir -r requirements.txt
+RUN pip3 install --no-cache-dir -r requirements.txt

-COPY scribedb/*.py /
+COPY main.py /
+COPY scribedb /scribedb

-CMD ["python3.6","./scribedb.py"]
+ENTRYPOINT ["python3","/main.py"]
16 changes: 0 additions & 16 deletions ENV_VARIABLES.md

This file was deleted.

53 changes: 0 additions & 53 deletions Jenkinsfile

This file was deleted.

2 changes: 1 addition & 1 deletion LICENCE → LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

-Copyright {2019} {DECATHLON}
+Copyright {2019-2022} {DECATHLON}

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
57 changes: 48 additions & 9 deletions README.md
@@ -1,30 +1,69 @@
# scribedb
[![Build Status](https://travis-ci.com/Decathlon/scribedb.svg?branch=pmpetit)](https://travis-ci.com/Decathlon/scribedb)

Compare data at schema level between postgresql and oracle. postgresql is server1 and oracle is server2 in this document.
![Build workflow](https://github.com/decathlon/scribedb/actions/workflows/build.yaml/badge.svg?branch=main)
![Last version](https://img.shields.io/github/v/release/decathlon/scribedb.svg)

Global Concept [Scribedb in gdoc](https://docs.google.com/presentation/d/1fm95I4YT40y5ZUj8Yaqxk-MaZO0ILIwpwGKuuNAk3JY/edit?usp=sharing)
The tool aims to compare data at schema level between two databases.

For instance, if we compare two dataset with a data difference on a single line, we may end up with a result like:

```text
1/3 NOK tgt hash:(6E12FA362B03456CC7601ABEBD454F35) (in 4532.164ms) 40%
src:(50, 60, 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ')
tgt:(50, 60, 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNO')
2/3 OK src hash:(5D1FE7284E48A7F751672C7096F3FE98) (in 163.449ms) 79%
Dataset are different
```

Today, postgresql and oracle databases are supported.

Here a global concept overview: [Scribedb in gdoc](https://docs.google.com/presentation/d/1fm95I4YT40y5ZUj8Yaqxk-MaZO0ILIwpwGKuuNAk3JY/edit?usp=sharing)

## Getting Started

Clone the project, it comes with an example directory. This example will create 3 containers (Postgres,Oracle,Scribedb) and will compare data bewteen dbs.
You can check the [example](example.md).

If you want to launch scribedb you can also use the docker image:

```bash
# We assume the configuration file in /working/dir/config.yaml reference the password DB1_PASS and DB2_PASS
$ docker run --rm -v /working/dir/config.yaml:/config.yaml -e DB1_PASS=xxxxx -e DB2_PASS=xxxxx decathlon/scribedb:2.0 -f /config.yaml
```

## Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
New features are always welcome (but first, you should open an issue to discuss new idea)
Please read [contributing](CONTRIBUTING.md) and our [code of conduct](CODE_OF_CONDUCT.md), to check the process for submitting improvements/new features.

## Versioning

We use [SemVer](http://semver.org/) for versioning. For the versions available, see the [tags on this repository](https://github.com/dktunited/scribedb/tags).
We use [SemVer](http://semver.org/) for versioning. For the versions available, see the [tags on this repository](https://github.com/dktunited/scribedb/tags).

## Authors

* **Pierre-Marie Petit** - *Initial work*
* **Pierre-Marie Petit** - *Initial work*

See also the list of contributors who participated in this project.
See also the list of [contributors](CONTRIBUTORS.md) who participated in this project.

## Acknowledgments

* Hat tip to anyone whose code was used
* Inspiration
* etc

## License

> Copyright 2019-2022 Decathlon.
>
> Licensed under the Apache License, Version 2.0 (the "License");
> you may not use this file except in compliance with the License.
> You may obtain a copy of the License at
>
> http://www.apache.org/licenses/LICENSE-2.0
>
> Unless required by applicable law or agreed to in writing, software
> distributed under the License is distributed on an "AS IS" BASIS,
> WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> See the License for the specific language governing permissions and
> limitations under the License.
[Full license](LICENSE)
58 changes: 0 additions & 58 deletions docker-compose.yml

This file was deleted.
