Skip to content
This repository has been archived by the owner on Sep 23, 2024. It is now read-only.

transferwise/pipelinewise-tap-oracle

 
 

Repository files navigation

Notice

To better serve Wise business and customer needs, the PipelineWise codebase needs to shrink. We have made the difficult decision that, going forward many components of PipelineWise will be removed or incorporated in the main repo. The last version before this decision is v0.64.1

We thank all in the open-source community, that over the past 6 years, have helped to make PipelineWise a robust product for heterogeneous replication of many many Terabytes, daily

pipelinewise-tap-oracle

PyPI version PyPI - Python Version License: MIT

Singer tap that extracts data from a Oracle database and produces JSON-formatted data following the Singer spec.

This is a PipelineWise compatible tap connector.

How to use it

The recommended method of running this tap is to use it from PipelineWise. When running it from PipelineWise you don't need to configure this tap with JSON files and most of things are automated. Please check the related documentation at Tap Oracle

If you want to run this Singer Tap independently please read further.

Log based replication

Tap-Oracle Log-based replication requires some configuration changes in Oracle database:

  • Enable ARCHIVELOG mode

  • Set retention period a reasonable and long enough period, ie. 1 day, 3 days, etc.

  • Enable Supplemental logging

Setting up Log-based replication on a self hosted Oracle Database:

To verify the current archiving mode, if the result is ARCHIVELOG, archiving is enabled:

  SQL> SELECT LOG_MODE FROM V$DATABASE

To enable ARCHIVELOG mode (if not enabled yet):

  SQL> SHUTDOWN IMMEDIATE
  SQL> STARTUP MOUNT
  SQL> ALTER DATABASE ARCHIVELOG
  SQL> ALTER DATABASE OPEN

To set retention period, use RMAN:

  RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 1 DAYS;

To enable supplemental logging:

  SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS

Setting up Log-based replication on Oracle on Amazon RDS

To set retention period:

  begin
      rdsadmin.rdsadmin_util.set_configuration(
          name  => 'archivelog retention hours',
          value => '24');
  end;

To enable supplemental logging:

  begin
    rdsadmin.rdsadmin_util.alter_supplemental_logging(p_action => 'ADD');
  end;

Install and Run

First, make sure Python 3 is installed on your system or follow these installation instructions for Mac or Ubuntu.

It's recommended to use a virtualenv:

  python3 -m venv venv
  pip install pipelinewise-tap-oracle

or

  python3 -m venv venv
  . venv/bin/activate
  pip install --upgrade pip
  pip install .

Configuration

Running the the tap requires a config.json file. Example with the minimal settings:

  {
    "host": "foo.com",
    "port": 1521,
    "user": "my_user",
    "password": "password",
    "sid": "ORCL",
    "filter_schemas": "MY_USER" # optional
  }

To run tests:

Tests require Oracle on Amazon RDS >= 12.1, and a user called ROOT.

  1. Define environment variables that requires running the tests.
  export TAP_ORACLE_HOST=<oracle-rds-host>
  export TAP_ORACLE_PORT=<oracle-rds-port>
  export TAP_ORACLE_USER=ROOT
  export TAP_ORACLE_PASSWORD=<oracle-rds-password>
  export TAP_ORACLE_SID=<oracle-rds-sid>
  1. Install python dependencies in a virtual env and run nose unit and integration tests
  python3 -m venv venv
  . venv/bin/activate
  pip install --upgrade pip
  pip install .
  pip install nose
  1. To run unit tests:
  nosetests

About

Singer.io Tap for Oracle - PipelineWise compatible

Resources

License

Security policy

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 100.0%