Feature/ Runtime SQL storage for import & export jobs (#463)
* Move test models for unit testing into test resources package

* Model API for data storage (experimental)

* New plugin API for data storage (not public yet)

* First pass on unit tests for data storage API

* First pass on context impl for new data storage wrapper API

* Do not expose stdlib in new storage plugin interface

* Use new internal storage plugin interface in _impl.storage

* Generalise SQL integration testing in CI

* Test suite for the new IDataStorageBase interface

* First implementation of SQL data plugin (works with MySQL)

* Update SQL plugin and alchemy driver to accept URL credentials

* Move alchemy driver into the main storage_sql plugin module

* Skip data context tests for now (local implementation needed)

* Python integration test hook, run data storage suite for SQL storage

* Apply commit to create table statement (required for Postgres)

* Optional SQL dependency for the dist package (SQL Alchemy only)

* Plugin requirements for SQL testing

* Add storage integration tests for external SQL storage in Python - MySQL, MariaDB and Postgres

* Disable cloud integration tests while testing SQL CI

* Fix sed command in SQL CI workflow

* Do not install mariadb driver yet (binary dependencies)

* Fix max varchar size for MySQL

* Fix CI configs for MariaDB and Postgresql

* Do not write float NaN to postgres backend (not supported)

* Reduce varchar max when creating MySQL tables

* Fix dialect in MySQL CI config

* Fix type for boolean fields in MySQL dialect

* Try using pymysql to talk to MariaDB in CI

* Exceptions for MariaDB in the storage suite

* Add internal extensions module to setup.cfg

* Try to run CI for SQL Server external storage

* Install ODBC package for SQL server integration testing

* Update SQL Server driver in config file to match version installed in CI

* Skips / exclusions for SQL Server in the data storage suite

* Use ANSI Standard as the base SQL dialect

* Example of a selective import using the data storage API

* Add helpers to define ARRAY / MAP types for model parameters

* Use DB API description to get field names for SQL responses

* Fix base class for SQL Server dialect

* Add tests for illegal SQL queries (DML, DDL)

* Add some extra checks / protections in SQL storage impl

* Fix one IDE warning

* Allow dev-mode config parser to handle lists (for model parameters)

* Wire up new external storage to make it available in the context for import / export jobs

* Add some validation for set_source_metadata()

* Delay guard rail protections until after plugin loading is complete

* Re-enable cloud storage CI jobs

* Add MIT-0 variation of MIT license to Python licenses config

* Remove unneeded vars mapping in SQL CI job

* Update result set processing
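
The note above about using the DB API description to get field names can be illustrated with a minimal sketch. This is not the plugin's code: the helper name fetch_as_dicts and the customers table are hypothetical, and sqlite3 stands in for the real databases only because it ships with Python and follows the same DB API cursor interface.

```python
import sqlite3


def fetch_as_dicts(conn, query):
    """Run a query and return rows keyed by the column names the driver reports.

    Hypothetical helper for illustration only.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        # DB API: cursor.description is a sequence of 7-item tuples,
        # and the first item of each tuple is the column name
        field_names = [column[0] for column in cursor.description]
        return [dict(zip(field_names, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# In-memory database with a made-up table, just to exercise the helper
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'alice')")

rows = fetch_as_dicts(conn, "SELECT id, name FROM customers")
print(rows)
```

Reading names from the cursor rather than parsing the query text means aliases and expressions are handled by the driver, which is presumably why the commit takes this route.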
Martin Traverse authored Oct 29, 2024
1 parent 84bc9bc commit c6cd697
Showing 30 changed files with 2,552 additions and 175 deletions.
14 changes: 14 additions & 0 deletions .github/config/rt-storage-ext-mariadb.yaml
@@ -0,0 +1,14 @@


storage:

  external:

    data_integration:
      protocol: SQL
      properties:
        dialect: MARIADB
        driver.python: alchemy
        alchemy.url: mariadb+pymysql://metadb:3306/trac
        alchemy.username: trac_admin
        alchemy.password: DB_SECRET
14 changes: 14 additions & 0 deletions .github/config/rt-storage-ext-mysql.yaml
@@ -0,0 +1,14 @@


storage:

  external:

    data_integration:
      protocol: SQL
      properties:
        dialect: MYSQL
        driver.python: alchemy
        alchemy.url: mysql+pymysql://metadb:3306/trac
        alchemy.username: trac_admin
        alchemy.password: DB_SECRET
14 changes: 14 additions & 0 deletions .github/config/rt-storage-ext-postgresql.yaml
@@ -0,0 +1,14 @@


storage:

  external:

    data_integration:
      protocol: SQL
      properties:
        dialect: POSTGRESQL
        driver.python: alchemy
        alchemy.url: postgresql+pg8000://metadb:5432/trac
        alchemy.username: trac_admin
        alchemy.password: DB_SECRET
14 changes: 14 additions & 0 deletions .github/config/rt-storage-ext-sqlserver.yaml
@@ -0,0 +1,14 @@


storage:

  external:

    data_integration:
      protocol: SQL
      properties:
        dialect: SQLSERVER
        driver.python: alchemy
        alchemy.url: mssql+pyodbc://metadb:1433/master?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes
        alchemy.username: sa
        alchemy.password: DB_SECRET
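
The four configs above keep the credentials (alchemy.username, alchemy.password) separate from alchemy.url. How the plugin combines them is not shown in this part of the diff, so the following is only a sketch of one possible approach using the Python standard library. The function name with_credentials is hypothetical, and the real driver may hand the values to SQLAlchemy in a different way.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url, username, password):
    """Return a copy of url with percent-encoded user info added to the host part.

    Hypothetical helper for illustration; not taken from the SQL plugin.
    """
    parts = urlsplit(url)
    # Encode everything that is not URL-safe, so characters such as '!' or '@'
    # in a password cannot be mistaken for URL syntax
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# URL and credentials as they appear in the PostgreSQL test config
print(with_credentials("postgresql+pg8000://metadb:5432/trac", "trac_admin", "trac_admin"))
```

Keeping the password out of the URL in the config file means the CI job only has to substitute one placeholder value, and the URL itself never needs escaping by hand.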
190 changes: 190 additions & 0 deletions .github/workflows/integration-sql.yaml
@@ -0,0 +1,190 @@
name: Integration (SQL)

on:
  workflow_call:
    inputs:
      matrix:
        required: true
        type: string
      dialect:
        required: true
        type: string
      db_image:
        required: true
        type: string
      db_port:
        required: true
        type: number
      db_options:
        required: false
        type: string


# Use latest supported language versions for integration testing
env:
  JAVA_VERSION: "21"
  JAVA_DISTRIBUTION: "zulu"
  PYTHON_VERSION: "3.12"
  NODE_VERSION: "22"


jobs:

  int-metadb:

    name: int-metadb-java-${{ inputs.dialect }}

    env:
      BUILD_sql_${{ inputs.dialect }}: true
      TRAC_CONFIG_FILE: ".github/config/int-metadb-${{ inputs.dialect }}.yaml"
      TRAC_SECRET_KEY: "testing_secret"

    runs-on: ubuntu-latest
    timeout-minutes: 20

    container:
      image: ubuntu:latest

    services:

      metadb:

        image: ${{ inputs.db_image }}
        ports:
          - ${{ inputs.db_port }}:${{ inputs.db_port }}
        options: ${{ inputs.db_options }}

        # DB container needs various env vars defined in the matrix
        env: ${{ fromJson( inputs.matrix ) }}

    steps:

      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Java
        uses: actions/setup-java@v4
        with:
          distribution: ${{ env.JAVA_DISTRIBUTION }}
          java-version: ${{ env.JAVA_VERSION }}

      - name: Build
        run: ./gradlew trac-svc-meta:testClasses

      # Auth tool will also create the secrets file if it doesn't exist
      - name: Prepare secrets
        env: ${{ fromJson( inputs.matrix ) }}
        run: |
          ./gradlew secret-tool:run --args="--config ${{ env.TRAC_CONFIG_FILE }} --task init_secrets"
          ./gradlew secret-tool:run --args="--config ${{ env.TRAC_CONFIG_FILE }} --task create_root_auth_key EC 256"
          echo "${DB_SECRET}" | ./gradlew secret-tool:run --args="--config ${{ env.TRAC_CONFIG_FILE }} --task add_secret metadb_secret"

      # The name and description of the test tenant are verified in one of the test cases so they need to match
      # MetadataReapApiTest listTenants()
      - name: Prepare database
        run: |
          ./gradlew deploy-metadb:run --args="\
          --config ${{ env.TRAC_CONFIG_FILE }} \
          --secret-key ${{ env.TRAC_SECRET_KEY }} \
          --task deploy_schema \
          --task add_tenant ACME_CORP 'Test tenant [ACME_CORP]'"

      - name: Integration tests
        run: ./gradlew trac-svc-meta:integration -DintegrationTags="int-metadb"

      # If the tests fail, make the output available for download
      - name: Store failed test results
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: junit-test-results
          path: build/modules/*/reports/**
          retention-days: 7

  int-storage--python:

    name: int-storage-python-${{ inputs.dialect }}

    env:
      BUILD_sql_${{ inputs.dialect }}: true
      TRAC_RT_SYS_CONFIG: ".github/config/rt-storage-ext-${{ inputs.dialect }}.yaml"

    runs-on: ubuntu-latest
    timeout-minutes: 20

    container:
      image: ubuntu:latest

    services:

      metadb:

        image: ${{ inputs.db_image }}
        ports:
          - ${{ inputs.db_port }}:${{ inputs.db_port }}
        options: ${{ inputs.db_options }}

        # DB container needs various env vars defined in the matrix
        env: ${{ fromJson( inputs.matrix ) }}

    steps:

      # https://learn.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server
      - name: Install ODBC (if required)
        if: ${{ inputs.dialect == 'sqlserver' }}
        run: |
          apt-get update
          apt-get install -y curl gpg lsb-release
          LSB_RELEASE=`lsb_release -rs`
          curl "https://packages.microsoft.com/keys/microsoft.asc" | gpg --dearmor -o /usr/share/keyrings/microsoft-prod.gpg
          curl "https://packages.microsoft.com/config/ubuntu/${LSB_RELEASE}/prod.list" | tee /etc/apt/sources.list.d/mssql-release.list
          apt-get update
          ACCEPT_EULA=Y apt-get install -y msodbcsql18
          ACCEPT_EULA=Y apt-get install -y mssql-tools18

      - name: Checkout
        uses: actions/checkout@v4

      - name: Pre-process config
        env: ${{ fromJson( inputs.matrix ) }}
        run: |
          sed -i "s#DB_SECRET#${DB_SECRET}#" ${TRAC_RT_SYS_CONFIG}

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Upgrade PIP
        run: |
          python -m venv ./venv
          . ./venv/bin/activate
          python -m pip install --upgrade pip

      # Filter plugin dependencies, only install for the plugin being tested
      # This prevents dependency issues in one plugin affecting all the others
      - name: Select plugin dependencies
        run: |
          cd tracdap-runtime/python
          sed -n '/BEGIN_PLUGIN sql/, /END_PLUGIN sql/p' requirements_plugins.txt > requirements_selected.txt

      - name: Install dependencies
        run: |
          . ./venv/bin/activate
          cd tracdap-runtime/python
          pip install -r requirements.txt
          pip install -r requirements_selected.txt

      - name: Protoc code generation
        run: |
          . ./venv/bin/activate
          python tracdap-runtime/python/build_runtime.py --target codegen

      - name: Integration tests
        run: |
          . ./venv/bin/activate
          python tracdap-runtime/python/build_runtime.py --target integration --pattern int_storage_sql*.py
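
The "Select plugin dependencies" step in the workflow above relies on a sed address range to copy only the lines between the BEGIN_PLUGIN sql and END_PLUGIN sql markers. A rough Python equivalent is sketched below to make the selection rule explicit. The function name and the sample requirements content are assumptions for illustration; the real layout of requirements_plugins.txt is not shown in this diff.

```python
def select_plugin_block(lines, plugin):
    """Keep lines from a BEGIN_PLUGIN marker through the matching END_PLUGIN marker.

    Approximates sed -n '/BEGIN_PLUGIN x/, /END_PLUGIN x/p' using plain
    substring matching. Hypothetical helper, not part of the repository.
    """
    begin = f"BEGIN_PLUGIN {plugin}"
    end = f"END_PLUGIN {plugin}"
    selected = []
    inside = False
    for line in lines:
        if inside:
            selected.append(line)
            # The closing marker is included, then selection stops
            if end in line:
                inside = False
        elif begin in line:
            # The opening marker is included and selection starts
            inside = True
            selected.append(line)
    return selected


# Made-up file content; only the marker convention comes from the workflow
sample = [
    "# BEGIN_PLUGIN aws",
    "boto3",
    "# END_PLUGIN aws",
    "# BEGIN_PLUGIN sql",
    "sqlalchemy",
    "# END_PLUGIN sql",
]

print(select_plugin_block(sample, "sql"))
```

Like the sed pattern, the substring match would also fire on a marker whose plugin name merely starts with "sql", so marker names need to be chosen with that in mind.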
105 changes: 20 additions & 85 deletions .github/workflows/integration.yml
@@ -214,13 +214,7 @@ jobs:
           retention-days: 7


-  int-metadb:
-
-    runs-on: ubuntu-latest
-    timeout-minutes: 20
-
-    container:
-      image: ubuntu:latest
+  int-sql:

     strategy:

@@ -231,114 +225,55 @@ jobs:

         database:

-          - { DB_NAME: MySQL,
+          - { DIALECT: mysql,
               DB_IMAGE: 'mysql:8.4',
               DB_PORT: 3306,
               DB_OPTIONS: '--health-cmd="mysqladmin ping" --health-interval=10s --health-timeout=5s --health-retries=3',
-              BUILD_sql_mysql: true,
-              TRAC_CONFIG_FILE: '.github/config/int-metadb-mysql.yaml',
-              TRAC_SECRET_KEY: wDeq3x-NjaLL7,
+              MYSQL_ALLOW_EMPTY_PASSWORD: yes,
               MYSQL_DATABASE: trac,
               MYSQL_USER: trac_admin,
               MYSQL_PASSWORD: trac_admin,
-              METADB_SECRET: trac_admin,
-              MYSQL_ALLOW_EMPTY_PASSWORD: yes }
+              DB_SECRET: trac_admin }

-          - { DB_NAME: MariaDB,
+          - { DIALECT: mariadb,
               DB_IMAGE: 'mariadb:11.4',
               DB_PORT: 3306,
               DB_OPTIONS: '--health-cmd="healthcheck.sh --innodb_initialized" --health-interval=10s --health-timeout=5s --health-retries=3',
-              BUILD_sql_mariadb: true,
-              TRAC_CONFIG_FILE: '.github/config/int-metadb-mariadb.yaml',
-              TRAC_SECRET_KEY: uYhnKwq8+esS,
+              MYSQL_ALLOW_EMPTY_PASSWORD: yes,
               MYSQL_DATABASE: trac,
               MYSQL_USER: trac_admin,
               MYSQL_PASSWORD: trac_admin,
-              METADB_SECRET: trac_admin,
-              MYSQL_ALLOW_EMPTY_PASSWORD: yes }
+              DB_SECRET: trac_admin }

-          - { DB_NAME: PostgreSQL,
+          - { DIALECT: postgresql,
               DB_IMAGE: 'postgres:16-alpine',
               DB_PORT: 5432,
               DB_OPTIONS: '--health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5',
-              BUILD_sql_postgresql: true,
-              TRAC_CONFIG_FILE: '.github/config/int-metadb-postgresql.yaml',
-              TRAC_SECRET_KEY: hjXks83bX=wxMr,
               POSTGRES_DB: trac,
               POSTGRES_USER: trac_admin,
-              METADB_SECRET: trac_admin,
-              POSTGRES_PASSWORD: trac_admin }
+              POSTGRES_PASSWORD: trac_admin,
+              DB_SECRET: trac_admin,}

-          - { DB_NAME: SQLServer,
+          - { DIALECT: sqlserver,
               DB_IMAGE: 'mcr.microsoft.com/mssql/server:2022-latest',
               DB_PORT: 1433,
               DB_OPTIONS: '-e "NO_DB_OPTIONS=not_used"', # docker run -e flag sets an env variable, passing '' causes errors
-              BUILD_sql_sqlserver: true,
-              TRAC_CONFIG_FILE: '.github/config/int-metadb-sqlserver.yaml',
-              TRAC_SECRET_KEY: unHkj>weN2jSl,
               MSSQL_PID: Developer,
               ACCEPT_EULA: Y,
               SA_PASSWORD: "tR4c_aDm!n",
-              METADB_SECRET: "tR4c_aDm!n" }
-
-    env: ${{ matrix.database }}
-
-    services:
-
-      metadb:
-
-        image: ${{ matrix.database.DB_IMAGE }}
-        env: ${{ matrix.database }}
-        ports:
-          - ${{ matrix.database.DB_PORT }}:${{ matrix.database.DB_PORT }}
-        options: ${{ matrix.database.DB_OPTIONS }}
-
-    steps:
+              DB_SECRET: "tR4c_aDm!n" }

-      - name: Checkout
-        uses: actions/checkout@v4
-
-      - name: Set up Java
-        uses: actions/setup-java@v4
-        with:
-          distribution: ${{ env.JAVA_DISTRIBUTION }}
-          java-version: ${{ env.JAVA_VERSION }}
+    uses: ./.github/workflows/integration-sql.yaml

-      - name: Build
-        run: ./gradlew trac-svc-meta:testClasses
-
-      # Auth tool will also create the secrets file if it doesn't exist
-      - name: Prepare secrets
-        run: |
-          ./gradlew secret-tool:run --args="--config ${{ env.TRAC_CONFIG_FILE }} --task init_secrets"
-          ./gradlew secret-tool:run --args="--config ${{ env.TRAC_CONFIG_FILE }} --task create_root_auth_key EC 256"
-          echo "${METADB_SECRET}" | ./gradlew secret-tool:run --args="--config ${{ env.TRAC_CONFIG_FILE }} --task add_secret metadb_secret"
-
-      # The name and description of the test tenant are verified in one of the test cases so they need to match
-      # MetadataReapApiTest listTenants()
-
-      - name: Prepare database
-        run: |
-          ./gradlew deploy-metadb:run --args="\
-          --config ${{ env.TRAC_CONFIG_FILE }} \
-          --secret-key ${{ env.TRAC_SECRET_KEY }} \
-          --task deploy_schema \
-          --task add_tenant ACME_CORP 'Test tenant [ACME_CORP]'"
-
-      - name: Integration tests
-        run: ./gradlew trac-svc-meta:integration -DintegrationTags="int-metadb"
-
-      # If the tests fail, make the output available for download
-      - name: Store failed test results
-        uses: actions/upload-artifact@v4
-        if: failure()
-        with:
-          name: junit-test-results
-          path: build/modules/*/reports/**
-          retention-days: 7
+    with:
+      matrix: ${{ toJson( matrix.database ) }}
+      dialect: ${{ matrix.database.DIALECT }}
+      db_image: ${{ matrix.database.DB_IMAGE }}
+      db_port: ${{ matrix.database.DB_PORT }}
+      db_options: ${{ matrix.database.DB_OPTIONS }}


-  int-storage:
+  int-cloud-storage:

     strategy:
