Python tools to assist with standardized data ingestion workflows


Installation, Usage, and Release Management

Install from PyPI

pip install osc-ingest-tools


>>> from osc_ingest_trino import *

>>> import pandas as pd

>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]

>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()

>>> df
  First Name  Age In Years
0        tom            10
1       nick            15
2       juli            14

>>> enforce_sql_column_names(df)
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14

>>> enforce_sql_column_names(df, inplace=True)

>>> df
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14

>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      string
 1   age_in_years  3 non-null      Int64
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes

>>> p = create_table_schema_pairs(df)

>>> print(p)
    first_name varchar,
    age_in_years bigint
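
The session above can be approximated without the library to see what is going on. The following is a minimal sketch, not the library's implementation: `sql_safe_name` and `create_table_sql` are hypothetical helpers, the normalization rule (lower-case, replace anything that is not a letter, digit, or underscore) is an assumption inferred from the output shown, and `demo_schema.people` is a made-up table name.

```python
import re


def sql_safe_name(name: str) -> str:
    """Hypothetical helper: lower-case a label and replace every character
    that is not a letter, digit, or underscore with '_'."""
    return re.sub(r"[^a-z0-9_]", "_", name.strip().lower())


def create_table_sql(table: str, schema_pairs: str) -> str:
    """Hypothetical helper: wrap a column-definition string, in the format
    printed by create_table_schema_pairs above, in a CREATE TABLE statement."""
    return f"create table if not exists {table} (\n{schema_pairs}\n)"


columns = [sql_safe_name(c) for c in ["First Name", "Age In Years"]]
print(columns)  # ['first_name', 'age_in_years']

# Same text as the printed value of `p` in the session above.
pairs = "    first_name varchar,\n    age_in_years bigint"
print(create_table_sql("demo_schema.people", pairs))
```

The string returned by `create_table_schema_pairs` is already indented and comma-separated, so it can be dropped between the parentheses of a DDL statement as-is.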


Adding custom type mappings to create_table_schema_pairs

>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])

>>> enforce_sql_column_names(df, inplace=True)

>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      object
 1   age_in_years  3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes

>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})

>>> print(p)
    first_name varchar,
    age_in_years bigint
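
One way to picture the `typemap` argument is as a dictionary of overrides merged on top of a built-in dtype-to-SQL table. The sketch below illustrates that idea in plain Python. It is an assumption about the behavior, not the library's code: `DEFAULT_TYPEMAP`, `schema_pairs`, and the choice to raise `ValueError` for an unmapped dtype are all hypothetical.

```python
from typing import Mapping, Optional

# Assumed defaults, for illustration only; the library's real table may differ.
DEFAULT_TYPEMAP = {
    "string": "varchar",
    "Int64": "bigint",
    "int64": "bigint",
    "float64": "double",
    "bool": "boolean",
}


def schema_pairs(
    dtypes: Mapping[str, str],
    typemap: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical stand-in: map column dtype names to SQL types and
    format them as indented, comma-separated column definitions."""
    # Caller-supplied entries take precedence over the defaults.
    merged = {**DEFAULT_TYPEMAP, **(typemap or {})}
    lines = []
    for column, dtype in dtypes.items():
        if dtype not in merged:
            raise ValueError(
                f"no SQL type mapping for dtype {dtype!r} (column {column!r})"
            )
        lines.append(f"    {column} {merged[dtype]}")
    return ",\n".join(lines)


# Dtype names as reported for the DataFrame built without convert_dtypes().
dtypes = {"first_name": "object", "age_in_years": "int64"}
print(schema_pairs(dtypes, typemap={"object": "varchar"}))
```

In this sketch, omitting the `object` entry raises an error instead of silently guessing a column type, which is one reason an explicit override can be preferable for ambiguous dtypes.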



Patches may be contributed via pull requests to

All changes must pass the automated test suite, along with various static checks.

Black code style and isort import ordering are enforced.

Enabling automatic formatting via pre-commit is recommended:

pip install black isort pre-commit
pre-commit install

To ensure compliance with static check tools, developers may wish to run:

pip install black isort
# auto-sort imports
isort .
# auto-format code
black .

Code can then be tested using tox:

# run static checks and tests
tox
# run only tests
tox -e py3
# run only static checks
tox -e static
# run tests and produce a code coverage report
tox -e cov


To release a new version of this library, authorized developers should:

  • Prepare a signed release commit updating version in
  • Tag the commit using Semantic Versioning prepended with "v"
  • Push the tag


git commit -sm "Release v0.3.4"
# create an annotated tag; --follow-tags does not push lightweight tags
git tag -a v0.3.4 -m "Release v0.3.4"
git push --follow-tags

A GitHub workflow will then automatically release the version to PyPI.
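
Before pushing, it can help to check that a tag name follows the "v" plus Semantic Versioning convention described above. The following is a small sketch of such a check. `TAG_PATTERN` and `is_release_tag` are hypothetical names, the regular expression is a simplified reading of SemVer rather than the full official grammar, and whether the release workflow filters tags this way is not stated here.

```python
import re

# Simplified SemVer: MAJOR.MINOR.PATCH without leading zeros, plus optional
# pre-release ("-rc.1") and build metadata ("+build.5") suffixes.
TAG_PATTERN = re.compile(
    r"^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def is_release_tag(tag: str) -> bool:
    """Return True if the tag looks like 'v' followed by a semantic version."""
    return TAG_PATTERN.match(tag) is not None


for tag in ["v0.3.4", "0.3.4", "v0.3", "v1.0.0-rc.1"]:
    print(tag, is_release_tag(tag))
```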