This package contains classes that represent the core domain objects stored in the TranSMART platform, an open source data sharing and analytics platform for translational biomedical research.
It also provides a utility that writes such objects to tab-separated files that can be loaded into a TranSMART database using the transmart-copy tool.
To install transmart_loader, do:
pip install transmart-loader
or from sources:
git clone https://github.com/thehyve/python_transmart_loader.git
cd python_transmart_loader
pip install .
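After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical, and the import name transmart_loader is assumed from the package directory mentioned in this README.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # "transmart_loader" is the import name assumed from this README.
    print("transmart_loader installed:", is_installed("transmart_loader"))
```

If this reports False, the package was probably installed into a different Python environment than the one running the check.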
Define a TranSMART data collection, using the classes in transmart_loader/transmart.py, e.g.,
# Imports: the domain classes live in transmart_loader/transmart.py
from datetime import date
from transmart_loader.transmart import (
    Concept, NumericalValue, Observation, Patient, Study,
    TrialVisit, ValueType, Visit)
# Create the dimension elements
age_concept = Concept('test:age', 'Age', '\\Test\\age', ValueType.Numeric)
concepts = [age_concept]
studies = [Study('test', 'Test study')]
trial_visits = [TrialVisit(studies[0], 'Week 1', 'Week', 1)]
patients = [Patient('SUBJ0', 'male', [])]
visits = [Visit(patients[0], 'visit1', None, None, None, None, None, None, [])]
# Create the observations
observations = [
    Observation(patients[0], age_concept, visits[0], trial_visits[0],
                date(2019, 3, 28), None, NumericalValue(28))]
Create a hierarchical ontology for the concepts, e.g., to create the following structure:
└ Ontology
  └ Age
from transmart_loader.transmart import ConceptNode, TreeNode
# Create an ontology with one top node and a concept node
top_node = TreeNode('Ontology')
top_node.add_child(ConceptNode(concepts[0]))
ontology = [top_node]
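To illustrate how a parent node with one child corresponds to the two-level structure shown above, here is a small self-contained sketch. It deliberately does not use the package's TreeNode or ConceptNode classes: SketchNode and render are hypothetical stand-ins written only for this illustration, and the package's own classes may behave differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchNode:
    """Hypothetical stand-in for an ontology node; not the package's TreeNode."""
    name: str
    children: List["SketchNode"] = field(default_factory=list)

    def add_child(self, child: "SketchNode") -> None:
        self.children.append(child)


def render(node: SketchNode, depth: int = 0) -> List[str]:
    """Return one line per node, indented by depth, in depth-first order."""
    lines = ["  " * depth + "└ " + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


top = SketchNode("Ontology")
top.add_child(SketchNode("Age"))
print("\n".join(render(top)))
```

Deeper hierarchies follow the same pattern: each call to add_child attaches a node one level below its parent.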
Write the data collection to a format that can be loaded using transmart-copy:
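from tempfile import mkdtemp
# Module path for the writer is assumed; see examples/data_collection.py
from transmart_loader.copy_writer import TransmartCopyWriter
from transmart_loader.transmart import DataCollection
collection = DataCollection(concepts, [], [], studies,
                            trial_visits, visits, ontology, patients, observations)
# Write collection to a temporary directory
# The generated files can be loaded into TranSMART with transmart-copy.
output_dir = mkdtemp()
copy_writer = TransmartCopyWriter(output_dir)
copy_writer.write_collection(collection)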
Check examples/data_collection.py for a complete example.
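Because the writer targets a temporary directory, it may help to inspect what was written before handing the directory to transmart-copy. The sketch below uses only the standard library; list_generated_files is a hypothetical helper, and it makes no assumption about which file names the writer produces.

```python
import os
from typing import List


def list_generated_files(output_dir: str) -> List[str]:
    """Return the sorted relative paths of all files below output_dir."""
    found = []
    for root, _dirs, files in os.walk(output_dir):
        for name in files:
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, output_dir))
    return sorted(found)
```

Calling list_generated_files(output_dir) after write_collection would then return the relative paths of the generated files, which can be printed or compared against expectations in a test.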
Usage examples can be found in these projects:
- fhir2transmart: a tool that translates core HL7 FHIR resources to the TranSMART data model.
- claml2transmart: a tool that translates ontologies in ClaML format (e.g., ICD-10, available from DIMDI) to TranSMART ontologies.
- csr2transmart: a custom data transformation and loading pipeline for a Dutch center for pediatric oncology.
- transmart-hyper-dicer: a tool that reads a selection of data from a TranSMART instance using its REST API and loads it into another TranSMART instance.
Full documentation of the package is available at Read the Docs.
For a quick reference on software development, we refer to the software guide checklist.
This package is tested with Python versions >= 3.7.
This project uses pip for installing dependencies and package management.
- Dependencies should be added to setup.py in the install_requires list.
- Tests are in the tests folder.
- The tests folder contains:
  - A test that checks whether files for transmart-copy are generated for fake data (file: test_transmart_loader)
  - A test that checks whether your code conforms to the Python style guide (PEP 8) (file: test_lint.py)
- The testing framework used is PyTest.
- Tests can be run with python setup.py test
- Documentation should be put in the docs folder.
- To generate HTML documentation, run python setup.py build_sphinx
- Check your code style with prospector
- You may need to run pip install .[dev] first, to install the required dependencies.
Copyright (c) 2019 The Hyve B.V.
The TranSMART loader is licensed under the MIT License. See the file LICENSE.
This project was funded by the German Ministry of Education and Research (BMBF) as part of the project DIFUTURE - Data Integration for Future Medicine within the German Medical Informatics Initiative (grant no. 01ZZ1804D).
This package was created with Cookiecutter and the NLeSC/python-template.