Build pipeline for the Phenoscape Knowledgebase
The current version of the pipeline is to be replaced with a newer version built in accordance with the following aims:
- Modularize the pipeline
- Institute quality control measures
- Use generic (commoditized) tools (robot, dosdp, etc.)
- Loading and reasoning core Phenoscape data
- Auto-generating axioms
- Importing Model Organism data
- Semantic similarity
- Create an automated test suite for all parts of the pipeline
- Formal and machine-testable definitions of OWL/RDF model expectations
  - ShEx
  - SHACL
  - SPARQL queries
The Docker image phenoscape/pipeline-tools packages the software tools required to run the KB build pipeline.
run.sh pulls the image and launches the container with the specified command to build the pipeline.
To build the entire pipeline:
./run.sh make all
To build a specific component of the pipeline, for instance the semantic similarity scores:
./run.sh make ss-scores-gen
The pipeline can also be run on a Slurm cluster that has Singularity installed. To build the entire pipeline on a Slurm cluster:
sbatch run.sh make all
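The two ways of launching the build described above (Docker locally, Singularity on a cluster) can be illustrated with a small dry-run sketch. This is not the repository's run.sh: the function name `launch_cmd`, the `/work` mount point, and the exact container flags are assumptions made for illustration, and the function only prints the command it would run.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the actual run.sh from this repository.
IMAGE="phenoscape/pipeline-tools"   # image name from the README; no tag assumed

# Print the container command that would run the given arguments
# under the chosen runtime. The /work mount point is an assumption.
launch_cmd() {
  local runtime="$1"
  shift
  case "$runtime" in
    docker)
      echo "docker run --rm -v $PWD:/work -w /work $IMAGE $*" ;;
    singularity)
      echo "singularity exec docker://$IMAGE $*" ;;
    *)
      echo "unknown runtime: $runtime" >&2
      return 1 ;;
  esac
}

# Dry run: show what would be executed for each runtime.
launch_cmd docker make all
launch_cmd singularity make ss-scores-gen
```

Printing the command instead of executing it makes it easy to inspect what a wrapper would do before submitting a long-running job.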
The build workflow can be found here.
The build process involves:
- Importing/mirroring ontologies given in ontologies.ofn
- Downloading [data](https://github.com/phenoscape/phenoscape-data) annotated by curators using [Phenex](https://github.com/phenoscape/Phenex) (NeXML files)
- Downloading ontologies developed by the Monarch Initiative: MGI, ZFIN, HPOA
All these ontologies are merged into a single ontology, reasoned over to generate TBox and ABox axioms, and finally combined to form the Phenoscape-KB.
ontology-versions.ttl contains metadata about the ontologies used in a particular KB build.
The Makefile can be tested using placeholder programs via the test/test-makefile.sh script. The placeholder scripts are in the test/bin directory and create empty output files.
Running this script requires Node.js and GNU sed on your PATH. The requirements of the placeholder scripts can be installed like so:
npm install yargs
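The placeholder idea (a stand-in program that does no real work and only creates empty output files) can be illustrated in shell. The actual placeholders in this repository are Node.js scripts; the function `placeholder_tool` and the example file name below are hypothetical and serve only to show the principle.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a pipeline tool: it does no real work and
# creates an empty file for each requested output path.
placeholder_tool() {
  for out in "$@"; do
    mkdir -p "$(dirname "$out")"   # make sure the target directory exists
    : > "$out"                     # create (or truncate to) an empty file
  done
}

# Example: produce an empty output in a temporary directory, then clean up.
workdir="$(mktemp -d)"
placeholder_tool "$workdir/build/example-output.ttl"
ls -l "$workdir/build"
rm -rf "$workdir"
```

Because make only checks that targets exist and are up to date, empty outputs of this kind are enough to exercise the dependency graph of a Makefile without running the real tools.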
The Makefile test can be run as follows:
./tests/test-makefile.sh
If the exit status of the above script is 0, the tests succeeded. The script also prints the following message when everything has passed:
SUCCESS: Makefile tests passed.
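When the test is run from another script or a CI job, the exit status can be checked programmatically. The following is a minimal sketch: `check_tests` is a hypothetical helper, and `true` and `false` stand in for the test script so that the example runs anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: run a command and report based on its exit status.
# check_tests is a hypothetical helper, not part of the repository.
check_tests() {
  if "$@"; then
    echo "tests passed (exit status 0)"
  else
    status=$?   # exit status of the command that was just run
    echo "tests failed (exit status $status)" >&2
    return "$status"
  fi
}

# Stand-in commands; in practice one would pass the Makefile test script.
check_tests true
check_tests false || echo "failure detected"
```

Relying on the exit status rather than parsing the printed message keeps the check robust if the wording of the message changes.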