Fully connecting the Observational Health Data Science and Informatics (OHDSI) initiative with the world of linked open data
Leveraging Ananke, this utility converts Athena vocabulary files into one large Turtle file containing the vocabulary as an RDF graph.
This work was conceptualized and (mostly) carried out at the Biomedical Linked Annotation Hackathon 5 in Kashiwa, Japan.
We are very grateful for the support we received for this work.
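To make the conversion concrete, here is a minimal sketch of turning one vocabulary row into a Turtle statement. It is not the code used by the scripts in this repository: the `ohdsi:` namespace URI, the choice of `rdfs:label` as predicate, and the function names are all assumptions made for illustration. It also assumes the input is tab-delimited with `concept_id` and `concept_name` columns; adjust the delimiter if your files differ.

```python
import csv
import io

# Hypothetical namespace, for illustration only; the real utility may use other URIs.
PREFIXES = (
    "@prefix ohdsi: <http://example.org/ohdsi/concept/> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> ."
)


def escape_literal(text):
    """Escape characters that cannot appear raw in a short Turtle string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def concept_to_turtle(row):
    """Render one concept row (a dict) as a single Turtle statement."""
    return 'ohdsi:{} rdfs:label "{}" .'.format(
        row["concept_id"], escape_literal(row["concept_name"])
    )


def convert(handle):
    """Read tab-delimited rows from a file-like object and return Turtle text."""
    # QUOTE_NONE: treat quote characters in the data as ordinary text (an assumption).
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    lines = [PREFIXES]
    lines.extend(concept_to_turtle(row) for row in reader)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = io.StringIO("concept_id\tconcept_name\n1\tExample concept\n")
    print(convert(sample))
```

Escaping the label before writing it matters because concept names can contain quotation marks or backslashes, which would otherwise produce invalid Turtle.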
There are three versions of this utility: OHDSI2RDF_dict.py, OHDSI2RDF.py and OHDSI2RDF_mp.py. The first uses a dictionary, the second is single-threaded, and the third uses multiprocessing. The second one seems to be slower than the others, so try each version to see which works best for you.
Assumptions: The program assumes that the OHDSI vocabulary CSV files have been extracted into the folder from which you run the code, and that they keep their standard uppercase file names. It also assumes that the Ananke mappings are available in the standard CSV file provided.
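Before running a conversion, it can help to check that the expected files are present. The sketch below is one way to do that, not part of the utility itself. The file names in `EXPECTED_FILES` are assumptions based on typical Athena downloads; which files the scripts actually read is not stated here, so edit the list to match (and add the Ananke mapping file under whatever name you have it).

```python
import os

# Assumed file names, for illustration; adjust to the files the scripts actually read.
EXPECTED_FILES = ["CONCEPT.csv", "CONCEPT_RELATIONSHIP.csv"]


def missing_files(folder=".", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in the folder."""
    return [
        name
        for name in expected
        if not os.path.isfile(os.path.join(folder, name))
    ]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files: " + ", ".join(missing))
    else:
        print("All expected files found.")
```

On case-sensitive file systems the check will fail for names like concept.csv, which matches the uppercase naming assumption above.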
How to run
python OHDSI2RDF_dict.py >> OHDSI2RDF.ttl
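The program writes its output to the screen (standard output), so be sure to redirect it to a file as shown above. Note that >> appends to an existing file, so remove any previous OHDSI2RDF.ttl before re-running, or use > to overwrite it.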
- Add ancestor and synonym relationships
- Improve CUI assignment from the Ananke source