Medical Subject Headings (MeSH) is a controlled vocabulary produced by the U.S. National Library of Medicine (NLM) for cataloging biomedical information. The resource is structured as an ontology and is used to annotate PubMed/MEDLINE records. Here we provide user-friendly datasets derived from MeSH. Currently, two record types are processed: Descriptors and Supplementary Concept Records.
descriptors.ipynb: processes Descriptors (also known as Main Headings)
supplementary-concept-records.ipynb: processes Supplementary Concept Records (SCRs)
The data directory contains the created datasets:
terms.tsv: table of Descriptor terms.
descriptor-terms.tsv: table of Descriptor names.
mesh.json: a JSON-formatted representation of the Descriptor ontology. Includes term identifiers, names, semantic types, parents, and tree numbers.
ontology.gexf.gz: a GEXF representation of the Descriptor ontology that is compatible with networkx.
symptoms.tsv: symptom Descriptors (the 438 descendants of D012816).
tree-numbers.tsv: table of tree numbers for each Descriptor. A tree number represents a path to the root. This file is handy for mapping to external resources, which occasionally identify MeSH Descriptors by their tree numbers (a bad but prevalent practice).
supplemental-records.tsv: table of SCR terms.
supplemental-terms.tsv: table of SCR names.
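
To illustrate how a tree number encodes a path to the root, here is a minimal sketch in plain Python. It assumes the usual dot-separated MeSH tree number format. The helper names `tree_number_ancestors` and `tree_number_depth` and the example value are hypothetical and are not taken from the notebooks or the data files.

```python
def tree_number_ancestors(tree_number):
    """Return the ancestor tree numbers, ordered from the top level down to the parent.

    Assumes segments are separated by dots, so each proper prefix of the
    segment list is the tree number of one ancestor.
    """
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def tree_number_depth(tree_number):
    """Return the number of segments in the tree number."""
    return tree_number.count(".") + 1


# Illustrative value only; real values would come from tree-numbers.tsv.
example = "C23.888.592.612"
print(tree_number_ancestors(example))  # ['C23', 'C23.888', 'C23.888.592']
print(tree_number_depth(example))      # 4
```

Because a Descriptor can sit at several places in the hierarchy, it can have more than one tree number. That is one reason stable Descriptor identifiers such as D012816 are usually a safer join key than tree numbers.
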
This repository is released under CC0.