RadText is a high-performance Python Radiology Text Analysis System.
- Python >= 3.6, <3.9
- Linux
- Java
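The prerequisites above can be checked before installing. The following is a minimal sketch using only the standard library; the function names `python_version_supported` and `java_available` are hypothetical helpers and not part of RadText, and the version bounds are copied from the list above.

```python
import shutil
import sys

# Supported range taken from the prerequisites list: Python >= 3.6, < 3.9.
MIN_VERSION = (3, 6)
MAX_VERSION_EXCLUSIVE = (3, 9)


def python_version_supported(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION_EXCLUSIVE


def java_available():
    """Return True if a `java` executable is found on PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    print("Java found on PATH:", java_available())
```

The Java check only looks for an executable named `java` on `PATH`; it does not verify the Java version.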
# Set up environment
$ sudo apt-get install python3-dev build-essential default-jdk
The latest RadText releases are available on PyPI.
RadText releases can be installed with pip as source packages or binary wheels. We recommend installing packages in a virtual environment to avoid modifying system state:
$ python -m venv venv
$ source venv/bin/activate
$ pip install -U pip setuptools wheel
$ pip install -U radtext
$ python -m spacy download en_core_web_sm
$ radtext-download --all
To see RadText’s pipeline in action, launch the Python interactive interpreter and try the following commands:
import bioc
import radtext

# Build the default RadText pipeline
nlp = radtext.Pipeline()

# Load a BioC XML file and run the pipeline on it
with open('/PATH/TO/BIOC_FILE.xml') as fp:
    doc = bioc.load(fp)
annotations = nlp(doc)
print(annotations)
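To get a feel for the BioC XML files used as input and output, the sketch below counts passages and annotations per document using only the standard library. It assumes the standard BioC XML layout (`collection` > `document` > `passage` > `annotation`); the function name `summarize_bioc` is hypothetical, and the sample report is made up for illustration rather than produced by RadText.

```python
import xml.etree.ElementTree as ET


def summarize_bioc(xml_text):
    """Return a list of (document id, passage count, annotation count)."""
    root = ET.fromstring(xml_text)
    summary = []
    for document in root.iter("document"):
        doc_id = document.findtext("id", default="")
        passages = document.findall("passage")
        # Annotations may sit directly under a passage or under its sentences,
        # so search the whole document subtree.
        annotations = document.findall(".//annotation")
        summary.append((doc_id, len(passages), len(annotations)))
    return summary


# Made-up sample in BioC XML layout, for illustration only.
SAMPLE = """<collection>
  <source>example</source>
  <document>
    <id>report-1</id>
    <passage>
      <offset>0</offset>
      <text>No acute cardiopulmonary abnormality.</text>
      <annotation id="T1">
        <location offset="25" length="11"/>
        <text>abnormality</text>
      </annotation>
    </passage>
  </document>
</collection>"""

print(summarize_bioc(SAMPLE))
```

For real files, read the file contents and pass them to the function, or use the `bioc` package as in the interpreter session above.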
RadText also provides command-line interfaces for specific NLP tasks (e.g., de-identification, sentence splitting, or named entity recognition).
$ radtext-deid --repl=X -i /path/to/input.xml -o /path/to/output.xml
$ radtext-ssplit -i /path/to/input.xml -o /path/to/output.xml
$ radtext-ner spacy --radlex /path/to/Radlex4.1.xlsx -i /path/to/input.xml -o /path/to/output.xml
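When many reports need processing, the commands above can be driven from a script. The following is a minimal sketch that applies the sentence-splitting command to every `.xml` file in a directory; it assumes only the `-i` and `-o` options shown above, and the helper names `build_ssplit_commands` and `run_all` are hypothetical, not part of RadText.

```python
import subprocess
from pathlib import Path


def build_ssplit_commands(input_dir, output_dir):
    """Build one `radtext-ssplit` command per .xml file in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    commands = []
    for src in sorted(input_dir.glob("*.xml")):
        # Keep the same file name in the output directory.
        dst = output_dir / src.name
        commands.append(["radtext-ssplit", "-i", str(src), "-o", str(dst)])
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Building the command lists separately from running them makes it easy to print and review them first. `run_all` requires RadText to be installed and the output directory to exist.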
You will find complete documentation at our Read the Docs site.
You can find information about contributing to RadText at our Contribution page.
This work is supported by the National Library of Medicine under Award No. 4R00LM013001 and the NIH Intramural Research Program, National Library of Medicine.
You can find Acknowledgment information at our Acknowledgment page.
Copyright BioNLP Lab at Weill Cornell Medicine, 2022.
Distributed under the terms of the MIT license, RadText is free and open source software.