I am the director of Fast Data Science Ltd, a natural language processing company, which can be found at https://fastdatascience.com.
I am passionate about using data science and natural language processing to solve real-world problems. I have an MPhil in Computer Speech, Text and Internet Technology from the University of Cambridge, and I have worked on a variety of projects in the healthcare, pharmaceutical, financial services, and telecommunications industries.
I am also an active member of the open source community and have contributed to a number of projects, including the Harmony project and the Clinical Trial Risk Tool, which was funded by the Gates Foundation and won the Plotly Dash Apps challenge in 2023.
I am always looking for new challenges, and I am excited to see what the future holds for data science and natural language processing.
You can contact me at https://fastdatascience.com/contact.
- Fast Data Science website
- Fast Data Science blog
- Harmony project (repo)
- Clinical Trial Risk Tool (repo)
- Cantab profile
Localspelling (GitHub repo) - a library for localising spelling between US and UK variants.
Install from the command line with pip install localspelling
country_named_entity_recognition (GitHub repo) - a lightweight Python library for recognising country names in unstructured text and returning pycountry objects.
Install from the command line with pip install country_named_entity_recognition
drug_named_entity_recognition (GitHub repo) - a lightweight Python library for recognising drug names in unstructured text.
Install from the command line with pip install drug-named-entity-recognition
Fast Stylometry (GitHub repo) - a Python library for forensic stylometry. A tutorial is available.
Install from the command line with pip install faststylometry
I regularly post on Fast Data Science's blog.
Popular posts include: