Description

This script converts ann data into .spacy data for training NER models. It can also apply preprocessing steps, such as removing stopwords, that may improve training results. If a preprocessing step changes the character indices of the entities, the script can detect the entities' new indices again.
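To illustrate the re-detection of entity indices, here is a minimal sketch. It is not the code from Converter.py or Preprocess.py: the function names `preprocess` and `relocate_entities`, the tiny stopword set, and the sample sentence are all assumptions made for this example. The idea is to cut each entity's text out of the original document, run it through the same preprocessing, and search for it in the processed text.

```python
def preprocess(text):
    """Hypothetical preprocessing step: drop a few German stopwords."""
    stopwords = {"der", "die", "das", "und"}  # illustrative, not a real stopword list
    kept = [word for word in text.split() if word.lower() not in stopwords]
    return " ".join(kept)


def relocate_entities(original, entities, processed):
    """Find each entity again in the processed text.

    entities: list of (start, end, label) character offsets into `original`.
    Returns (start, end, label) offsets into `processed`. Entities whose
    text no longer occurs after preprocessing are skipped.
    """
    relocated = []
    cursor = 0  # search left to right so repeated words map to the right mention
    for start, end, label in sorted(entities):
        surface = preprocess(original[start:end])
        if not surface:
            continue
        pos = processed.find(surface, cursor)
        if pos == -1:
            continue
        relocated.append((pos, pos + len(surface), label))
        cursor = pos + len(surface)
    return relocated


original = "Der Patient hat Fieber und starke Kopfschmerzen"
entities = [(16, 22, "SYMPTOM"), (34, 47, "SYMPTOM")]

processed = preprocess(original)
new_entities = relocate_entities(original, entities, processed)

print(processed)     # Patient hat Fieber starke Kopfschmerzen
print(new_entities)  # [(12, 18, 'SYMPTOM'), (26, 39, 'SYMPTOM')]
```

The plain substring search can also match inside a longer word, so a real implementation would probably want to check token boundaries. Tuples of this shape can then be turned into spans with spaCy's `Doc.char_span` and written to a .spacy file with `DocBin.to_disk`.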

Background information

I wrote the script for a university text mining course. The goal was to train a model that detects specific entities in German medical documents. For my use case, removing special characters and converting umlauts in the data were the most effective preprocessing methods.
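The two preprocessing methods mentioned above could look roughly like this. This is a sketch under assumptions, not the code from Preprocess.py: the names `convert_umlauts` and `remove_special_characters`, the transliteration table, and the choice to keep only ASCII letters, digits and whitespace are all made up for illustration.

```python
import re

# Assumed transliteration table; the actual script may map characters differently.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def convert_umlauts(text):
    """Replace German umlauts and ß with ASCII digraphs."""
    return text.translate(UMLAUT_MAP)


def remove_special_characters(text):
    """Replace everything except ASCII letters, digits and whitespace with a space."""
    cleaned = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # Collapse the runs of whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", cleaned).strip()


sample = "Übelkeit & Schwindel (seit 3 Tagen), Müdigkeit!"
# Convert umlauts first; otherwise the ASCII-only filter would delete them.
result = remove_special_characters(convert_umlauts(sample))
print(result)  # Uebelkeit Schwindel seit 3 Tagen Muedigkeit
```

Both steps change the length of the text, so entity indices taken from the original annotations have to be detected again afterwards.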

Folders and files

Converter.py
Preprocess.py
README.md
base_config.cfg
config.cfg

YerStev/ann-spacy-converter