A sentence splitter wrapper for CoreNLP

About

This wrapper returns the sentence splitting result from the CoreNLP toolkit as untokenized text.
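To illustrate what "untokenized" means here, the following is a minimal sketch and not the code in this repository. It assumes a CoreNLP-style JSON annotation in which every token carries `characterOffsetBegin` and `characterOffsetEnd`, and it slices each sentence out of the original text so that the original spacing and punctuation are kept. The function name `untokenized_sentences` and the hand-built `annotation` dictionary are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List


def untokenized_sentences(text: str, annotation: Dict[str, Any]) -> List[str]:
    """Slice each sentence out of the original text using token offsets."""
    sentences = []
    for sentence in annotation.get("sentences", []):
        tokens = sentence.get("tokens", [])
        if not tokens:
            continue  # skip sentences without tokens
        start = tokens[0]["characterOffsetBegin"]   # offset of the first token
        end = tokens[-1]["characterOffsetEnd"]      # end offset of the last token
        sentences.append(text[start:end])
    return sentences


text = "Dr. Smith arrived (late). He didn't stay."

# Hand-built stand-in for an annotation (an assumption for this sketch).
# Only the first and last token of each sentence are listed, because
# those are the only offsets the function reads.
annotation = {
    "sentences": [
        {"tokens": [
            {"word": "Dr.", "characterOffsetBegin": 0, "characterOffsetEnd": 3},
            {"word": ".", "characterOffsetBegin": 24, "characterOffsetEnd": 25},
        ]},
        {"tokens": [
            {"word": "He", "characterOffsetBegin": 26, "characterOffsetEnd": 28},
            {"word": ".", "characterOffsetBegin": 40, "characterOffsetEnd": 41},
        ]},
    ]
}

for sentence in untokenized_sentences(text, annotation):
    print(sentence)
```

Slicing by offsets, instead of joining tokens with spaces, avoids having to undo the spaces that tokenization places around punctuation and clitics.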

Before starting

Please download CoreNLP and unzip everything into the stanford-corenlp-4.2.0 folder.

If you want to work in Arabic, please also download the Arabic package and put it into the stanford-corenlp-4.2.0 folder.

The latest versions of the CoreNLP package and the Arabic package can be found on the official CoreNLP website.

Usage

python sentence_splitter_wrapper_for_CoreNLP_En.py

python sentence_splitter_wrapper_for_CoreNLP_Ar.py

Each file contains an example sentence. The code will print out the splitting results.

Update history

This sentence splitter has gone through a few changes.

Danqi Chen wrote the original Python wrapper for the tokenization function in CoreNLP.
Chao Jiang modified the code so that the sentence splitter produces split sentences as untokenized text.
Wuwei Lan modified the code to make it work with the Arabic language.

Acknowledgment

This material is based in part on research sponsored by IARPA via the BETTER program (2019-19051600004).
