The Apache OpenNLP library provides binary models for processing natural language text. This repository is intended for the distribution of model files as Maven artifacts.
For additional information, visit the OpenNLP Home Page.
You can use OpenNLP with many languages. Additional demo models are provided here.
The models are fully compatible with the latest OpenNLP release. They can be used for testing or getting started.
Note: Please train your own models for all other, specialized use cases.
Documentation, including JavaDocs, code usage and command-line interface examples, is available here.
You can also follow our mailing lists for news and updates.
We provide Tokenizer, Sentence Detector and Part-of-Speech Tagger models for the following 23 languages:
- Bulgarian
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Finnish
- French
- German
- Italian
- Latvian
- Norwegian
- Polish
- Portuguese
- Romanian
- Russian
- Serbian
- Slovak
- Slovenian
- Spanish
- Swedish
- Ukrainian
These models are compatible with OpenNLP >= 1.0.0. Further details are available at the OpenNLP Models page and in the CHANGELOG.
In addition, we provide a Language Detector, which is able to detect 103 languages, identified by their ISO 639-3 codes. It works best with longer texts that contain at least two sentences in the same language.
It is compatible with OpenNLP >= 1.8.3. Model details are available here.
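As a rough illustration of how the Language Detector model might be used from Java, here is a minimal sketch. It assumes that the OpenNLP tools library is on the classpath and that the model binary is available as a local file whose path is passed as the first command-line argument. The class name `LangDetectExample`, the helper methods and the sample sentence are hypothetical and not part of this repository.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

import opennlp.tools.langdetect.Language;
import opennlp.tools.langdetect.LanguageDetector;
import opennlp.tools.langdetect.LanguageDetectorME;
import opennlp.tools.langdetect.LanguageDetectorModel;

// Hypothetical example class; not part of the OpenNLP models repository.
public class LangDetectExample {

    // Formats a prediction as "<language code> (<confidence>)".
    static String describe(Language language) {
        return String.format(Locale.ROOT, "%s (%.2f)",
                language.getLang(), language.getConfidence());
    }

    // Loads the model from a local file (assumed location) and
    // returns the most likely language for the given text.
    static Language detect(Path modelFile, String text) throws IOException {
        try (InputStream in = Files.newInputStream(modelFile)) {
            LanguageDetectorModel model = new LanguageDetectorModel(in);
            LanguageDetector detector = new LanguageDetectorME(model);
            return detector.predictLanguage(text);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: LangDetectExample <model-file> [text]");
            return;
        }
        // Longer inputs of two or more sentences tend to give better results.
        String text = args.length > 1
                ? args[1]
                : "This is a short sample. It has two sentences in one language.";
        System.out.println(describe(detect(Path.of(args[0]), text)));
    }
}
```

Passing the model path as an argument keeps the sketch independent of how the model file is packaged inside the Maven artifact, which is not covered here.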
You can import a model artifact directly via Maven, SBT or Gradle, for instance:
Maven:
<dependency>
  <groupId>org.apache.opennlp</groupId>
  <artifactId>opennlp-models-langdetect</artifactId>
  <version>${opennlp.models.version}</version>
</dependency>
SBT:
libraryDependencies += "org.apache.opennlp" % "opennlp-models-langdetect" % "${opennlp.models.version}"
Gradle:
implementation group: "org.apache.opennlp", name: "opennlp-models-langdetect", version: "${opennlp.models.version}"
For more details, please check our documentation.
When adding a new model, make sure to list it in the expected-models.txt file located in opennlp-models-test.
The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project. Every contribution is welcome and needed to make it better. A contribution can be anything from a small documentation typo fix to a new component.
If you would like to get involved, please follow the instructions here.