Natural Language Processing in Go
This is a partial port of Freeling 3.1 (http://nlp.lsi.upc.edu/freeling/).
It is licensed under the GPL, in keeping with Freeling's own license model.
This is the list of features already implemented:
- Text tokenization
- Sentence splitting
- Morphological analysis
- Suffix treatment, retokenization of clitic pronouns
- Flexible multiword recognition
- Contraction splitting
- Probabilistic prediction of unknown word categories
- Named entity detection
- PoS tagging
- Chart-based shallow parsing
- Named entity classification (with the external library MITIE: https://github.com/mit-nlp/MITIE)
- Rule-based dependency parsing
How to use it:
    go build gofreeling.go
    ./gofreeling
The HTTP server listens on port 9999 by default; the port can be changed in the conf/gofreeling.toml file.
To process a page:
HTTP GET: http://localhost:9999/analyzer?url=<URL of the page to analyze>
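Because the page address is itself passed as a query parameter, it should be percent-encoded before it is appended to the analyzer URL. The following is a minimal sketch of building such a request URL in Go; the helper name `buildAnalyzerURL` and the example page address are hypothetical and not part of this project.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildAnalyzerURL (hypothetical helper) returns the GET URL for the
// /analyzer endpoint, with the target page percent-encoded in the
// "url" query parameter.
func buildAnalyzerURL(base, page string) string {
	q := url.Values{}
	q.Set("url", page)
	return base + "/analyzer?" + q.Encode()
}

func main() {
	// The page address below is only a placeholder.
	fmt.Println(buildAnalyzerURL("http://localhost:9999", "https://example.com/a?b=1"))
}
```

The resulting string can be passed to `http.Get` or pasted into a browser while the server is running.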
Or use it as an API endpoint:
HTTP POST: http://localhost:9999/analyzer-api { content: 'Text you want to analyze' }
The response is self-explanatory JSON.
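A client for the API endpoint might look like the sketch below. It assumes the server accepts the body as a JSON object with a single `content` field and replies with status 200 on success; neither detail is confirmed here, so adjust the encoding if the server expects something else. The names `analyzeRequest`, `buildPayload` and `postText` are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// analyzeRequest mirrors the { content: ... } body described above.
// Assumption: the server expects it encoded as JSON.
type analyzeRequest struct {
	Content string `json:"content"`
}

// buildPayload encodes the text to analyze as a JSON request body.
func buildPayload(text string) ([]byte, error) {
	return json.Marshal(analyzeRequest{Content: text})
}

// postText sends the text to the given endpoint and returns the raw
// response body without interpreting it.
func postText(client *http.Client, endpoint, text string) ([]byte, error) {
	payload, err := buildPayload(text)
	if err != nil {
		return nil, err
	}
	resp, err := client.Post(endpoint, "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	client := &http.Client{Timeout: 30 * time.Second}
	body, err := postText(client, "http://localhost:9999/analyzer-api", "Hello World")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(string(body))
}
```

The response is printed as raw bytes because its schema is not documented here; it can be decoded with `json.Unmarshal` once the fields of interest are known.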
Example usage as a package:
    package main

    import (
        "encoding/json"
        "fmt"

        . "./lib"
        . "./models"
    )

    func main() {
        // Wrap the text to analyze in a DocumentEntity.
        document := new(DocumentEntity)
        document.Content = "Hello World"

        // Run the analysis.
        analyzer := NewAnalyzer()
        output := analyzer.AnalyzeText(document)

        // Serialize the result to JSON and print it.
        js := output.ToJSON()
        b, err := json.Marshal(js)
        if err != nil {
            panic(err)
        }
        fmt.Println(string(b))
    }
TODO:
- clean code
- add comments
- add tests
- implement WordNet-based sense annotation and disambiguation
The linguistic data needed to run the server can be downloaded here (English only):
https://www.dropbox.com/s/fwwvfxp2s7dydet/data.zip
WordNet Database to add annotation (place it inside the ./data folder)