Web API for Java-based text extractors, implemented using the Play framework.
Tomaž Kovačič <tomaz.kovacic@gmail.com>
- Boilerpipe
- Goose (using my fork)
Note: All parameters must be sent in the request body, encoded as application/x-www-form-urlencoded.
method: POST
endpoint: http://yourdomain/boilerpipe/extract/
params:
extractorType: (article|default|sentence)
rawHtml: HTML content
JSON response format:
{ "result": RESULT_TEXT, "status": (OK|ERROR), "errorMsg": ERROR_MESSAGE (optional) }
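A minimal client sketch for the Boilerpipe endpoint, assuming Java 11 or newer for `java.net.http`. The class name `BoilerpipeClient` and its helper methods are hypothetical and not part of this project; `yourdomain` is the placeholder host from the endpoint above. It shows how the two parameters are form-encoded and posted.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client, not shipped with this project.
class BoilerpipeClient {
    // Placeholder host from the endpoint description; replace with your deployment.
    static final String ENDPOINT = "http://yourdomain/boilerpipe/extract/";

    // Builds the x-www-form-urlencoded body: extractorType=...&rawHtml=...
    static String formBody(String extractorType, String rawHtml) {
        // Only the three documented extractor types are accepted here.
        if (!extractorType.matches("article|default|sentence")) {
            throw new IllegalArgumentException("Unsupported extractorType: " + extractorType);
        }
        return "extractorType=" + URLEncoder.encode(extractorType, StandardCharsets.UTF_8)
             + "&rawHtml=" + URLEncoder.encode(rawHtml, StandardCharsets.UTF_8);
    }

    // Creates the POST request with the form content type set explicitly.
    static HttpRequest buildRequest(String extractorType, String rawHtml) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(extractorType, rawHtml)))
            .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest("article",
            "<html><body><p>Hello world</p></body></html>");
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        // Prints the raw JSON response described above.
        System.out.println(response.body());
    }
}
```

Validating `extractorType` on the client side is a design choice of this sketch; the server may well perform its own check and report failures through `status` and `errorMsg`.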
method: POST
endpoint: http://yourdomain/goose/extract/
params:
rawHtml: HTML content
JSON response format:
{ "result": RESULT_TEXT, "status": (OK|ERROR), "errorMsg": ERROR_MESSAGE (optional) }
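Both endpoints return the same response shape, so a caller will usually want to check `status` before using `result`. The sketch below is one way to do that without a JSON library. It assumes the status value is serialized as a quoted string (`"OK"` or `"ERROR"`), which the format description above does not state explicitly. The class `ExtractionStatus` and the sample responses are hypothetical. For reading `result` or `errorMsg`, a real JSON parser is the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not shipped with this project.
class ExtractionStatus {
    // Looks for a "status" field whose value is the quoted string OK or ERROR.
    // This is a shortcut, not a JSON parser.
    private static final Pattern STATUS =
        Pattern.compile("\"status\"\\s*:\\s*\"(OK|ERROR)\"");

    // Returns true for OK, false for ERROR; throws if no status field is found.
    static boolean isOk(String json) {
        Matcher m = STATUS.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("No status field in response");
        }
        return m.group(1).equals("OK");
    }

    public static void main(String[] args) {
        // Made-up sample responses following the documented format.
        String ok = "{\"result\": \"Hello world\", \"status\": \"OK\"}";
        String error = "{\"result\": \"\", \"status\": \"ERROR\", \"errorMsg\": \"parse failure\"}";
        System.out.println(isOk(ok));    // prints true
        System.out.println(isOk(error)); // prints false
    }
}
```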
- Play framework v1.1.1.
Everything that's not in the /lib/ directory is licensed under GPLv3.
Jar packages in the /lib/ directory are licensed under their respective licences, listed below:
- Boilerpipe - Apache Licence 2.0
- NekoHTML - Apache Licence 2.0
- Xerces - Apache Licence 2.0
- Goose - (no licence provided)
Copyright (C) Tomaž Kovačič
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.