Discourse Parsers Details
bsharpataz edited this page Nov 27, 2017 · 7 revisions
Note: this page is of limited use to regular users, but it may be useful to developers interested in modifying (i.e., retraining) the discourse parsers.
To regenerate the constituent-syntax model, run this command:
sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -train /data/nlp/corpora/RST_cached_preprocessing/rst_train -model model.const.rst.gz'
This generates the model file, model.const.rst.gz, in the current directory. To evaluate it, move model.const.rst.gz to main/src/main/resources/ and run:
sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -test /data/nlp/corpora/RST_cached_preprocessing/rst_test -model model.const.rst.gz'
To regenerate the dependency-syntax model, run this command:
sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -train /data/nlp/corpora/RST_cached_preprocessing/rst_train -model model.dep.rst.gz -dep'
Similarly, this command generates the model that uses only dependency information, model.dep.rst.gz, in the current directory. To evaluate it, move model.dep.rst.gz to main/src/main/resources/ and run:
sbt 'run-main org.clulab.discourse.rstparser.RSTParserMain -test /data/nlp/corpora/RST_cached_preprocessing/rst_test -model model.dep.rst.gz -dep'
Note that the dependency-based model is both faster and more accurate than the constituent-based one.
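The four commands on this page differ only in the mode (-train or -test), the model file name, and the optional -dep flag. The sketch below is a hypothetical helper, not part of the repository: the function name rst_args and the variables CORPUS_DIR and MAIN_CLASS are made up for illustration. It only prints the sbt invocations, so nothing is trained or evaluated, and the step of moving the model file into main/src/main/resources/ still has to be done by hand between training and testing.

```shell
#!/bin/sh
# Hypothetical helper: rebuilds the sbt invocations quoted on this page.
# CORPUS_DIR is the corpus path used above; edit it for your own setup.
CORPUS_DIR="/data/nlp/corpora/RST_cached_preprocessing"
MAIN_CLASS="org.clulab.discourse.rstparser.RSTParserMain"

# rst_args MODE SYNTAX -> prints the argument string handed to sbt
#   MODE:   train | test
#   SYNTAX: const | dep
rst_args() {
  mode="$1"
  syntax="$2"
  args="run-main $MAIN_CLASS -$mode $CORPUS_DIR/rst_$mode -model model.$syntax.rst.gz"
  # Only the dependency-syntax model takes the extra -dep flag.
  if [ "$syntax" = "dep" ]; then
    args="$args -dep"
  fi
  printf '%s\n' "$args"
}

# Print all four commands for review; nothing is executed here.
for syntax in const dep; do
  for mode in train test; do
    printf "sbt '%s'\n" "$(rst_args "$mode" "$syntax")"
  done
done
```

Printing the commands first makes it easy to check the paths before starting a long training run; each printed line can then be pasted into a shell at the root of the project.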