rzanoli edited this page Mar 20, 2014 · 16 revisions

EOP is being developed as a reference implementation of algorithms for textual entailment. Extending its functionality with new algorithms or fixing bugs could be the subject of an undergraduate or graduate thesis, or of a class project. We have identified some areas and specific projects that you might be interested in working on.

Code Development

Implementation of new algorithms for Textual Entailment

This project aims to study and evaluate a new algorithm for Textual Entailment that models the Entailment Relations (i.e., Entailment, Not-Entailment) as a classification problem. First, each text (T) is mapped onto its hypothesis (H) by the sequence of edit operations (i.e., insertion, deletion, substitution of text portions) needed to transform T into H, where each edit operation has an associated cost. Then, unlike algorithms that use these operations to calculate a threshold value that best separates the Entailment relations from the Not-Entailment ones, the proposed algorithm uses the calculated operations as a feature set for a supervised learning classifier that classifies the relation between T and H. The system will be evaluated on different data sets, including the SemEval-2014 data set, and, if the results are encouraging, it will be integrated into the EXCITEMENT Open Platform (EOP) for Textual Entailment.
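As a rough illustration of the feature-extraction step described above, the following minimal Python sketch computes a token-level edit distance between T and H and counts the insertions, deletions and substitutions on one minimum-cost path. It is not EOP code: the function name edit_operation_features, the whitespace tokenization, the lowercasing and the unit costs are all assumptions made for this example, and the resulting dictionary is only one possible way to turn edit operations into features for a classifier.

```python
def edit_operation_features(text, hypothesis,
                            ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Count the edit operations that transform T into H (hypothetical helper).

    Tokens are lowercased, whitespace-separated words; the costs are
    illustrative defaults, not values taken from EOP.
    """
    t = text.lower().split()
    h = hypothesis.lower().split()
    n, m = len(t), len(h)

    # dist[i][j] = minimum cost of transforming t[:i] into h[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0.0 if t[i - 1] == h[j - 1] else sub_cost
            dist[i][j] = min(dist[i - 1][j - 1] + step,   # match or substitution
                             dist[i - 1][j] + del_cost,   # delete a token of T
                             dist[i][j - 1] + ins_cost)   # insert a token of H

    # Walk back along one minimum-cost path and count each operation type.
    counts = {"insertions": 0, "deletions": 0, "substitutions": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0.0 if t[i - 1] == h[j - 1] else sub_cost
            if dist[i][j] == dist[i - 1][j - 1] + step:
                if t[i - 1] != h[j - 1]:
                    counts["substitutions"] += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + del_cost:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1

    counts["cost"] = dist[n][m]
    return counts
```

For example, edit_operation_features("a man is sleeping", "a woman is sleeping") returns {'insertions': 0, 'deletions': 0, 'substitutions': 1, 'cost': 1.0}. In the proposed approach, vectors of this kind, one per T-H pair and labelled Entailment or Not-Entailment, would be the training input for the classifier rather than being reduced to a single threshold on the total cost.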

Software quality evaluation

Ensuring that software has an adequate level of quality is one of the most important aspects of software engineering. This project involves studying and configuring tools that can be run automatically during development to check the quality of the software produced. EOP uses three quality assurance tools that have to be configured according to the EOP specification: Checkstyle, PMD and FindBugs. In addition, the Cobertura tool should be adopted to help discover where the code lacks test coverage.

Code Distribution

Evaluation

Evaluating EOP on RTE datasets

This project aims to evaluate configurations of the EOP on well-known datasets developed for the Recognizing Textual Entailment (RTE) evaluations. Specifically, the EXCITEMENT consortium would like to evaluate three entailment algorithms available in the EOP (i.e., EDITS, TIE and BIUTEE) on the following English datasets: RTE-1, RTE-2, RTE-4 and RTE-5.

Documentation
