AdArte
AdArte (A Transformation-Driven Approach for Recognizing Textual Entailment) models entailment as a classification problem. Each text-hypothesis (T-H) pair is first represented by the sequence of edit operations (deleting, replacing and inserting pieces of text), called transformations, needed to transform T into H. These transformations are then used as features for a supervised classifier, which labels each pair as a positive or negative example.
The transformations are calculated by applying tree edit distance (Tai, 1979) to the dependency trees of the T-H pairs. Background knowledge resources such as WordNet, VerbOcean and CatVar are used to recognize cases where T and H use different textual expressions (e.g., girl vs young_woman, spray vs spraying) while preserving entailment.
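To make the idea of transformations concrete, here is a deliberately simplified sketch. It does not implement tree edit distance on dependency trees; it substitutes a flat token-sequence diff from Python's standard `difflib` module, and the `EQUIVALENTS` table is a hypothetical stand-in for background knowledge such as WordNet. The function and variable names are illustrative assumptions, not part of AdArte.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for background knowledge (e.g. WordNet, CatVar):
# maps a word to a canonical form so that known equivalents compare equal.
EQUIVALENTS = {"young_woman": "girl", "spraying": "spray"}


def normalize(token):
    """Return the canonical form of a token, or the token itself."""
    return EQUIVALENTS.get(token, token)


def transformations(t_tokens, h_tokens):
    """List the edit operations that turn t_tokens into h_tokens.

    Simplification: works on flat token lists, not dependency trees.
    """
    t_norm = [normalize(tok) for tok in t_tokens]
    h_norm = [normalize(tok) for tok in h_tokens]
    matcher = SequenceMatcher(a=t_norm, b=h_norm, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            ops += [("delete", tok) for tok in t_tokens[i1:i2]]
        elif tag == "insert":
            ops += [("insert", tok) for tok in h_tokens[j1:j2]]
        elif tag == "replace":
            ops.append(
                ("replace", tuple(t_tokens[i1:i2]), tuple(h_tokens[j1:j2]))
            )
        # "equal" spans need no transformation and are skipped.
    return ops


t = ["the", "young_woman", "is", "spraying", "a", "plant"]
h = ["the", "girl", "is", "spray", "plant"]
print(transformations(t, h))
```

Because `young_woman` and `spraying` are normalized to `girl` and `spray` before matching, the only transformation left for this pair is the deletion of `a`. Without the lookup table, both word pairs would show up as replacements.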
The transformations are used as features for classifying the T-H pairs. For this we adopt Weka (Hall et al., 2009), a collection of machine learning algorithms that makes it possible to try different classifiers such as Random Forest and Support Vector Machines (SVM).
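As an illustration of how transformations could be handed to Weka, the sketch below writes a set of labelled pairs to Weka's ARFF text format, using one binary attribute per distinct transformation. The binary encoding, the feature strings (such as `delete:a`), the label names and the `to_arff` helper are all assumptions made for this example; the source does not specify how AdArte encodes its features.

```python
def to_arff(pairs, relation="th_pairs"):
    """Serialize labelled T-H pairs as an ARFF document.

    pairs: list of (set of transformation strings, class label).
    Assumes feature strings contain no single quotes.
    """
    # One binary attribute per distinct transformation, in sorted order.
    vocab = sorted({feat for feats, _ in pairs for feat in feats})
    labels = sorted({label for _, label in pairs})

    lines = [f"@relation {relation}", ""]
    for feat in vocab:
        lines.append(f"@attribute '{feat}' {{0,1}}")
    lines.append(f"@attribute class {{{','.join(labels)}}}")
    lines += ["", "@data"]

    # One row per pair: 1 if the transformation occurs, else 0, then the label.
    for feats, label in pairs:
        row = ["1" if feat in feats else "0" for feat in vocab]
        lines.append(",".join(row + [label]))
    return "\n".join(lines)


example_pairs = [
    ({"delete:a"}, "ENTAILMENT"),
    ({"delete:a", "insert:not"}, "NONENTAILMENT"),
]
print(to_arff(example_pairs))
```

The resulting file can be loaded into Weka and used with any of its classifiers, which is what makes it easy to compare, for example, Random Forest against an SVM on the same features.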
The current implementation of AdArte has some limitations, which are left to future work. In particular, with data sets such as RTE-3, where the number of labelled examples is limited (a few hundred pairs) and the number of transformations produced can exceed the number of examples, the predictive power of the learned model can be considerably reduced.
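One way to see whether a data set falls into this regime is to compare the number of distinct transformations with the number of pairs. The helper below is a hypothetical diagnostic written for this page, not something AdArte provides; a ratio above 1 means there are more features than examples.

```python
def feature_to_example_ratio(pairs):
    """Ratio of distinct transformations to number of T-H pairs.

    pairs: list of sets of transformation strings, one set per pair.
    """
    if not pairs:
        return 0.0
    distinct = set().union(*pairs)
    return len(distinct) / len(pairs)


toy_pairs = [
    {"delete:a", "insert:not"},
    {"insert:not", "replace:dog->cat"},
    {"delete:the", "insert:very"},
]
print(feature_to_example_ratio(toy_pairs))
```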
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: An update. SIGKDD Explor Newsl 11(1):10–18, DOI 10.1145/1656274.1656278
Marelli M, Menini S, Baroni M, Bentivogli L, Bernardi R, Zamparelli R (2014b) A SICK cure for the evaluation of compositional distributional semantic models. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland, May 26-31, 2014, pp 216–223
Tai K (1979) The tree-to-tree correction problem. J ACM 26(3):422–433