The Lexical and Positional Differences (LePoD) score quantifies the surface differences between paraphrases.
Comparing S1 and S2 with LePoD: hollow circles represent tokens without an exact match, which determine the LeD score, and the illustrated alignment determines the PoD score.
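We first compute the pairwise Lexical Difference (LeD) as the average fraction of tokens in each sequence that are not found in the other. Formally,

$$\mathrm{LeD} = \frac{1}{2}\left(\frac{|S_1 \setminus S_2|}{|S_1|} + \frac{|S_2 \setminus S_1|}{|S_2|}\right)$$

where S1 and S2 are a pair of sequences and S1\S2 denotes the tokens appearing in S1 but not in S2.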
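The LeD computation can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the function name `led` is hypothetical, inputs are assumed to be pre-tokenized lists, LeD is assumed to average the two per-sequence fractions, and S1\S2 is assumed to count every token occurrence in S1 whose type never appears in S2 (the actual script may also handle casing or duplicates differently).

```python
def led(s1, s2):
    """Lexical Difference between two token lists (illustrative sketch).

    Averages the fraction of tokens in s1 that do not occur in s2
    with the fraction of tokens in s2 that do not occur in s1.
    """
    if not s1 or not s2:
        raise ValueError("both sequences must be non-empty")
    vocab1, vocab2 = set(s1), set(s2)
    # Assumption: count token occurrences whose type is absent from the other side.
    only_in_s1 = sum(1 for tok in s1 if tok not in vocab2)
    only_in_s2 = sum(1 for tok in s2 if tok not in vocab1)
    return 0.5 * (only_in_s1 / len(s1) + only_in_s2 / len(s2))


if __name__ == "__main__":
    print(led("how are you doing".split(), "how are you today".split()))
```

Under these assumptions, identical sequences score 0 and sequences with no shared tokens score 1.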
We then compute the pairwise Positional Difference (PoD) in two steps. (1) We segment the sentence pair into the longest sequence of phrasal units that are consistent with the word alignments. Word alignments are obtained using the latest METEOR software, which supports stem, synonym and paraphrase matches in addition to exact matches. (2) We compute the maximum distortion within each segment.

To do this, we first re-index the N aligned words and calculate distortions as the position differences (i.e., index2-index1 in the figure). Then, we keep a running total over the distortion array (d1, d2, ...) and close a segment p=(di, ..., dj)∈P whenever the running total returns to zero (i.e., Σ p=0). Now we can define

$$\mathrm{PoD} = \frac{1}{N}\sum_{p \in P} \max_{d \in p} |d|$$
In the extreme case where the first word in S1 is moved to the last position in S2, all N aligned words form a single segment with maximum distortion N-1, so the PoD score approaches 1 as N grows. When words are aligned without any reordering, each alignment constitutes its own segment and PoD equals 0.
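The segmentation and scoring procedure can be sketched as follows. This is an illustrative sketch under stated assumptions, not the repository's code: the function name `pod` is hypothetical, the word alignment is assumed to be already available as one-to-one `(position_in_S1, position_in_S2)` pairs (METEOR itself is not invoked here), and the score is assumed to be the sum of per-segment maximum absolute distortions divided by the number of aligned words N.

```python
def pod(alignment):
    """Positional Difference from a one-to-one word alignment (sketch).

    alignment: list of (pos_in_s1, pos_in_s2) pairs for aligned words.
    """
    n = len(alignment)
    if n == 0:
        return 0.0
    # Re-index the aligned words: the rank on the S1 side is the sorted order,
    # the rank on the S2 side is the order of the S2 positions.
    pairs = sorted(alignment)
    order2 = sorted(range(n), key=lambda i: pairs[i][1])
    rank2 = [0] * n
    for rank, i in enumerate(order2):
        rank2[i] = rank
    distortions = [rank2[i] - i for i in range(n)]

    total, running, seg_max = 0, 0, 0
    for d in distortions:
        running += d
        seg_max = max(seg_max, abs(d))
        if running == 0:  # the running total returned to zero: close the segment
            total += seg_max
            seg_max = 0
    return total / n


if __name__ == "__main__":
    # First word of S1 moved to the last position in S2 (N = 4).
    print(pod([(0, 3), (1, 0), (2, 1), (3, 2)]))
```

Because both sides are re-indexed to ranks over the same N aligned words, the distortions sum to zero, so the final segment always closes at the last aligned word.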
If you have already set up the multitask-ft-fsmt project, you are all set. Otherwise, please use setup.sh to install the necessary software.
An example of using LePoD is given in example.sh.
Please cite the following paper if you use LePoD in your own work:
- Xing Niu and Marine Carpuat. "Controlling Neural Machine Translation Formality with Synthetic Supervision". AAAI 2020. (Appendix)