Journal of Software, Vol 2, No 4 (2007), 12-23, Oct 2007
doi:10.4304/jsw.2.4.12-23

New Functions for Unsupervised Asymmetrical Paraphrase Detection

Cordeiro João, Dias Gaël, Brazdil Pavel

Abstract


Monolingual text-to-text generation is an emerging research area in Natural Language Processing. One reason for the interest in such generation systems is the possibility of automatically learning text-to-text generation strategies from aligned monolingual corpora. In this context, paraphrase detection can be seen as the task of aligning sentences that convey the same information yet are written in different forms, thereby building a training set of rewriting examples. In this paper, we propose a new type of mathematical function for the unsupervised detection of paraphrases, and test it on a set of standard paraphrase corpora. The results are promising, as the proposed functions outperform state-of-the-art functions developed for similar tasks. We consider two types of paraphrases, symmetrical and asymmetrical entailed, and show that although our proposed functions were conceived for and oriented toward asymmetrical detection, they perform rather well at identifying symmetrical sentence pairs.



Keywords


Paraphrasing, Paraphrase Identification, Sentence compression, Text Summarization, Text Generation, Textual Entailment, Text Mining



Journal of Software (JSW, ISSN 1796-217X)

Copyright © 2006-2012 by ACADEMY PUBLISHER – All rights reserved.