Title:
Towards High Precision Text Generation

Thumbnail Image
Author(s)
Parikh, Ankur
Authors
Advisor(s)
Advisor(s)
Editor(s)
Associated Organization(s)
Organizational Unit
Organizational Unit
Series
Collections
Supplementary to
Abstract
Despite large advances in neural text generation in terms of fluency, existing generation techniques are prone to hallucination and often produce output that is unfaithful or irrelevant to the source text. In this talk, we take a multi-faceted approach to this problem from 3 aspects: data, evaluation, and modeling. From the data standpoint, we propose ToTTo, a tables-to-text-dataset with high quality annotator revised references that we hope can serve as a benchmark for high precision text generation task. While the dataset is challenging, existing n-gram based evaluation metrics are often insufficient to detect hallucinations. To this end, we propose BLEURT, a fully learnt end-to-end metric based on transfer learning that can quickly adapt to measure specific evaluation criteria. Finally, we propose a model based on confidence decoding to mitigate hallucinations.
Sponsor
Date Issued
2020-11-11
Extent
53:25 minutes
Resource Type
Moving Image
Resource Subtype
Lecture
Rights Statement
Rights URI