To Sound Human or to Speak Human? A Survey of Success Metrics for AI-Generated Text
Author(s)
Liu, Yihe
Abstract
In light of the increasing prevalence of natural language generation (NLG), we investigate the methods researchers use to evaluate the quality of AI-generated text. We conduct a Systematic Literature Review of IEEE Xplore, the ACM Digital Library, and SpringerLink, three major computer science publication databases, and analyze a corpus of 88 papers published in the last eight years. We find that the majority of technical articles emphasize text quality and that no standard procedure exists for handling human evaluation bias. To address this gap in the literature, we propose evaluative design concepts and discuss their implications. We also identify techniques for Expert-Defined Metrics and AI-Generated Metrics and compare their advantages and disadvantages. We conclude with a discussion of the field's current challenges and future directions.
Resource Type
Text
Resource Subtype
Undergraduate Research Option Thesis