Title:
Building and Evaluating Controllable Models for Text Simplification

Author(s)
Maddela, Mounica
Advisor(s)
Xu, Wei
Abstract
Automatic Text Simplification (ATS) aims to improve the readability of texts by using simpler grammar and word choices while preserving meaning. ATS is generally treated as a monolingual translation task in which the input is a piece of text and the output is a simplified version of that input. One major drawback of existing ATS methods is their lack of controllability. ATS is an audience-dependent task: what constitutes simplified text for one group of users may not be acceptable for other groups. An ideal ATS system should be able to control various attributes of the generated text, such as syntactic structure, length, readability level, and word choice. Meanwhile, evaluating ATS systems is as important as building them, because efficient automatic evaluation frameworks can accelerate the process of improving existing systems. However, current automatic evaluation metrics for ATS focus on the semantic content of the simplified text but not on its writing style. These metrics tend to favor conservative systems that make minimal changes to the input, and they inaccurately penalize simplifications that paraphrase the input. An ideal evaluation metric for ATS should capture not only simplification quality but also the different styles of simplification. In this dissertation, I develop controllable simplification systems and diverse automatic metrics for ATS. I propose two controllable approaches for ATS: a sentence simplification system that combines linguistic rules with Transformer models to generate simplified sentences at different readability levels, and a lexical simplification system that leverages human judgments of word complexity to replace complex words with simpler phrases. Finally, I propose the first supervised automatic evaluation metric for ATS, LENS, which can capture multiple simplification styles and outperforms existing metrics in evaluating diverse simplification systems. To train and evaluate LENS, I create SIMPEval, a new training and evaluation dataset for metrics that incorporates different types of simplification operations.
Date Issued
2023-08-17
Resource Type
Text
Resource Subtype
Dissertation