Title
Language Models: Generator and Labeler

Author(s)
Rungta, Mukund
Advisor(s)
Zhang, Chao
Abstract
Over the last few years, language models have made remarkable progress. Trained on massive amounts of data with advanced learning algorithms, they can perform a wide range of Natural Language Processing (NLP) tasks with high accuracy, making them reliable and robust, with state-of-the-art or comparable performance on various NLP benchmarks. In this thesis, I explore two underexplored applications of large language models: hierarchical text classification and generating training data without human supervision. For both tasks, the proposed approaches and design choices demonstrate superior performance over a range of baselines. To support these claims, I examine the individual components of each model and analyze their contributions to the overall improvements. Although using language models directly to generate training data has several limitations, such as ensuring label accuracy and preserving dataset diversity, this work can inspire further research on dataset-generation-based zero-shot learning with large pre-trained language models.
Date Issued
2023-05-02
Resource Type
Text
Resource Subtype
Thesis