Learning with and without human feedback
Author(s)
Xu, Austin Shiyi
Abstract
Labels and feedback provided by humans play a central role in training contemporary machine learning models, offering models ground-truth annotations from which to extract patterns. However, collecting such feedback is a challenging and time-consuming task. As a result, practitioners must be intentional both in how they choose to query humans for feedback and in the problem settings for which they request feedback. This thesis explores learning from human feedback along two fundamental directions. The first part of the thesis focuses on how, from a mathematically grounded perspective, we can more effectively collect and learn from human feedback. We first consider how to leverage paired comparisons, a simple mechanism for human feedback, to learn rich models of human preference. We then propose a new mechanism for collecting human feedback aimed at balancing informativeness and cognitive burden. The second part of the thesis focuses on how we can leverage pretrained models to avoid collecting additional human feedback. We consider two specific application settings, retrieval and synthetic dataset generation, and show that existing tools, such as large language models or image editing models, can remove the need to collect human feedback.
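To make the paired-comparison idea concrete: a standard way to turn "item i beat item j" judgments into a preference model is the Bradley–Terry model, which posits latent scores s with P(i beats j) = sigmoid(s_i − s_j). The sketch below is purely illustrative and is not the method developed in the thesis; the data and the gradient-ascent fitting routine are assumptions for the example.

```python
import math

def fit_bradley_terry(n_items, comparisons, lr=0.1, epochs=500):
    """Fit latent scores s so that P(i beats j) = sigmoid(s_i - s_j).

    comparisons: list of (winner, loser) index pairs.
    Returns a list of scores, centered to sum to zero.
    """
    s = [0.0] * n_items
    for _ in range(epochs):
        grad = [0.0] * n_items
        for w, l in comparisons:
            # Probability the observed winner beats the loser under current scores.
            p = 1.0 / (1.0 + math.exp(-(s[w] - s[l])))
            # Gradient of the log-likelihood of the observed outcome.
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        for i in range(n_items):
            s[i] += lr * grad[i] / len(comparisons)
        mean = sum(s) / n_items          # anchor: scores are only
        s = [x - mean for x in s]        # identified up to a shift
    return s

# Hypothetical toy data: item 0 usually beats 1, which usually beats 2.
data = ([(0, 1)] * 8 + [(1, 0)] * 2 +
        [(1, 2)] * 8 + [(2, 1)] * 2 +
        [(0, 2)] * 9 + [(2, 0)] * 1)
scores = fit_bradley_terry(3, data)
```

After fitting, the recovered scores order the items consistently with the win counts (item 0 highest, item 2 lowest), illustrating how simple pairwise feedback can yield a global preference ranking.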
Date
2024-04-27
Resource Type
Text
Resource Subtype
Dissertation