Designing From Data to Discovery: Human-Centered Machine Learning for Interpretable Scientific Data Exploration
Author(s)
Wright, Austin P.
Advisor(s)
Editor(s)
Collections
Supplementary to:
Permanent Link
Abstract
It is often overlooked that scientific discovery is a social activity carried out by people; designing the statistical and computational tools that these people use to explore novel and complex data therefore requires a human-centered approach. Modern data mining methods purport to assist in this endeavor by making more data types amenable to visualization and analysis. However, when straightforwardly applied, these methods very frequently solve the wrong problems – ignoring the real problems that scientists face in their workflows. What is needed are better human-centered theories of applied data science that take into account this divide between scientific users and existing machine learning problem formulations. This thesis contributes towards precisely that goal, using extensive embedded field work to identify and understand the needs of specific groups of scientists across multiple domains, and designing new machine learning tools to address them. From this concrete basis I develop frameworks for centering people in the meta-process of designing machine learning models within the total context of scientists’ actual processes of scientific discovery – a synthesis of machine learning (ML) theoretic and human-computer interaction (HCI) methodological frameworks. Accordingly, this work has two interrelated parts:
(1) Human-Centered Discovery Frameworks: in which I develop frameworks for understanding the human processes of scientific discovery, based on embedded user research on scientists working collaboratively in context, and model how ML systems interact with these processes, and create guidelines for improving the design of such systems.
(2) Interpretable ML for Exploratory Science: in which I utilize these frameworks to collaborate with scientists and develop novel interpretable ML methods that address the particular problems of scientific users doing exploratory data analysis. Altogether this work contributes to scholarship in ML, HCI, and multiple scientific domains.
Sponsor
Date
2025-04-28
Extent
Resource Type
Text
Resource Subtype
Dissertation