Analyzing Health-related Behaviors Using First-person Vision

Author(s)
Zhang, Yun
Abstract
Wearable sensors are increasingly affordable and easy to deploy, and as a result they are widely used in mobile health applications. Sensors such as accelerometers and gyroscopes are popular due to their low cost, small size, and relatively low power consumption. More recently, it has become feasible to deploy wearable cameras in mobile health applications, with two primary benefits: 1) when cameras are head-mounted, they can capture the health-related visual attention of the participant, which is difficult to measure via other sensing modalities; 2) cameras can capture health-related behaviors under a wide range of conditions and contexts. When cameras are combined with other wearable sensors, the video can be used to curate labeled examples of behaviors under field conditions, and thereby drive the development of machine learning-based models, provided that the sensor data is time-synchronized. The goal of this thesis is to develop methods for analyzing video from wearable cameras, an area known as egocentric vision, that can enable both automatically derived measures of behavior and the automatic synchronization of video with other sensing modalities, in order to facilitate the large-scale collection of labeled training data for use in building machine learning models. I begin by describing a model for egocentric action recognition which leverages the shared motion and appearance properties of different actions to enable zero-shot learning (the model can predict novel actions that it has not been trained on). Second, I develop a method to analyze video from a head-worn camera and quantify the participant's attention to screens (e.g., monitors, smartphones), without using an eye-tracker. Finally, I present a method based on a weighted kernel density estimation approach to automatically synchronize the timestamps of a wearable camera and a wearable accelerometer. The method is able to estimate the time offset between multiple modalities of sensor data collected from devices mounted at different locations on the body in the natural field environment, without introducing additional participant burden.
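The weighted kernel density estimation idea from the final contribution can be illustrated with a minimal sketch: given a set of candidate time offsets (e.g., from matching individual motion events between the video and the accelerometer), each with a confidence weight, the overall offset is taken as the mode of a weighted Gaussian KDE over the candidates. The function name, bandwidth, and toy data below are hypothetical illustrations, not the thesis implementation.

```python
import numpy as np

def estimate_offset(candidate_offsets, weights, bandwidth=0.5, grid_step=0.01):
    """Estimate a sensor time offset as the mode of a weighted Gaussian KDE
    over candidate offsets (illustrative sketch, not the thesis code)."""
    offsets = np.asarray(candidate_offsets, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize confidence weights to sum to 1
    # Evaluate the density on a grid covering all candidates
    grid = np.arange(offsets.min() - 3 * bandwidth,
                     offsets.max() + 3 * bandwidth, grid_step)
    density = np.zeros_like(grid)
    # Weighted sum of Gaussian kernels centered on each candidate offset
    for o, wi in zip(offsets, w):
        density += wi * np.exp(-0.5 * ((grid - o) / bandwidth) ** 2)
    return grid[np.argmax(density)]

# Noisy candidates clustered near a 2.0 s offset, plus a low-weight outlier
cands = [1.9, 2.0, 2.1, 2.05, 5.0]
wts = [1.0, 1.0, 1.0, 1.0, 0.2]
print(estimate_offset(cands, wts))
```

Weighting lets low-confidence matches (such as the 5.0 s outlier above) contribute without pulling the estimate away from the consensus cluster, which is what makes the approach robust to spurious event matches in field data.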
Date
2021-05-24
Resource Type
Text
Resource Subtype
Dissertation