Title
Robust Generative Subspace Modeling: The Subspace t Distribution

Author(s)
Khan, Zia
Dellaert, Frank
Abstract
Linear latent variable models such as statistical factor analysis (SFA) and probabilistic principal component analysis (PPCA) assume that the data are distributed according to a multivariate Gaussian. A drawback of this assumption is that parameter learning in these models is sensitive to outliers in the training data. Approaches that rely on M-estimation have been introduced to render principal component analysis (PCA) more robust to outliers. M-estimation approaches assume the data are distributed according to a density with heavier tails than a Gaussian. Yet, these methods are limited in that they fail to define a probability model for the data. Data cannot be generated from these models, and the normalized probability of new data cannot be evaluated. To address these limitations, we describe a generative probability model that accounts for outliers. The model is a linear latent variable model in which the marginal density over the data is a multivariate t, a distribution with heavier tails than a Gaussian. We present a computationally efficient expectation maximization (EM) algorithm for estimating the model parameters, and compare our approach with that of PPCA on both synthetic and real data sets.
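
To make the abstract's central idea concrete, the sketch below shows standard EM for a multivariate t distribution, viewed as a Gaussian scale mixture: each point receives a latent precision weight that shrinks for outliers. This is not the report's subspace t algorithm (which further constrains the covariance to a low-rank-plus-noise form); the function name fit_multivariate_t and the fixed degrees of freedom nu are illustrative assumptions.

```python
import numpy as np

def fit_multivariate_t(X, nu=4.0, n_iter=50):
    """EM for a multivariate t with fixed degrees of freedom nu.

    Illustrative sketch only: fits a full covariance rather than the
    subspace (low-rank plus noise) form described in the report.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    for _ in range(n_iter):
        # E-step: expected latent precision scale per point from its
        # squared Mahalanobis distance; outliers get small weights.
        diff = X - mu
        inv_Sigma = np.linalg.inv(Sigma)
        maha = np.einsum('ij,jk,ik->i', diff, inv_Sigma, diff)
        w = (nu + d) / (nu + maha)
        # M-step: weighted mean and covariance, so outliers have
        # reduced influence on the fitted parameters.
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu
        Sigma = (w[:, None] * diff).T @ diff / n
    return mu, Sigma

# Usage: fit in the presence of injected outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:10] += 20.0  # a few gross outliers
mu_t, Sigma_t = fit_multivariate_t(X)
```

Because the weights w downweight points with large Mahalanobis distance, the estimates remain close to the bulk of the data, which is the robustness property the paper exploits; the subspace version applies the same weighting while learning a low-dimensional factor loading matrix.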
Date Issued
2004
Extent
278840 bytes
Resource Type
Text
Resource Subtype
Technical Report