Robust Generative Subspace Modeling: The Subspace t Distribution

Khan, Zia
Dellaert, Frank
Linear latent variable models such as statistical factor analysis (SFA) and probabilistic principal component analysis (PPCA) assume that the data are distributed according to a multivariate Gaussian. A drawback of this assumption is that parameter learning in these models is sensitive to outliers in the training data. Approaches that rely on M-estimation have been introduced to render principal component analysis (PCA) more robust to outliers. M-estimation approaches assume the data are distributed according to a density with heavier tails than a Gaussian. Yet these methods are limited in that they fail to define a probability model for the data: data cannot be generated from these models, and the normalized probability of new data cannot be evaluated. To address these limitations, we describe a generative probability model that accounts for outliers. The model is a linear latent variable model in which the marginal density over the data is a multivariate t, a distribution with heavier tails than a Gaussian. We present a computationally efficient expectation maximization (EM) algorithm for estimating the model parameters, and compare our approach with PPCA on both synthetic and real data sets.
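As a rough illustration of what "generative" means here, the sketch below draws samples from a linear latent variable model whose marginal over the data is a multivariate t. It is not taken from the report: it assumes the standard Gaussian scale-mixture construction of the t distribution with an isotropic, PPCA-style noise term, and the names `sample_subspace_t`, `W`, `mu`, `sigma2`, and `nu` are hypothetical. The report's own parameterization may differ.

```python
import numpy as np

def sample_subspace_t(n, W, mu, sigma2, nu, rng):
    """Draw n samples from an assumed linear latent variable model with t marginals.

    Assumed construction (not from the report):
      u ~ Gamma(shape=nu/2, rate=nu/2)      per-sample precision scale
      x | u ~ N(0, I / u)                   latent coordinates
      y | x, u ~ N(W x + mu, sigma2 I / u)  observed data
    Marginally, y is multivariate t with nu degrees of freedom,
    location mu, and scale matrix W W^T + sigma2 I.
    """
    d, q = W.shape
    # NumPy parameterizes the gamma by scale, so rate nu/2 becomes scale 2/nu.
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    inv_sqrt_u = 1.0 / np.sqrt(u)[:, None]
    x = rng.standard_normal((n, q)) * inv_sqrt_u
    eps = np.sqrt(sigma2) * rng.standard_normal((n, d)) * inv_sqrt_u
    y = mu + x @ W.T + eps
    return y, x, u

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 2))   # hypothetical 5-D data, 2-D subspace
mu = np.zeros(5)
Y, X, U = sample_subspace_t(1000, W, mu, sigma2=0.1, nu=3.0, rng=rng)
```

Small values of `nu` give heavy tails, so occasional samples land far from the subspace; as `nu` grows, `u` concentrates near 1 and the samples approach those of a Gaussian PPCA-style model.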
Resource Type
Resource Subtype
Technical Report