Subspace Outlier Detection in Data with Mixture of Variances and Noise

Author(s)
Nguyen, Minh Quoc
Mark, Leo
Omiecinski, Edward
Associated Organization(s)
Organizational Unit
School of Computer Science
School established in 2007
Abstract
In this paper, we introduce a bottom-up approach for discovering clusters of outliers in any m-dimensional subspace of an n-dimensional space. First, we propose a method to compute an outlier score for every point in each dimension. We show that if a point is an outlier in a subspace, its score must be high in each dimension of that subspace. We then aggregate the per-dimension scores to compute the final outlier score of each point in the dataset. We introduce a filter threshold to eliminate high-dimensional noise during the aggregation. The concept of an outlier is extended to allow the discovery of clusters of outliers, and an oscore(C/S) function is introduced to rank these clusters. In addition, our approach allows the outliers to be visualized easily.
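The score-filter-aggregate pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state the report's actual per-dimension scoring function, so the sketch substitutes a robust z-score (distance from the median, scaled by the median absolute deviation) as an assumed stand-in. The function names `dimension_scores` and `aggregate_scores`, the threshold value of 3.0, and the sample data are all hypothetical. Clusters of outliers and the oscore(C/S) ranking are not covered.

```python
from statistics import median


def dimension_scores(values):
    """Score each point in one dimension.

    ASSUMPTION: a robust z-score stands in for the report's scoring
    function, which the abstract does not specify.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = mad if mad > 0 else 1.0  # avoid dividing by zero on constant columns
    return [abs(v - med) / scale for v in values]


def aggregate_scores(points, threshold=3.0):
    """Sum the per-dimension scores of each point, keeping only scores
    at or above the filter threshold so that small deviations spread
    over many dimensions do not accumulate into a large total.

    ASSUMPTION: summation is an illustrative aggregation rule.
    """
    columns = [list(col) for col in zip(*points)]
    per_dim = [dimension_scores(col) for col in columns]
    totals = []
    for i in range(len(points)):
        kept = [scores[i] for scores in per_dim if scores[i] >= threshold]
        totals.append(sum(kept))
    return totals


# Hypothetical data: the last point deviates in dimensions 0 and 2 only.
points = [
    (1.0, 10.0, 5.0),
    (1.1, 10.2, 5.1),
    (0.9, 9.8, 4.9),
    (1.0, 10.1, 5.0),
    (1.2, 9.9, 5.2),
    (9.0, 10.0, 12.0),
]
totals = aggregate_scores(points)
top = max(range(len(points)), key=lambda i: totals[i])
print("highest-scoring point:", points[top])
```

In this sketch the filter discards every per-dimension score of the five ordinary points, so only the last point receives a nonzero total, and that total comes from the two dimensions in which it deviates.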
Date
2008
Resource Type
Text
Resource Subtype
Technical Report