Subspace Outlier Detection in Data with Mixture of Variances and Noise
Author(s)
Nguyen, Minh Quoc
Mark, Leo
Omiecinski, Edward
Abstract
In this paper, we introduce a bottom-up approach for discovering clusters of outliers in any m-dimensional subspace of an n-dimensional space. First, we propose a method to compute an outlier score for every point in each dimension. We show that if a point is an outlier in a subspace, its score must be high in each dimension of that subspace. We then aggregate the per-dimension scores to compute the final outlier score for each point in the dataset. During aggregation, a filter threshold eliminates high-dimensional noise. The concept of an outlier is extended to allow the discovery of clusters of outliers, and an oscore(C/S) function is introduced to rank these clusters. In addition, the outliers can be easily visualized with our approach.
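The bottom-up flow described in the abstract (score each dimension, filter, then aggregate) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the per-dimension score below is a stand-in: a robust z-score based on the median and the median absolute deviation. It is not the authors' method. The function names, the sample data, and the threshold value are likewise hypothetical, and the cluster-ranking oscore(C/S) function is not reproduced.

```python
from statistics import median


def per_dimension_scores(points):
    """Return scores[i][d], a stand-in outlier score of point i in dimension d.

    ASSUMPTION: a robust z-score (|x - median| / MAD) replaces the paper's
    per-dimension score, which the abstract does not specify.
    """
    n_dims = len(points[0])
    scores = [[0.0] * n_dims for _ in points]
    for d in range(n_dims):
        column = [p[d] for p in points]
        med = median(column)
        mad = median([abs(v - med) for v in column])
        scale = mad if mad > 0 else 1.0  # guard against constant columns
        for i, v in enumerate(column):
            scores[i][d] = abs(v - med) / scale
    return scores


def aggregate_scores(scores, filter_threshold):
    """Sum, per point, only the per-dimension scores above the threshold.

    Dimensions at or below the threshold contribute nothing, so many
    small deviations across many dimensions cannot add up to a high score.
    """
    return [sum(s for s in row if s > filter_threshold) for row in scores]


def outlying_subspace(row, filter_threshold):
    """Dimensions in which a single point's score exceeds the threshold."""
    return [d for d, s in enumerate(row) if s > filter_threshold]


if __name__ == "__main__":
    # Hypothetical data: the last point deviates in dimensions 1 and 2 only.
    points = [
        (1.0, 10.0, 5.0),
        (1.1, 10.2, 5.1),
        (0.9, 9.8, 4.9),
        (1.0, 10.1, 5.0),
        (1.05, 30.0, 9.0),
    ]
    threshold = 5.0  # hypothetical filter threshold
    scores = per_dimension_scores(points)
    final = aggregate_scores(scores, threshold)
    top = max(range(len(points)), key=lambda i: final[i])
    print(top, outlying_subspace(scores[top], threshold))
```

In this sketch a point receives a nonzero final score only from the dimensions that pass the filter, which mirrors the abstract's claim that an outlier in a subspace must score high in every dimension of that subspace.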
Date
2008
Resource Type
Text
Resource Subtype
Technical Report