Title:
Detecting the Change of Clustering Structure in Categorical Data Streams
Author(s)
Chen, Keke
Liu, Ling
Abstract
Clustering data streams can provide critical information
for making decisions in real time. We argue
that detecting changes in the clustering structure of
data streams can benefit many real-time
monitoring applications. In this paper, we
present a framework for detecting changes of clustering
structure in categorical data streams. The
change of clustering structure is detected by the
change of the best number of clusters in the data
stream. The framework consists of two main components:
the BkPlot method for determining the
best number of clusters in a categorical dataset,
and the summarization structure, Hierarchical Entropy
Tree (HE-Tree), for efficiently capturing the
entropy property of the categorical data streams.
HE-Tree enables us to quickly and precisely extract
the clustering information from the data stream
that the BkPlot method needs to identify a
change in the best number of clusters. Combining
snapshots of the HE-Tree information with the
BkPlot method, we are able to observe changes
in clustering structure online. The experiments
show that the HE-Tree + BkPlot method can efficiently
and precisely detect changes in clustering
structure in categorical data streams.
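
The two ideas named in the abstract, an entropy measure over categorical
clusters and change detection via the best number of clusters, can be
illustrated with a minimal sketch. This is not the BkPlot method or the
HE-Tree structure, whose details are not given in this record. The function
names are hypothetical, the size-weighted entropy is an assumed common
definition for categorical clusterings, and the per-snapshot best-K values
are assumed to come from some external estimator such as BkPlot.

```python
from collections import Counter
from math import log2


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def cluster_entropy(records):
    """Sum of per-attribute entropies for one cluster of records.

    Each record is a tuple of categorical values, one per attribute.
    """
    return sum(attribute_entropy(column) for column in zip(*records))


def expected_entropy(clusters):
    """Size-weighted average of cluster entropies for a partition.

    Assumption: this weighting is an illustrative choice, not a formula
    taken from the report.
    """
    total = sum(len(cluster) for cluster in clusters)
    return sum(len(cluster) / total * cluster_entropy(cluster)
               for cluster in clusters)


def detect_changes(best_k_per_snapshot):
    """Return snapshot indices where the best number of clusters changes.

    The best-K values are assumed to be supplied by an external estimator.
    """
    return [i for i in range(1, len(best_k_per_snapshot))
            if best_k_per_snapshot[i] != best_k_per_snapshot[i - 1]]
```

In this sketch, a partition whose clusters each contain identical records
has an expected entropy of zero, while merging dissimilar records into one
cluster raises it. A change is flagged at every snapshot whose best-K value
differs from that of the preceding snapshot.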
Date Issued
2005
Extent
155101 bytes
Resource Type
Text
Resource Subtype
Technical Report