Efficient and Effective Visual Codebook Generation Using Additive Kernels
Abstract
Common visual codebook generation methods used in a bag of visual words model, such as k-means or the Gaussian Mixture Model, use the Euclidean distance to cluster features into visual code
words. However, most popular visual descriptors are histograms of image measurements. It has
been shown that with histogram features, the Histogram Intersection Kernel (HIK) is more effective
than the Euclidean distance in supervised learning tasks. In this paper, we demonstrate that HIK can
be used in an unsupervised manner to significantly improve the generation of visual codebooks. We
propose a histogram kernel k-means algorithm which is easy to implement and runs almost as fast
as the standard k-means. The HIK codebooks consistently outperform k-means codebooks by
2–4% in recognition accuracy on several benchmark object and scene recognition data sets. The
algorithm is also generalized to arbitrary additive kernels. Its speed is thousands of times faster
than a naive implementation of the kernel k-means algorithm. In addition, we propose a one-class
SVM formulation to create more effective visual code words. Finally, we show that the standard
k-median clustering method can be used for visual codebook generation and can act as a compromise
between the HIK / additive kernel and the k-means approaches.
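The abstract describes clustering histogram features with the Histogram Intersection Kernel, K(x, y) = Σ_d min(x_d, y_d), in place of the Euclidean distance. The sketch below is a naive kernel k-means under HIK for illustration only; it materializes the full O(n²) Gram matrix, which is exactly the cost the paper's fast algorithm avoids. Function names and the deterministic farthest-first seeding are my own choices, not the paper's.

```python
import numpy as np

def hik_gram(X, Y):
    """Pairwise Histogram Intersection Kernel: K[i, j] = sum_d min(X[i, d], Y[j, d])."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

def hik_kernel_kmeans(X, k, n_iter=50):
    """Naive kernel k-means under HIK (illustration only: O(n^2) Gram matrix)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = hik_gram(X, X)
    diag = np.diag(K)
    # Deterministic farthest-first seeding, using the feature-space distance
    # d^2(x, y) = K(x, x) + K(y, y) - 2 K(x, y).
    seeds = [0]
    for _ in range(k - 1):
        d2 = np.min(diag[:, None] + diag[seeds] - 2 * K[:, seeds], axis=1)
        seeds.append(int(np.argmax(d2)))
    labels = np.argmin(diag[:, None] + diag[seeds] - 2 * K[:, seeds], axis=1)
    for _ in range(n_iter):
        dists = np.full((n, k), np.inf)
        for c in range(k):
            idx = labels == c
            m = idx.sum()
            if m == 0:
                continue  # empty cluster attracts no points this round
            # Squared feature-space distance to the implicit cluster mean:
            # K(x, x) - (2/|C|) sum_y K(x, y) + (1/|C|^2) sum_{y, y'} K(y, y')
            cross = K[:, idx].sum(axis=1) / m
            within = K[np.ix_(idx, idx)].sum() / m**2
            dists[:, c] = diag - 2 * cross + within
        new = dists.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```

The cluster "mean" is never computed explicitly; every distance is expressed through kernel evaluations, which is what lets the paper substitute HIK (or any additive kernel) for the Euclidean inner product.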
Date
2011-11
Resource Type
Text
Resource Subtype
Proceedings