Person:
Rehg, James M.


Publication Search Results

Now showing 1 - 9 of 9
  • Item
    C⁴ : A Real-time Object Detection Framework
    (Georgia Institute of Technology, 2013-10) Wu, Jianxin ; Liu, Nini ; Geyer, Christopher ; Rehg, James M.
    A real-time and accurate object detection framework, C⁴, is proposed in this paper. C⁴ achieves 20 fps speed and state-of-the-art detection accuracy, using only one processing thread without resorting to special hardwares like GPU. Real-time accurate object detection is made possible by two contributions. First, we conjecture (with supporting experiments) that contour is what we should capture and signs of comparisons among neighboring pixels are the key information to capture contour cues. Second, we show that the CENTRIST visual descriptor is suitable for contour based object detection, because it encodes the sign information and can implicitly represent the global contour. When CENTRIST and linear classifier are used, we propose a computational method that does not need to explicitly generate feature vectors. It involves no image preprocessing or feature vector normalization, and only requires O(1) steps to test an image patch. C⁴ is also friendly to further hardware acceleration. It has been applied to detect objects such as pedestrians, faces, and cars on benchmark datasets. It has comparable detection accuracy with state-of-the-art methods, and has a clear advantage in detection speed.
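    The abstract's claim that feature vectors need never be generated explicitly follows from a simple identity: with a linear classifier w over a 256-bin CENTRIST histogram h, the score w·h equals the sum of w indexed by each pixel's Census Transform value. A minimal sketch of this identity (the function name is illustrative, not from the paper; the paper's full O(1)-per-patch test additionally relies on precomputation over these lookup values):

    ```python
    import numpy as np

    def patch_score(ct_patch, w):
        # Linear-classifier score on a CENTRIST histogram, computed without
        # building the histogram: w . h == sum over pixels of w[ct_value].
        return float(w[ct_patch].sum())
    ```

    Because the score is just a sum of per-pixel lookups, it can be accumulated incrementally as a detection window slides, which is what makes the approach amenable to hardware acceleration.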
  • Item
    Efficient and Effective Visual Codebook Generation Using Additive Kernels
    (Georgia Institute of Technology, 2011-11) Wu, Jianxin ; Tan, Wei-Chian ; Rehg, James M.
    Common visual codebook generation methods used in a bag-of-visual-words model, such as k-means or the Gaussian Mixture Model, use the Euclidean distance to cluster features into visual code words. However, most popular visual descriptors are histograms of image measurements. It has been shown that with histogram features, the Histogram Intersection Kernel (HIK) is more effective than the Euclidean distance in supervised learning tasks. In this paper, we demonstrate that HIK can be used in an unsupervised manner to significantly improve the generation of visual codebooks. We propose a histogram kernel k-means algorithm which is easy to implement and runs almost as fast as standard k-means. HIK codebooks yield consistently higher recognition accuracy than k-means codebooks, by 2–4% on several benchmark object and scene recognition datasets. The algorithm is also generalized to arbitrary additive kernels. Its speed is thousands of times faster than a naive implementation of the kernel k-means algorithm. In addition, we propose a one-class SVM formulation to create more effective visual code words. Finally, we show that the standard k-median clustering method can be used for visual codebook generation and can act as a compromise between the HIK/additive kernel and k-means approaches.
  • Item
    CENTRIST: A Visual Descriptor for Scene Categorization
    (Georgia Institute of Technology, 2011-08) Wu, Jianxin ; Rehg, James M.
    CENTRIST (CENsus TRansform hISTogram), a new visual descriptor for recognizing topological places or scene categories, is introduced in this paper. We show that place and scene recognition, especially for indoor environments, require its visual descriptor to possess properties that are different from other vision domains (e.g. object recognition). CENTRIST satisfies these properties and suits the place and scene recognition task. It is a holistic representation and has strong generalizability for category recognition. CENTRIST mainly encodes the structural properties within an image and suppresses detailed textural information. Our experiments demonstrate that CENTRIST outperforms the current state-of-the-art in several place and scene recognition datasets, compared with other descriptors such as SIFT and Gist. Besides, it is easy to implement and evaluates extremely fast.
  • Item
    Real-Time Human Detection Using Contour Cues
    (Georgia Institute of Technology, 2011-05) Wu, Jianxin ; Geyer, Christopher ; Rehg, James M.
    A real-time and accurate human detector, C⁴, is proposed in this paper. C⁴ achieves 20 fps speed and stateof- the-art detection accuracy, using only one processing thread without resorting to special hardwares like GPU. Real-time accurate human detection is made possible by two contributions. First, we show that contour is exactly what we should capture and signs of comparisons among neighboring pixels are the key information to capture contours. Second, we show that the CENTRIST visual descriptor is particularly suitable for human detection, because it encodes the sign information and can implicitly represent the global contour. When CENTRIST and linear classifier are used, we propose a computational method that does not need to explicitly generate feature vectors. It involves no image pre-processing or feature vector normalization, and only requires O(1) steps to test an image patch. C⁴ is also friendly to further hardware acceleration. In a robot with embedded 1.2GHz CPU, we also achieved accurate and 20 fps high speed human detection.
  • Item
    Visual Place Categorization: Problem, Dataset, and Algorithm
    (Georgia Institute of Technology, 2009-10) Wu, Jianxin ; Rehg, James M. ; Christensen, Henrik I.
    In this paper we describe the problem of Visual Place Categorization (VPC) for mobile robotics, which involves predicting the semantic category of a place from image measurements acquired from an autonomous platform. For example, a robot in an unfamiliar home environment should be able to recognize the functionality of the rooms it visits, such as kitchen, living room, etc. We describe an approach to VPC based on sequential processing of images acquired with a conventional video camera. We identify two key challenges: dealing with non-characteristic views and integrating restricted-FOV imagery into a holistic prediction. We present a solution to VPC based upon a recently-developed visual feature known as CENTRIST (CENsus TRansform hISTogram). We describe a new dataset for VPC which we have recently collected and are making publicly available. We believe this is the first significant, realistic dataset for the VPC problem. It contains the interiors of six different homes with ground truth labels. We use this dataset to validate our solution approach, achieving promising results.
  • Item
    CENTRIST: A Visual Descriptor for Scene Categorization
    (Georgia Institute of Technology, 2009-07-23) Wu, Jianxin ; Rehg, James M.
    CENTRIST (CENsus TRansform hISTogram), a new visual descriptor for recognizing topological places or scene categories, is introduced in this paper. We show that place and scene recognition, especially for indoor environments, requires visual descriptors that possess properties different from those needed in other vision domains (e.g. object recognition). CENTRIST satisfies these properties and suits the place and scene recognition task. It is a holistic representation and has strong generalizability for category recognition. CENTRIST mainly encodes the structural properties within an image and suppresses detailed textural information. Our experiments demonstrate that CENTRIST outperforms the current state of the art on several place and scene recognition datasets, compared with other descriptors such as SIFT and Gist. In addition, it is easy to implement, has almost no parameters to tune, and is extremely fast to evaluate.
  • Item
    On the Design of Cascades of Boosted Ensembles for Face Detection
    (Georgia Institute of Technology, 2005) Brubaker, S. Charles ; Wu, Jianxin ; Sun, Jie ; Mullin, Matthew D. ; Rehg, James M.
    Cascades of boosted ensembles have become popular in the object detection community following their highly successful introduction in the face detector of Viola and Jones. Since then, researchers have sought to improve upon the original approach by incorporating new methods along a variety of axes (e.g. alternative boosting methods, feature sets, etc.). We explore several axes that have not yet received adequate attention in this context: cascade learning, stronger weak hypotheses, and feature filtering. We present a novel strategy to determine the appropriate balance between false positive and detection rates in the individual stages of the cascade, enabling us to control our experiments to a degree not previously possible. We show that while the choice of boosting method has little impact on the detector's performance and feature filtering is largely ineffective, the use of stronger weak hypotheses based on CART classifiers can significantly improve upon the standard results.
  • Item
    Fast Asymmetric Learning for Cascade Face Detection
    (Georgia Institute of Technology, 2005) Wu, Jianxin ; Brubaker, S. Charles ; Mullin, Matthew D. ; Rehg, James M.
    A cascade face detector uses a sequence of node classifiers to distinguish faces from non-faces. This paper presents a new approach to designing the node classifiers in the cascade detector. Previous methods used machine learning algorithms that simultaneously select features and form ensemble classifiers. We argue that if these two parts are decoupled, we have the freedom to design a classifier that explicitly addresses the difficulties caused by the asymmetric learning goal. There are three contributions in this paper. The first is a categorization of asymmetries in the learning goal, and why they make face detection hard. The second is the Forward Feature Selection (FFS) algorithm and a fast caching strategy for AdaBoost. FFS and the fast AdaBoost can reduce the training time by approximately 100 and 50 times, respectively, in comparison to a naive implementation of the AdaBoost feature selection method. The last contribution is the Linear Asymmetric Classifier (LAC), a classifier that explicitly handles the asymmetric learning goal as a well-defined constrained optimization problem. We demonstrate experimentally that LAC results in improved ensemble classifier performance.
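    The cascade structure both face-detection papers above build on can be sketched in a few lines (a generic illustration, not the authors' implementation): every node classifier must accept a window, and the first rejection discards it, so the vast majority of non-face windows exit at the early, cheap nodes.

    ```python
    def cascade_detect(window, node_classifiers):
        # Evaluate node classifiers in sequence; the first rejection wins.
        # Only windows accepted by every node are reported as detections.
        for classify in node_classifiers:
            if not classify(window):
                return False
        return True
    ```

    This early-exit behavior is what makes the rare-event setting tractable: per-window cost is dominated by the first one or two nodes, which can be tuned (as the cascade-learning paper above does) to trade detection rate against false positives stage by stage.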
  • Item
    Learning a Rare Event Detection Cascade by Direct Feature Selection
    (Georgia Institute of Technology, 2003) Wu, Jianxin ; Rehg, James M. ; Mullin, Matthew D.
    Face detection is a canonical example of a rare event detection problem, in which target patterns occur with much lower frequency than non-targets. Out of millions of face-sized windows in an input image, for example, only a few will typically contain a face. Viola and Jones recently proposed a cascade architecture for face detection which successfully addresses the rare event nature of the task. A central part of their method is a feature selection algorithm based on AdaBoost. We present a novel cascade learning algorithm based on forward feature selection which is two orders of magnitude faster than the Viola-Jones approach and yields classifiers of similar quality. This faster method could be used for more demanding classification tasks, such as on-line learning or searching the space of classifier structures. Our experimental results highlight the dominant role of the feature set in the success of the cascade approach.
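    Forward feature selection as described above is a standard greedy procedure; a minimal sketch follows (the `evaluate` callback, which scores a candidate feature set on training data, is an assumed placeholder for whatever criterion the cascade stage optimizes):

    ```python
    def forward_feature_selection(features, evaluate, k):
        # Greedily grow the selected set: at each step add the single
        # remaining feature that maximizes the evaluation score.
        selected = []
        remaining = list(features)
        for _ in range(k):
            best = max(remaining, key=lambda f: evaluate(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        return selected
    ```

    The speedup reported in the paper comes from the fact that each candidate feature's responses can be precomputed and cached, so each greedy step is a table lookup rather than a round of boosting.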