Organizational Unit:
Institute for Robotics and Intelligent Machines (IRIM)

Publication Search Results

  • Item
    Feasibility of Identifying Eating Moments from First-Person Images Leveraging Human Computation
    (Georgia Institute of Technology, 2013-11) Thomaz, Edison ; Parnami, Aman ; Essa, Irfan ; Abowd, Gregory D.
    There is widespread agreement in the medical research community that more effective mechanisms for dietary assessment and food journaling are needed to fight back against obesity and other nutrition-related diseases. However, it is presently not possible to automatically capture and objectively assess an individual’s eating behavior. Currently used dietary assessment and journaling approaches have several limitations; they pose a significant burden on individuals and are often not detailed or accurate enough. In this paper, we describe an approach where we leverage human computation to identify eating moments in first-person point-of-view images taken with wearable cameras. Recognizing eating moments is a key first step both in terms of automating dietary assessment and building systems that help individuals reflect on their diet. In a feasibility study with 5 participants over 3 days, where 17,575 images were collected in total, our method was able to recognize eating moments with 89.68% accuracy.
  • Item
    Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition
    (Georgia Institute of Technology, 2013-06) Bettadapura, Vinay ; Schindler, Grant ; Plötz, Thomas ; Essa, Irfan
    We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information that is inherent in activity streams. In addition, we propose the use of randomly sampled regular expressions to discover and encode patterns in activities. We demonstrate the effectiveness of our approach in experimental evaluations where we successfully recognize activities and detect anomalies in four complex datasets.
  • Item
    A Visualization Framework for Team Sports Captured using Multiple Static Cameras
    (Georgia Institute of Technology, 2013) Hamid, Raffay ; Kumar, Ramkrishan ; Hodgins, Jessica K. ; Essa, Irfan
    We present a novel approach for robust localization of multiple people observed using a set of static cameras. We use this location information to generate a visualization of the virtual offside line in soccer games. To compute the position of the offside line, we need to localize players' positions and identify their team roles. We solve the problem of fusing corresponding players' positional information by finding minimum weight K-length cycles in a complete K-partite graph. Each partite set of the graph corresponds to one of the K cameras, whereas each node of a partite set encodes the position and appearance of a player observed from a particular camera. To find the minimum weight cycles in this graph, we use a dynamic-programming-based approach that varies over a continuum from maximally to minimally greedy in terms of the number of graph-paths explored at each iteration. We present proofs for the efficiency and performance bounds of our algorithms. Finally, we demonstrate the robustness of our framework by testing it on 82,000 frames of soccer footage captured over eight different illumination conditions, play types, and team attire. Our framework runs in near-real time, and processes video from 3 full HD cameras in about 0.4 seconds for each set of 3 corresponding frames.
  • Item
    Linguistic Transfer of Human Assembly Tasks to Robots
    (Georgia Institute of Technology, 2012-10) Dantam, Neil ; Essa, Irfan ; Stilman, Mike
    We demonstrate the automatic transfer of an assembly task from human to robot. This work extends efforts showing the utility of linguistic models in verifiable robot control policies by now performing real visual analysis of human demonstrations to automatically extract a policy for the task. This method tokenizes each human demonstration into a sequence of object connection symbols, then transforms the set of sequences from all demonstrations into an automaton, which represents the task-language for assembling a desired object. Finally, we combine this assembly automaton with a kinematic model of a robot arm to reproduce the demonstrated task.
  • Item
    Calibration-Free Rolling Shutter Removal
    (Georgia Institute of Technology, 2012-04) Grundmann, Matthias ; Kwatra, Vivek ; Castro, Daniel ; Essa, Irfan
    We present a novel algorithm for efficient removal of rolling shutter distortions in uncalibrated streaming videos. Our proposed method is calibration-free: it needs no knowledge of the camera used, nor does it require calibration using specially recorded calibration sequences. Our algorithm can perform rolling shutter removal under varying focal lengths, as in videos from CMOS cameras equipped with an optical zoom. We evaluate our approach across a broad range of cameras and video sequences, demonstrating robustness, scalability, and repeatability. We also conducted a user study, which demonstrates preference for the output of our algorithm over other state-of-the-art methods. Our algorithm is computationally efficient, easy to parallelize, and robust to challenging artifacts introduced by various cameras with differing technologies.
  • Item
    Beyond Sentiment: The Manifold of Human Emotions
    (Georgia Institute of Technology, 2012-02) Kim, Seungyeon ; Li, Fuxin ; Lebanon, Guy ; Essa, Irfan
    Sentiment analysis predicts the presence of positive or negative emotions in a text document. In this paper we consider higher dimensional extensions of the sentiment concept, which represent a richer set of human emotions. Our approach goes beyond previous work in that our model contains a continuous manifold rather than a finite set of human emotions. We investigate the resulting model, compare it to psychological observations, and explore its predictive capabilities. Besides obtaining significant improvements over a baseline without a manifold, we are also able to visualize different notions of positive sentiment in different domains.
  • Item
    Spectral Partitioning for Structure from Motion
    (Georgia Institute of Technology, 2003-10) Steedly, Drew ; Essa, Irfan ; Dellaert, Frank
    We propose a spectral partitioning approach for large-scale optimization problems, specifically structure from motion. In structure from motion, partitioning methods reduce the problem into smaller and better conditioned subproblems which can be efficiently optimized. Our partitioning method uses only the Hessian of the reprojection error and its eigenvectors. We show that partitioned systems that preserve the eigenvectors corresponding to small eigenvalues result in lower residual error when optimized. We create partitions by clustering the entries of the eigenvectors of the Hessian corresponding to small eigenvalues. This is a more general technique than relying on domain knowledge and heuristics such as bottom-up structure from motion approaches. Simultaneously, it takes advantage of more information than generic matrix partitioning algorithms.
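
The last abstract above describes partitioning by clustering the entries of the Hessian's eigenvectors that correspond to small eigenvalues. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy 6x6 matrix merely stands in for a reprojection-error Hessian (two tightly coupled groups of variables joined by one weak link), and the helper names `toy_hessian`, `jacobi_eigh`, `two_means`, and `spectral_partition` are hypothetical. The eigen-decomposition uses plain cyclic Jacobi rotations so the example needs only the Python standard library.

```python
import math


def toy_hessian():
    """Hypothetical stand-in for a Hessian: two strongly coupled groups
    of variables ({0,1,2} and {3,4,5}) joined by one weak coupling."""
    edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
             (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
             (2, 3, 0.01)]
    n = 6
    h = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        h[i][i] += w
        h[j][j] += w
        h[i][j] -= w
        h[j][i] -= w
    for i in range(n):
        h[i][i] += 0.1  # small diagonal shift keeps the matrix positive definite
    return h


def jacobi_eigh(matrix, sweeps=100, tol=1e-18):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors) in ascending eigenvalue order;
    eigenvectors[k] is the vector belonging to eigenvalues[k]."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle in [-pi/4, pi/4] that zeroes a[p][q].
                d = a[q][q] - a[p][p]
                num = 2 * a[p][q] if d >= 0 else -2 * a[p][q]
                theta = 0.5 * math.atan2(num, abs(d))
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):  # column rotation: M <- M J
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # row rotation: A <- J^T A
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(n), key=lambda i: a[i][i])
    values = [a[i][i] for i in order]
    vectors = [[v[k][i] for k in range(n)] for i in order]
    return values, vectors


def two_means(points, iters=20):
    """Deterministic 2-means: seed with the first point and the point
    farthest from it, then run Lloyd iterations."""
    def d2(x, y):
        return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

    centers = [points[0], max(points, key=lambda p: d2(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min((0, 1), key=lambda c: d2(p, centers[c])) for p in points]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def spectral_partition(h, num_vectors=2):
    """Cluster variables by their entries in the eigenvectors that
    belong to the smallest eigenvalues of h."""
    _, vectors = jacobi_eigh(h)
    small = vectors[:num_vectors]
    embedding = [[vec[i] for vec in small] for i in range(len(h))]
    return two_means(embedding)


labels = spectral_partition(toy_hessian())
print(labels)
```

On this toy matrix the two groups of variables receive different labels, because the eigenvector of the second-smallest eigenvalue changes sign across the weak link. A real structure-from-motion Hessian is large and sparse, so a sparse eigensolver would replace the dense Jacobi routine used here.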