Organizational Unit:
Institute for Robotics and Intelligent Machines (IRIM)

Publication Search Results

Now showing 1 - 10 of 37
  • Item
    Multimodal Real-Time Contingency Detection for HRI
    (Georgia Institute of Technology, 2014-09) Chu, Vivian ; Bullard, Kalesha ; Thomaz, Andrea L.
Our goal is to develop robots that naturally engage people in social exchanges. In this paper, we focus on the problem of recognizing that a person is responsive to a robot’s request for interaction. Inspired by human cognition, our approach is to treat this as a contingency detection problem. We present a simple discriminative Support Vector Machine (SVM) classifier to compare against the generative methods introduced in prior work by Lee et al. [1]. We evaluate these methods in two ways. First, by training three separate SVMs with multi-modal sensory input on a set of batch data collected in a controlled setting, where we obtain an average F₁ score of 0.82. Second, in an open-ended experiment setting with seven participants, we show that our model is able to perform contingency detection in real-time and generalize to new people with a best F₁ score of 0.72.
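A minimal sketch of the discriminative setup described above: train an SVM on windows of multi-modal features and report an F₁ score. The synthetic data stands in for the paper's audio/vision features and is not the authors' pipeline.

```python
# Sketch: SVM contingency classifier over multi-modal feature windows, scored by F1.
# Synthetic features stand in for the paper's sensory input; not the authors' pipeline.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# One row per time window of features; label 1 means the person responded
# contingently to the robot's request for interaction.
X = rng.normal(size=(300, 24))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))
```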
  • Item
    Feasibility of Identifying Eating Moments from First-Person Images Leveraging Human Computation
    (Georgia Institute of Technology, 2013-11) Thomaz, Edison ; Parnami, Aman ; Essa, Irfan ; Abowd, Gregory D.
    There is widespread agreement in the medical research community that more effective mechanisms for dietary assessment and food journaling are needed to fight back against obesity and other nutrition-related diseases. However, it is presently not possible to automatically capture and objectively assess an individual’s eating behavior. Currently used dietary assessment and journaling approaches have several limitations; they pose a significant burden on individuals and are often not detailed or accurate enough. In this paper, we describe an approach where we leverage human computation to identify eating moments in first-person point-of-view images taken with wearable cameras. Recognizing eating moments is a key first step both in terms of automating dietary assessment and building systems that help individuals reflect on their diet. In a feasibility study with 5 participants over 3 days, where 17,575 images were collected in total, our method was able to recognize eating moments with 89.68% accuracy.
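One way to read "leveraging human computation" is to aggregate several workers' judgments per image and score the result; the voting rule and data layout below are illustrative assumptions, not the study's exact protocol.

```python
# Sketch: aggregate crowd labels per first-person image by majority vote and
# compute accuracy against ground truth. Illustrative only.
from collections import Counter

# image_id -> list of worker labels
crowd_labels = {
    "img_001": ["eating", "eating", "not_eating"],
    "img_002": ["not_eating", "not_eating", "not_eating"],
}
ground_truth = {"img_001": "eating", "img_002": "not_eating"}

def majority_vote(labels):
    return Counter(labels).most_common(1)[0][0]

predictions = {img: majority_vote(votes) for img, votes in crowd_labels.items()}
accuracy = sum(predictions[i] == ground_truth[i] for i in ground_truth) / len(ground_truth)
print(f"accuracy: {accuracy:.2%}")
```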
  • Item
    Learning Stable Pushing Locations
    (Georgia Institute of Technology, 2013-08) Hermans, Tucker ; Li, Fuxin ; Rehg, James M. ; Bobick, Aaron F.
We present a method by which a robot learns to predict effective push locations as a function of object shape. The robot performs push experiments at many contact locations on multiple objects and records local and global shape features at each point of contact. The robot observes the outcome trajectories of the manipulations and computes a novel push-stability score for each trial. The robot then learns a regression function in order to predict push effectiveness as a function of object shape. This mapping allows the robot to select effective push locations for subsequent objects whether they are previously manipulated instances, new instances from previously encountered object classes, or entirely novel objects. In the totally novel object case, the local shape property coupled with the overall distribution of the object allows for the discovery of effective push locations. These results are demonstrated on a mobile manipulator robot pushing a variety of household objects on a tabletop surface.
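A sketch of the regression step under stated assumptions: the feature content and the regressor choice are placeholders, not the paper's setup; only the shape-features-to-stability-score mapping and the ranking of candidate contacts mirror the description above.

```python
# Sketch: regress a push-stability score from shape features at a contact point,
# then rank candidate contact locations on a new object. Features and regressor
# are illustrative assumptions.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Each row: local + global shape descriptor at one contact point.
X_train = rng.normal(size=(200, 16))
y_train = rng.uniform(0.0, 1.0, size=200)   # observed push-stability scores

reg = SVR(kernel="rbf").fit(X_train, y_train)

# Score candidate contact points on a novel object and push at the best one.
candidates = rng.normal(size=(30, 16))
best = int(np.argmax(reg.predict(candidates)))
print("push at candidate contact", best)
```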
  • Item
    Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition
    (Georgia Institute of Technology, 2013-06) Bettadapura, Vinay ; Schindler, Grant ; Plötz, Thomas ; Essa, Irfan
We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information that is inherent in activity streams. We also propose the use of randomly sampled regular expressions to discover and encode patterns in activities. We demonstrate the effectiveness of our approach in experimental evaluations where we successfully recognize activities and detect anomalies in four complex datasets.
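A sketch of the idea of augmenting a bag-of-words histogram with randomly sampled regular-expression matches over an activity's event-symbol stream; the alphabet and sampling scheme are illustrative, not the paper's.

```python
# Sketch: augment a BoW histogram over a symbol stream with counts of randomly
# sampled regex patterns, capturing some temporal structure. Illustrative only.
import random
import re
from collections import Counter

random.seed(0)
alphabet = "abcd"                      # discretized event symbols
stream = "aabacbdacbadcb" * 3          # one activity as a symbol sequence

def sample_regex(max_len=3):
    # e.g. "a.c": symbol, wildcard gap, symbol
    parts = [random.choice(alphabet + ".") for _ in range(random.randint(2, max_len))]
    return "".join(parts)

patterns = [sample_regex() for _ in range(10)]

bow = Counter(stream)                                   # standard bag of words
augmented = {p: len(re.findall(p, stream)) for p in patterns}

feature_vector = [bow[s] for s in alphabet] + [augmented[p] for p in patterns]
print(feature_vector)
```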
  • Item
    DDF-SAM 2.0: Consistent Distributed Smoothing and Mapping
    (Georgia Institute of Technology, 2013-05) Cunningham, Alexander ; Indelman, Vadim ; Dellaert, Frank
This paper presents a consistent decentralized data fusion approach for robust multi-robot SLAM in dangerous, unknown environments. The DDF-SAM 2.0 approach extends our previous work by combining local and neighborhood information in a single, consistent augmented local map, without the overly conservative approach to avoiding information double-counting in the previous DDF-SAM algorithm. We introduce the anti-factor as a means to subtract information in graphical SLAM systems, and illustrate its use both to replace information in an incremental solver and to cancel out neighborhood information from shared summarized maps. This paper presents and compares three summarization techniques, with two exact approaches and an approximation. We evaluate the proposed system on a synthetic example and show that the augmented local system and the associated summarization technique do not double-count information, while keeping performance tractable.
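The anti-factor can be pictured in Gaussian information form: subtracting the information that was already contributed by a neighbor so it is not counted twice. The sketch below is only that picture, not the DDF-SAM 2.0 factor-graph implementation.

```python
# Sketch: "anti-factor" idea in information form -- subtract a previously added
# contribution so shared neighborhood information is not double-counted.
# An illustration of the concept, not the DDF-SAM 2.0 implementation.
import numpy as np

# Local information from the robot's own measurements.
Lambda_local = np.array([[4.0, 1.0], [1.0, 3.0]])
eta_local = np.array([2.0, 1.0])

# Summarized information received from a neighbor (already folded in once).
Lambda_neighbor = np.array([[1.0, 0.2], [0.2, 0.5]])
eta_neighbor = np.array([0.5, 0.3])

# Augmented map: local + neighborhood information.
Lambda_aug = Lambda_local + Lambda_neighbor
eta_aug = eta_local + eta_neighbor

# Anti-factor: cancel the neighbor's contribution before sharing our own
# summary, so the neighbor does not receive its information back.
Lambda_summary = Lambda_aug - Lambda_neighbor
eta_summary = eta_aug - eta_neighbor

print(np.allclose(Lambda_summary, Lambda_local))   # True
```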
  • Item
    Object Focused Q-Learning for Autonomous Agents
    (Georgia Institute of Technology, 2013) Cobo, Luis C. ; Isbell, Charles L. ; Thomaz, Andrea L.
    We present Object Focused Q-learning (OF-Q), a novel reinforcement learning algorithm that can offer exponential speed-ups over classic Q-learning on domains composed of independent objects. An OF-Q agent treats the state space as a collection of objects organized into different object classes. Our key contribution is a control policy that uses non-optimal Q-functions to estimate the risk of ignoring parts of the state space. We compare our algorithm to traditional Q-learning and previous arbitration algorithms in two domains, including a version of Space Invaders.
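A sketch of the per-object decomposition: one Q-function per object class, with actions scored by summing per-object Q-values. The arbitration rule here is simplified; OF-Q additionally uses non-optimal Q-functions to estimate the risk of ignoring objects.

```python
# Sketch: Q-learning with one Q-table per object class; action selection sums
# per-object Q-values. Simplified arbitration, not the full OF-Q algorithm.
import random
from collections import defaultdict

ACTIONS = ["left", "right", "fire"]
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

# One Q-table per object class, keyed by (object_state, action).
Q = defaultdict(lambda: defaultdict(float))

def select_action(object_states):
    """object_states: list of (object_class, state) pairs visible right now."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    scores = {a: sum(Q[cls][(s, a)] for cls, s in object_states) for a in ACTIONS}
    return max(scores, key=scores.get)

def update(object_states, action, reward, next_object_states):
    # Standard TD update applied independently to each visible object.
    for cls, s in object_states:
        next_best = max(
            (Q[cls][(s2, a)] for c2, s2 in next_object_states if c2 == cls for a in ACTIONS),
            default=0.0,
        )
        td_target = reward + GAMMA * next_best
        Q[cls][(s, action)] += ALPHA * (td_target - Q[cls][(s, action)])
```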
  • Item
    Linguistic Transfer of Human Assembly Tasks to Robots
    (Georgia Institute of Technology, 2012-10) Dantam, Neil ; Essa, Irfan ; Stilman, Mike
We demonstrate the automatic transfer of an assembly task from human to robot. This work extends prior efforts showing the utility of linguistic models in verifiable robot control policies by performing visual analysis of human demonstrations to automatically extract a policy for the task. This method tokenizes each human demonstration into a sequence of object connection symbols, then transforms the set of sequences from all demonstrations into an automaton, which represents the task-language for assembling a desired object. Finally, we combine this assembly automaton with a kinematic model of a robot arm to reproduce the demonstrated task.
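A sketch of the tokenize-then-merge step: each demonstration becomes a sequence of connection symbols, and the sequences are merged into a prefix-tree acceptor, a simple automaton for the task language. Symbol names are illustrative; the paper derives them from visual analysis.

```python
# Sketch: merge tokenized demonstrations into a prefix-tree acceptor (automaton).
# Symbols are illustrative placeholders.
demonstrations = [
    ["connect(A,B)", "connect(B,C)", "connect(C,D)"],
    ["connect(B,C)", "connect(A,B)", "connect(C,D)"],
]

transitions = {}   # (state, symbol) -> next state
accepting = set()
next_state = 1     # state 0 is the start state

for demo in demonstrations:
    state = 0
    for symbol in demo:
        if (state, symbol) not in transitions:
            transitions[(state, symbol)] = next_state
            next_state += 1
        state = transitions[(state, symbol)]
    accepting.add(state)

def accepts(sequence):
    state = 0
    for symbol in sequence:
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in accepting

print(accepts(["connect(A,B)", "connect(B,C)", "connect(C,D)"]))  # True
```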
  • Item
    Calibration-Free Rolling Shutter Removal
    (Georgia Institute of Technology, 2012-04) Grundmann, Matthias ; Kwatra, Vivek ; Castro, Daniel ; Essa, Irfan
We present a novel algorithm for efficient removal of rolling shutter distortions in uncalibrated streaming videos. Our proposed method is calibration free as it does not need any knowledge of the camera used, nor does it require calibration using specially recorded calibration sequences. Our algorithm can perform rolling shutter removal under varying focal lengths, as in videos from CMOS cameras equipped with an optical zoom. We evaluate our approach across a broad range of cameras and video sequences demonstrating robustness, scalability, and repeatability. We also conducted a user study, which demonstrates preference for the output of our algorithm over other state-of-the-art methods. Our algorithm is computationally efficient, easy to parallelize, and robust to challenging artifacts introduced by various cameras with differing technologies.
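For intuition only: rolling shutter arises because each image row is exposed at a slightly later time, so correction warps rows by motion interpolated across the frame. The toy code below assumes constant translational motion between frames; the paper instead fits a calibration-free model from the video itself.

```python
# Toy illustration of the rolling-shutter effect: shift each row by the fraction
# of inter-frame motion accrued at its readout time. Not the paper's method.
import numpy as np

def correct_rolling_shutter(frame, dx_per_frame):
    """Shift each row horizontally in proportion to its readout time
    (row 0 read first, last row read latest)."""
    h, w = frame.shape[:2]
    corrected = np.empty_like(frame)
    for r in range(h):
        shift = int(round(dx_per_frame * r / (h - 1)))
        corrected[r] = np.roll(frame[r], -shift, axis=0)
    return corrected

frame = np.tile(np.arange(64, dtype=np.uint8), (48, 1))   # synthetic frame
fixed = correct_rolling_shutter(frame, dx_per_frame=8.0)
print(fixed.shape)
```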
  • Item
    Enhancing Interaction Through Exaggerated Motion Synthesis
    (Georgia Institute of Technology, 2012-03) Gielniak, Michael J. ; Thomaz, Andrea L.
Other than eye gaze and referential gestures (e.g. pointing), the relationship between robot motion and observer attention is not well understood. We explore this relationship to achieve social goals, such as influencing human partner behavior or directing attention. We present an algorithm that creates exaggerated variants of a motion in real-time. Through two experiments we confirm that exaggerated motion is perceptibly different from the input motion, provided that the motion is sufficiently exaggerated. We found that different levels of exaggeration correlate to human expectations of robot-like, human-like, and cartoon-like motion. We present empirical evidence that use of exaggerated motion in experiments enhances the interaction through the benefits of increased engagement and perceived entertainment value. Finally, we provide statistical evidence that exaggerated motion causes a human partner to have better retention of interaction details and predictable gaze direction.
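One simple way to produce graded exaggeration of a joint-angle trajectory is to scale each frame's deviation from a smoothed version of the motion; this is only an illustration of an exaggeration level, not the paper's algorithm.

```python
# Sketch: exaggerate a joint trajectory by amplifying deviations from a smoothed
# copy of the motion. Illustrative, not the paper's method.
import numpy as np

def exaggerate(trajectory, level=1.5, window=5):
    """trajectory: (T, J) array of joint angles; level > 1 exaggerates."""
    kernel = np.ones(window) / window
    smoothed = np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, trajectory
    )
    return smoothed + level * (trajectory - smoothed)

motion = np.cumsum(np.random.default_rng(0).normal(size=(100, 7)), axis=0)
cartoonish = exaggerate(motion, level=2.0)
print(cartoonish.shape)   # (100, 7)
```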
  • Item
    Trajectories and Keyframes for Kinesthetic Teaching: A Human-Robot Interaction Perspective
    (Georgia Institute of Technology, 2012-03) Akgun, Baris ; Cakmak, Maya ; Yoo, Jae Wook ; Thomaz, Andrea L.
Kinesthetic teaching is an approach to providing demonstrations to a robot in Learning from Demonstration whereby a human physically guides a robot to perform a skill. In the common usage of kinesthetic teaching, the robot's trajectory during a demonstration is recorded from start to end. In this paper we consider an alternative, keyframe demonstrations, in which the human provides a sparse set of consecutive keyframes that can be connected to perform the skill. We present a user study (n = 34) comparing the two approaches and highlighting their complementary nature. The study also tests and shows the potential benefits of iterative and adaptive versions of keyframe demonstrations. Finally, we introduce a hybrid method that combines trajectories and keyframes in a single demonstration.
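A sketch of what "connecting" a sparse keyframe demonstration can look like: interpolate between recorded joint configurations to obtain an executable trajectory. Real systems would use splines or a learned model; this merely contrasts keyframes with a densely recorded trajectory.

```python
# Sketch: connect sparse keyframes with linear interpolation to form a trajectory.
# Illustrative only; not the paper's learning pipeline.
import numpy as np

keyframes = np.array([            # (K, J): K keyframes of J joint angles
    [0.0, 0.2, -0.1],
    [0.5, 0.6,  0.1],
    [1.0, 0.4,  0.3],
])

def connect_keyframes(frames, steps_per_segment=50):
    segments = []
    for a, b in zip(frames[:-1], frames[1:]):
        t = np.linspace(0.0, 1.0, steps_per_segment, endpoint=False)[:, None]
        segments.append(a + t * (b - a))
    segments.append(frames[-1][None, :])
    return np.vstack(segments)

trajectory = connect_keyframes(keyframes)
print(trajectory.shape)   # (101, 3)
```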