Organizational Unit:
Socially Intelligent Machines Lab


Publication Search Results

Now showing 1 - 10 of 32
  • Item
    Multimodal Real-Time Contingency Detection for HRI
    (Georgia Institute of Technology, 2014-09) Chu, Vivian ; Bullard, Kalesha ; Thomaz, Andrea L.
Our goal is to develop robots that naturally engage people in social exchanges. In this paper, we focus on the problem of recognizing that a person is responsive to a robot’s request for interaction. Inspired by human cognition, our approach is to treat this as a contingency detection problem. We present a simple discriminative Support Vector Machine (SVM) classifier to compare against the generative methods introduced in prior work by Lee et al. [1]. We evaluate these methods in two ways. First, we train three separate SVMs with multimodal sensory input on a set of batch data collected in a controlled setting, obtaining an average F₁ score of 0.82. Second, in an open-ended experiment setting with seven participants, we show that our model is able to perform contingency detection in real time and generalize to new people, with a best F₁ score of 0.72.
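The classification setup described above can be sketched in miniature. The snippet below trains a linear SVM (a plain hinge-loss subgradient loop standing in for a full SVM solver) on synthetic three-cue feature windows and reports an F₁ score; the features, data, and hyperparameters are illustrative stand-ins, not the paper's multimodal sensor data.

```python
# Sketch: contingency detection as binary SVM classification on synthetic data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3-D features per window: [gaze change, motion energy, audio energy]
n = 200
pos = rng.normal([0.8, 0.7, 0.6], 0.25, size=(n, 3))   # "contingent" windows
neg = rng.normal([0.2, 0.2, 0.1], 0.25, size=(n, 3))   # "non-contingent" windows
X = np.vstack([pos, neg])
y = np.array([1] * n + [-1] * n)

def train_svm(X, y, lam=0.01, lr=0.05, epochs=500):
    """Linear SVM via hinge-loss subgradient descent (toy-scale solver)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        mask = y * (X @ w + b) < 1          # points violating the margin
        if not mask.any():
            break
        w -= lr * (lam * w - (y[mask, None] * X[mask]).mean(axis=0))
        b += lr * y[mask].mean()
    return w, b

def f1(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == -1))
    fn = np.sum((y_pred == -1) & (y_true == 1))
    return 2 * tp / (2 * tp + fp + fn)

w, b = train_svm(X, y)
score = f1(y, np.sign(X @ w + b))
print(round(score, 2))
```

On well-separated synthetic clusters like these, the F₁ score comes out high; the paper's 0.82 and 0.72 reflect the much harder real sensor data.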
  • Item
    Generating Human-like Motion for Robots
    (Georgia Institute of Technology, 2013-07) Gielniak, Michael J. ; Liu, C. Karen ; Thomaz, Andrea L.
Action prediction and fluidity are key elements of human-robot teamwork. If a robot’s actions are hard to understand, it can impede fluid HRI. Our goal is to improve the clarity of robot motion by making it more humanlike. We present an algorithm that autonomously synthesizes human-like variants of an input motion. Our approach is a three-stage pipeline. First, we optimize motion with respect to spatio-temporal correspondence (STC), which emulates the coordinated effects of human joints that are connected by muscles. We present three experiments that validate that our STC optimization approach increases human-likeness and recognition accuracy for human social partners. Next in the pipeline, we avoid repetitive motion by adding variance, exploiting redundant and underutilized spaces of the input motion to create multiple motions from a single input. In two experiments we validate that our variance approach maintains the human-likeness from the previous step, and that a social partner can still accurately recognize the motion’s intent. As a final step, we maintain the robot’s ability to interact with its world by enabling it to satisfy constraints. We provide experimental analysis of the effects of constraints on the synthesized human-like robot motion variants.
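As a loose illustration of the "add variance" idea (not the paper's STC or null-space machinery), one can perturb an input joint trajectory more strongly in the joints it underutilizes, so variants differ without disturbing the motion's intent. The noise model and scales below are assumptions.

```python
# Illustrative sketch: more perturbation in low-activity (underused) joints.
import numpy as np

rng = np.random.default_rng(1)

def make_variant(motion, scale=0.1):
    """Return a variant of `motion` (frames x joints) with joint-wise noise."""
    usage = motion.std(axis=0)          # per-joint activity in the input motion
    slack = scale / (0.05 + usage)      # underused joints get more room to vary
    return motion + rng.normal(size=motion.shape) * slack

# Toy motion: one busy joint (a sine sweep) and one idle joint
motion = np.column_stack([np.sin(np.linspace(0, np.pi, 50)), np.zeros(50)])
variant = make_variant(motion)
print(np.abs(variant - motion).mean(axis=0))   # the idle joint varies more
```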
  • Item
    Policy Shaping: Integrating Human Feedback with Reinforcement Learning
    (Georgia Institute of Technology, 2013) Griffith, Shane ; Subramanian, Kaushik ; Scholz, Jonathan ; Isbell, Charles L. ; Thomaz, Andrea L.
A long-term goal of Interactive Reinforcement Learning is to incorporate non-expert human feedback to solve complex tasks. Some state-of-the-art methods have approached this problem by mapping human information to rewards and values and iterating over them to compute better control policies. In this paper we argue for an alternative, more effective characterization of human feedback: Policy Shaping. We introduce Advise, a Bayesian approach that attempts to maximize the information gained from human feedback by utilizing it as direct policy labels. We compare Advise to state-of-the-art approaches and show that it can outperform them and is robust to infrequent and inconsistent human feedback.
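One reading of the Bayesian estimate behind this approach: treat each human label as a noisy indicator of optimality with consistency C, so the probability that an action is optimal depends only on the net count of positive labels. The formula and values below are an illustrative reconstruction, not lifted from the paper.

```python
# Sketch of a feedback posterior in the spirit of Advise: with consistency C,
# P(action optimal | delta) where delta = (#"right" labels - #"wrong" labels).
def feedback_prob(delta: int, C: float = 0.8) -> float:
    good = C ** delta
    bad = (1 - C) ** delta
    return good / (good + bad)

p_up = feedback_prob(3)      # three more "right" than "wrong" labels
p_down = feedback_prob(-3)   # three more "wrong" than "right" labels
print(round(p_up, 3), round(p_down, 3))  # 0.985 0.015
```

Note how a modest net label count already yields a confident policy label, which is the sense in which feedback is used directly rather than folded into a reward signal.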
  • Item
    Object Focused Q-Learning for Autonomous Agents
    (Georgia Institute of Technology, 2013) Cobo, Luis C. ; Isbell, Charles L. ; Thomaz, Andrea L.
    We present Object Focused Q-learning (OF-Q), a novel reinforcement learning algorithm that can offer exponential speed-ups over classic Q-learning on domains composed of independent objects. An OF-Q agent treats the state space as a collection of objects organized into different object classes. Our key contribution is a control policy that uses non-optimal Q-functions to estimate the risk of ignoring parts of the state space. We compare our algorithm to traditional Q-learning and previous arbitration algorithms in two domains, including a version of Space Invaders.
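The action-selection idea can be illustrated with a toy example: each object class keeps its own Q-function over that object's local state, and the agent acts on the sum across all visible objects. (The risk estimation from non-optimal Q-functions, the paper's key contribution, is omitted here; all names and values are made up.)

```python
# Toy sketch of object-focused action selection.
ACTIONS = ["left", "stay", "right"]

# Hypothetical learned Q-tables: Q[object_class][local_state][action]
Q = {
    "bullet": {"near": {"left": -5.0, "stay": -9.0, "right": -1.0}},
    "target": {"ahead": {"left": 0.0, "stay": 2.0, "right": 1.0}},
}

# The current world as a list of (object_class, local_state) pairs
objects = [("bullet", "near"), ("target", "ahead")]

def select_action(objects):
    """Act on the summed per-object Q-values."""
    totals = {a: sum(Q[cls][s][a] for cls, s in objects) for a in ACTIONS}
    return max(totals, key=totals.get)

print(select_action(objects))  # the bullet's danger dominates: "right"
```

The speed-up comes from each Q-table depending only on one object's small local state rather than the joint state of every object on screen.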
  • Item
    Automatic Task Decomposition and State Abstraction from Demonstration
    (Georgia Institute of Technology, 2012-06) Cobo, Luis C. ; Isbell, Charles L. ; Thomaz, Andrea L.
Both Learning from Demonstration (LfD) and Reinforcement Learning (RL) are popular approaches for building decision-making agents. LfD applies supervised learning to a set of human demonstrations to infer and imitate the human policy, while RL uses only a reward signal and exploration to find an optimal policy. For complex tasks both of these techniques may be ineffective. LfD may require many more demonstrations than it is feasible to obtain, and RL can take an impractical amount of time to converge. We present Automatic Decomposition and Abstraction from demonstration (ADA), an algorithm that uses mutual information measures over a set of human demonstrations to decompose a sequential decision process into several sub-tasks, finding state abstractions for each one of these sub-tasks. ADA then projects the human demonstrations into the abstracted state space to build a policy. This policy can later be improved using RL algorithms to surpass the performance of the human teacher. We find empirically that ADA can find satisficing policies for problems that are too complex to be solved with traditional LfD and RL algorithms. In particular, we show that we can use mutual information across state features to leverage human demonstrations to reduce the effects of the curse of dimensionality by finding subtasks and abstractions in sequential decision processes.
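The mutual-information criterion can be sketched directly: features whose values carry information about the demonstrated action belong in a sub-task's abstraction, and uninformative features can be abstracted away. The data below are toy values, not real demonstrations.

```python
# Sketch: mutual information between a state feature and the demonstrated action.
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Feature A perfectly determines the action; feature B is unrelated noise.
actions   = [0, 0, 1, 1, 0, 0, 1, 1]
feature_a = [0, 0, 1, 1, 0, 0, 1, 1]
feature_b = [0, 1, 0, 1, 0, 1, 0, 1]

print(mutual_information(feature_a, actions))  # 1.0 bit: keep in abstraction
print(mutual_information(feature_b, actions))  # 0.0 bits: abstract away
```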
  • Item
    Keyframe-based Learning from Demonstration Method and Evaluation
    (Georgia Institute of Technology, 2012-06) Akgun, Baris ; Cakmak, Maya ; Jiang, Karl ; Thomaz, Andrea L.
We present a framework for learning skills from novel types of demonstrations that have been shown to be desirable from a human-robot interaction perspective. Our approach, Keyframe-based Learning from Demonstration (KLfD), takes demonstrations that consist of keyframes: a sparse set of points in the state space that produces the intended skill when visited in sequence. The conventional type of trajectory demonstrations, or a hybrid of the two, are also handled by KLfD through a conversion to keyframes. Our method produces a skill model that consists of an ordered set of keyframe clusters, which we call Sequential Pose Distributions (SPD). The skill is reproduced by splining between clusters. We present results from two domains: mouse gestures in 2D, and scooping, pouring, and placing skills on a humanoid robot. KLfD has performance similar to existing LfD techniques when applied to conventional trajectory demonstrations. Additionally, we demonstrate that KLfD may be preferable when the demonstration type suits the skill.
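Skill reproduction from keyframe clusters can be sketched on a 2-D toy "mouse gesture": cluster means stand in for the Sequential Pose Distributions, and linear interpolation stands in for the splining step. The demonstrations below are invented for illustration.

```python
# Sketch: reproduce a skill by connecting per-cluster mean keyframes.
import numpy as np

# Three demonstrations of the same gesture, each as 4 keyframes in (x, y)
demos = np.array([
    [[0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [4.0, 0.0]],
    [[0.0, 0.0], [1.2, 1.8], [2.8, 2.2], [4.0, 0.1]],
    [[0.1, 0.0], [0.9, 2.1], [3.1, 1.9], [3.9, 0.0]],
])

cluster_means = demos.mean(axis=0)   # one mean pose per keyframe cluster

def reproduce(means, steps_per_segment=10):
    """Interpolate between successive cluster means (linear in place of splines)."""
    path = []
    for p, q in zip(means[:-1], means[1:]):
        for t in np.linspace(0.0, 1.0, steps_per_segment, endpoint=False):
            path.append((1 - t) * p + t * q)
    path.append(means[-1])
    return np.array(path)

path = reproduce(cluster_means)
print(path.shape)  # (31, 2): 3 segments of 10 points plus the final pose
```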
  • Item
    Multi-Cue Contingency Detection
    (Georgia Institute of Technology, 2012-04) Lee, Jinhan ; Chao, Crystal ; Bobick, Aaron F. ; Thomaz, Andrea L.
The ability to detect a human's contingent response is an essential skill for a social robot attempting to engage new interaction partners or maintain ongoing turn-taking interactions. Prior work on contingency detection focuses on single cues from isolated channels, such as changes in gaze, motion, or sound. We propose a framework that integrates multiple cues for detecting contingency from multimodal sensor data in human-robot interaction scenarios. We describe three levels of integration and discuss our method for performing sensor fusion at each of these levels. We perform a Wizard-of-Oz data collection experiment in a turn-taking scenario in which our humanoid robot plays the turn-taking imitation game “Simon says” with human partners. Using this data set, which includes motion and body pose cues from a depth and color image and audio cues from a microphone, we evaluate our contingency detection module with the proposed integration mechanisms and show gains in accuracy of our multi-cue approach over single-cue contingency detection. We show the importance of selecting the appropriate level of cue integration as well as the implications of varying the referent event parameter.
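A minimal sketch of decision-level fusion, one plausible integration level, assuming each cue's detector emits a score in [0, 1]; the weights and threshold below are illustrative, not the paper's learned values.

```python
# Sketch: fuse per-cue contingency scores with a thresholded weighted sum.
def fuse(cue_scores, weights, threshold=0.5):
    """Return True if the weighted mean of cue scores crosses the threshold."""
    total = sum(w * s for w, s in zip(weights, cue_scores))
    return total / sum(weights) >= threshold

weights = [0.5, 0.3, 0.2]              # motion, body pose, audio (assumed)
print(fuse([0.9, 0.8, 0.1], weights))  # strong visual evidence -> True
print(fuse([0.1, 0.2, 0.9], weights))  # audio alone is not enough -> False
```

The point of the multi-cue approach is precisely that a single channel, like the audio cue in the second call, should not decide contingency on its own.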
  • Item
    Enhancing Interaction Through Exaggerated Motion Synthesis
    (Georgia Institute of Technology, 2012-03) Gielniak, Michael J. ; Thomaz, Andrea L.
Other than eye gaze and referential gestures (e.g. pointing), the relationship between robot motion and observer attention is not well understood. We explore this relationship to achieve social goals, such as influencing human partner behavior or directing attention. We present an algorithm that creates exaggerated variants of a motion in real-time. Through two experiments we confirm that exaggerated motion is perceptibly different from the input motion, provided that the motion is sufficiently exaggerated. We found that different levels of exaggeration correlate to human expectations of robot-like, human-like, and cartoon-like motion. We present empirical evidence that the use of exaggerated motion enhances the interaction through increased engagement and perceived entertainment value. Finally, we provide statistical evidence that exaggerated motion causes a human partner to have better retention of interaction details and more predictable gaze direction.
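As a naive illustration of motion exaggeration (not the paper's algorithm), one can amplify each frame's deviation from the motion's mean pose; a gain of 1.0 returns the input, and larger gains exaggerate while preserving the mean pose.

```python
# Sketch: exaggerate a joint-angle trajectory by amplifying pose deviations.
import numpy as np

def exaggerate(trajectory, gain=1.5):
    """Scale each frame's deviation from the mean pose by `gain`."""
    mean_pose = trajectory.mean(axis=0)
    return mean_pose + gain * (trajectory - mean_pose)

motion = np.array([[0.0, 0.1], [0.2, 0.4], [0.1, 0.2]])  # frames x joints
print(exaggerate(motion, gain=2.0))
```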
  • Item
    Trajectories and Keyframes for Kinesthetic Teaching: A Human-Robot Interaction Perspective
    (Georgia Institute of Technology, 2012-03) Akgun, Baris ; Cakmak, Maya ; Yoo, Jae Wook ; Thomaz, Andrea L.
Kinesthetic teaching is an approach to providing demonstrations to a robot in Learning from Demonstration whereby a human physically guides a robot to perform a skill. In the common usage of kinesthetic teaching, the robot's trajectory during a demonstration is recorded from start to end. In this paper we consider an alternative, keyframe demonstrations, in which the human provides a sparse set of consecutive keyframes that can be connected to perform the skill. We present a user study (n = 34) comparing the two approaches and highlighting their complementary nature. The study also tests and shows the potential benefits of iterative and adaptive versions of keyframe demonstrations. Finally, we introduce a hybrid method that combines trajectories and keyframes in a single demonstration.
  • Item
    Timing in Multimodal Turn-Taking Interactions: Control and Analysis Using Timed Petri Nets
    (Georgia Institute of Technology, 2012) Chao, Crystal ; Thomaz, Andrea L.
    Turn-taking interactions with humans are multimodal and reciprocal in nature. In addition, the timing of actions is of great importance, as it influences both social and task strategies. To enable the precise control and analysis of timed discrete events for a robot, we develop a system for multimodal collaboration based on a timed Petri net (TPN) representation. We also argue for action interruptions in reciprocal interaction and describe its implementation within our system. Using the system, our autonomously operating humanoid robot Simon collaborates with humans through both speech and physical action to solve the Towers of Hanoi, during which the human and the robot take turns manipulating objects in a shared physical workspace. We hypothesize that action interruptions have a positive impact on turn-taking and evaluate this in the Towers of Hanoi domain through two experimental methods. One is a between-groups user study with 16 participants. The other is a simulation experiment using 200 simulated users of varying speed, initiative, compliance, and correctness. In these experiments, action interruptions are either present or absent in the system. Our collective results show that action interruptions lead to increased task efficiency through increased user initiative, improved interaction balance, and higher sense of fluency. In arriving at these results, we demonstrate how these evaluation methods can be highly complementary in the analysis of interaction dynamics