Organizational Unit:

Socially Intelligent Machines Lab

Permanent Link

https://hdl.handle.net/1853/70784

Parent Organization

Organizational Unit

College of Computing

ArchiveSpace Name Record

https://finding-aids.library.gatech.edu/agents/corporate_entities/1114

Publication Search Results

Now showing 1 - 10 of 23

Multimodal Real-Time Contingency Detection for HRI

(Georgia Institute of Technology, 2014-09) Chu, Vivian ; Bullard, Kalesha ; Thomaz, Andrea L.

Our goal is to develop robots that naturally engage people in social exchanges. In this paper, we focus on the problem of recognizing that a person is responsive to a robot’s request for interaction. Inspired by human cognition, our approach is to treat this as a contingency detection problem. We present a simple discriminative Support Vector Machine (SVM) classifier to compare against previous generative methods introduced in prior work by Lee et al. [1]. We evaluate these methods in two ways. First, we train three separate SVMs with multi-modal sensory input on a set of batch data collected in a controlled setting, obtaining an average F₁ score of 0.82. Second, in an open-ended experiment setting with seven participants, we show that our model is able to perform contingency detection in real-time and generalize to new people with a best F₁ score of 0.72.
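For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F₁ score is computed from classification counts. The function name and the example counts are hypothetical and are not taken from the paper; the sketch only illustrates the standard definition (harmonic mean of precision and recall).

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With no predicted or no actual positives, precision or recall is
    # undefined; this sketch returns 0.0 in that case by convention.
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen for illustration only (not data from the paper).
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

Because F₁ combines precision and recall, a detector scores well only if it both avoids false alarms and avoids missing true contingent responses.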
Policy Shaping: Integrating Human Feedback with Reinforcement Learning

(Georgia Institute of Technology, 2013) Griffith, Shane ; Subramanian, Kaushik ; Scholz, Jonathan ; Isbell, Charles L. ; Thomaz, Andrea L.

A long-term goal of Interactive Reinforcement Learning is to incorporate non-expert human feedback to solve complex tasks. Some state-of-the-art methods have approached this problem by mapping human information to rewards and values and iterating over them to compute better control policies. In this paper we argue for an alternate, more effective characterization of human feedback: Policy Shaping. We introduce Advise, a Bayesian approach that attempts to maximize the information gained from human feedback by utilizing it as direct policy labels. We compare Advise to state-of-the-art approaches and show that it can outperform them and is robust to infrequent and inconsistent human feedback.
Object Focused Q-Learning for Autonomous Agents

(Georgia Institute of Technology, 2013) Cobo, Luis C. ; Isbell, Charles L. ; Thomaz, Andrea L.

We present Object Focused Q-learning (OF-Q), a novel reinforcement learning algorithm that can offer exponential speed-ups over classic Q-learning on domains composed of independent objects. An OF-Q agent treats the state space as a collection of objects organized into different object classes. Our key contribution is a control policy that uses non-optimal Q-functions to estimate the risk of ignoring parts of the state space. We compare our algorithm to traditional Q-learning and previous arbitration algorithms in two domains, including a version of Space Invaders.
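As background for the comparison above, the following is a minimal sketch of the classic tabular Q-learning update that serves as the baseline. It is not OF-Q itself, whose object-class decomposition and risk-based arbitration are not detailed in this abstract. The function name, state and action labels, and the values of the learning rate `alpha` and discount factor `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one classic Q-learning update to the table q and return the new value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; every Q-value starts at 0.0.
q_table = defaultdict(float)
actions = ["left", "right"]
updated_value = q_learning_update(q_table, "s0", "right", 1.0, "s1", actions)
```

A single table indexed by the full state grows with the product of every object's possible configurations, which is the cost that an object-based decomposition aims to avoid.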
Automatic Task Decomposition and State Abstraction from Demonstration

(Georgia Institute of Technology, 2012-06) Cobo, Luis C. ; Isbell, Charles L. ; Thomaz, Andrea L.

Both Learning from Demonstration (LfD) and Reinforcement Learning (RL) are popular approaches for building decision-making agents. LfD applies supervised learning to a set of human demonstrations to infer and imitate the human policy, while RL uses only a reward signal and exploration to find an optimal policy. For complex tasks both of these techniques may be ineffective. LfD may require many more demonstrations than it is feasible to obtain, and RL can take an inadmissible amount of time to converge. We present Automatic Decomposition and Abstraction from demonstration (ADA), an algorithm that uses mutual information measures over a set of human demonstrations to decompose a sequential decision process into several subtasks, finding state abstractions for each one of these subtasks. ADA then projects the human demonstrations into the abstracted state space to build a policy. This policy can later be improved using RL algorithms to surpass the performance of the human teacher. We find empirically that ADA can find satisficing policies for problems that are too complex to be solved with traditional LfD and RL algorithms. In particular, we show that we can use mutual information across state features to leverage human demonstrations to reduce the effects of the curse of dimensionality by finding subtasks and abstractions in sequential decision processes.
Enhancing Interaction Through Exaggerated Motion Synthesis

(Georgia Institute of Technology, 2012-03) Gielniak, Michael J. ; Thomaz, Andrea L.

Other than eye gaze and referential gestures (e.g. pointing), the relationship between robot motion and observer attention is not well understood. We explore this relationship to achieve social goals, such as influencing human partner behavior or directing attention. We present an algorithm that creates exaggerated variants of a motion in real-time. Through two experiments we confirm that exaggerated motion is perceptibly different than the input motion, provided that the motion is sufficiently exaggerated. We found that different levels of exaggeration correlate to human expectations of robot-like, human-like, and cartoon-like motion. We present empirical evidence that use of exaggerated motion in experiments enhances the interaction through the benefits of increased engagement and perceived entertainment value. Finally, we provide statistical evidence that exaggerated motion causes a human partner to have better retention of interaction details and predictable gaze direction.
Trajectories and Keyframes for Kinesthetic Teaching: A Human-Robot Interaction Perspective

(Georgia Institute of Technology, 2012-03) Akgun, Baris ; Cakmak, Maya ; Yoo, Jae Wook ; Thomaz, Andrea L.

Kinesthetic teaching is an approach to providing demonstrations to a robot in Learning from Demonstration whereby a human physically guides a robot to perform a skill. In the common usage of kinesthetic teaching, the robot's trajectory during a demonstration is recorded from start to end. In this paper we consider an alternative, keyframe demonstrations, in which the human provides a sparse set of consecutive keyframes that can be connected to perform the skill. We present a user study (n = 34) comparing the two approaches and highlighting their complementary nature. The study also tests and shows the potential benefits of iterative and adaptive versions of keyframe demonstrations. Finally, we introduce a hybrid method that combines trajectories and keyframes in a single demonstration.
Towards Grounding Concepts for Transfer in Goal Learning from Demonstration

(Georgia Institute of Technology, 2011-08) Chao, Crystal ; Cakmak, Maya ; Thomaz, Andrea L.

We aim to build robots that frame the task learning problem as goal inference so that they are natural to teach and meet people's expectations for a learning partner. The focus of this work is the scenario of a social robot that learns task goals from human demonstrations without prior knowledge of high-level concepts. In the system that we present, these discrete concepts are grounded from low-level continuous sensor data through unsupervised learning, and task goals are subsequently learned on them using Bayesian inference. The grounded concepts are derived from the structure of the Learning from Demonstration (LfD) problem and exhibit degrees of prototypicality. These concepts can be used to transfer knowledge to future tasks, resulting in faster learning of those tasks. Using sensor data taken during demonstrations to our robot from five human teachers, we show the expressivity of using grounded concepts when learning new tasks from demonstration. We then show how the learning curve improves when transferring the knowledge of grounded concepts to future tasks.
Task-Aware Variations in Robot Motion

(Georgia Institute of Technology, 2011-05) Gielniak, Michael J. ; Liu, C. Karen ; Thomaz, Andrea L.

Social robots can benefit from motion variance because non-repetitive gestures will be more natural and intuitive for human partners. We introduce a new approach for synthesizing variance, both with and without constraints, using a stochastic process. Based on optimal control theory and operational space control, our method can generate an infinite number of variations in real-time that resemble the kinematic and dynamic characteristics from the single input motion sequence. We also introduce a stochastic method to generate smooth but nondeterministic transitions between arbitrary motion variants. Furthermore, we quantitatively evaluate task-aware variance against random white torque noise, operational space control, style-based inverse kinematics, and retargeted human motion to prove that task-aware variance generates human-like motion. Finally, we demonstrate the ability of task-aware variance to maintain velocity and time-dependent features that exist in the input motion.
Simon plays Simon says: The timing of turn-taking in an imitation game

(Georgia Institute of Technology, 2011) Chao, Crystal ; Lee, Jinhan ; Begum, Momotaz ; Thomaz, Andrea L.

Turn-taking is fundamental to the way humans engage in information exchange, but robots currently lack the turn-taking skills required for natural communication. In order to bring effective turn-taking to robots, we must first understand the underlying processes in the context of what is possible to implement. We describe a data collection experiment with an interaction format inspired by “Simon says,” a turn-taking imitation game that engages the channels of gaze, speech, and motion. We analyze data from 23 human subjects interacting with a humanoid social robot and propose the principle of minimum necessary information (MNI) as a factor in determining the timing of the human response. We also describe the other observed phenomena of channel exclusion, efficiency, and adaptation. We discuss the implications of these principles and propose some ways to incorporate our findings into a computational model of turn-taking.
Vision-based Contingency Detection

(Georgia Institute of Technology, 2011) Lee, Jinhan ; Kiser, Jeffrey F. ; Bobick, Aaron F. ; Thomaz, Andrea L.

We present a novel method for the visual detection of a contingent response by a human to the stimulus of a robot action. Contingency is defined as a change in an agent's behavior within a specific time window in direct response to a signal from another agent; detection of such responses is essential to assess the willingness and interest of a human in interacting with the robot. Using motion-based features to describe the possible contingent action, our approach assesses the visual self-similarity of video subsequences captured before the robot exhibits its signaling behavior and statistically models the typical graph-partitioning cost of separating an arbitrary subsequence of frames from the others. After the behavioral signal, the video is similarly analyzed and the cost of separating the after-signal frames from the before-signal sequences is computed; a lower than typical cost indicates likely contingent reaction. We present a preliminary study in which data were captured and analyzed for algorithmic performance.