Organizational Unit: Socially Intelligent Machines Lab

Publication Search Results

Multi-Cue Contingency Detection

2012-04, Lee, Jinhan; Chao, Crystal; Bobick, Aaron F.; Thomaz, Andrea L.

The ability to detect a human's contingent response is an essential skill for a social robot attempting to engage new interaction partners or maintain ongoing turn-taking interactions. Prior work on contingency detection focuses on single cues from isolated channels, such as changes in gaze, motion, or sound. We propose a framework that integrates multiple cues for detecting contingency from multimodal sensor data in human-robot interaction scenarios. We describe three levels of integration and discuss our method for performing sensor fusion at each of these levels. We perform a Wizard-of-Oz data collection experiment in a turn-taking scenario in which our humanoid robot plays the turn-taking imitation game “Simon says” with human partners. Using this data set, which includes motion and body pose cues from depth and color images and audio cues from a microphone, we evaluate our contingency detection module with the proposed integration mechanisms and show gains in accuracy of our multi-cue approach over single-cue contingency detection. We show the importance of selecting the appropriate level of cue integration as well as the implications of varying the referent event parameter.
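The abstract describes the cue-integration approach only at a high level. As a rough, hypothetical illustration of the simplest such scheme, a decision-level fusion of per-cue contingency scores might look like the Python sketch below; the cue names, weights, and threshold are assumptions for illustration, not values from the paper, and the paper's three integration levels are not detailed in the abstract.

```python
import numpy as np

# Hypothetical decision-level fusion of per-cue contingency scores.
# Cue names, weights, and the threshold are illustrative assumptions.
def fuse_cues(cue_scores, weights=None, threshold=0.5):
    """Weighted combination of per-cue contingency scores in [0, 1]."""
    scores = np.array(list(cue_scores.values()), dtype=float)
    if weights is None:
        # Default to uniform weights across cues.
        weights = np.full(len(scores), 1.0 / len(scores))
    fused = float(weights @ scores)
    return fused >= threshold, fused

# Example with made-up scores for motion, body-pose, and audio cues.
is_contingent, confidence = fuse_cues(
    {"motion": 0.8, "body_pose": 0.6, "audio": 0.3}
)
print(f"contingent={is_contingent}, confidence={confidence:.2f}")
```

Feature-level fusion (concatenating cue features before classification) would be an alternative integration point; which level works best is exactly the question the paper evaluates.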

Vision-based Contingency Detection

2011, Lee, Jinhan; Kiser, Jeffrey F.; Bobick, Aaron F.; Thomaz, Andrea L.

We present a novel method for the visual detection of a contingent response by a human to the stimulus of a robot action. Contingency is defined as a change in an agent's behavior within a specific time window in direct response to a signal from another agent; detection of such responses is essential to assess the willingness and interest of a human in interacting with the robot. Using motion-based features to describe the possible contingent action, our approach assesses the visual self-similarity of video subsequences captured before the robot exhibits its signaling behavior and statistically models the typical graph-partitioning cost of separating an arbitrary subsequence of frames from the others. After the behavioral signal, the video is similarly analyzed and the cost of separating the after-signal frames from the before-signal sequences is computed; a lower than typical cost indicates a likely contingent reaction. We present a preliminary study in which data were captured and analyzed for algorithmic performance.
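The abstract sketches the method in words only. The following Python sketch illustrates the general idea under stated assumptions: Gaussian affinities between per-frame motion features, a simplified normalized cut cost, and a z-score test against the before-signal baseline. None of these choices are the paper's exact formulation.

```python
import numpy as np

def affinity_matrix(features, sigma=1.0):
    """Gaussian affinity between per-frame motion feature vectors."""
    d2 = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cut_cost(A, idx):
    """Cross-partition affinity of separating frames `idx` from the rest,
    normalized by window size (a simplified stand-in for the paper's
    graph-partitioning cost)."""
    mask = np.zeros(len(A), dtype=bool)
    mask[idx] = True
    return A[mask][:, ~mask].sum() / mask.sum()

def detect_contingency(before, after, n_baseline=50, z_thresh=-2.0, seed=0):
    """Flag a contingent reaction when the after-signal frames are cheaper
    to separate than typical before-signal windows of the same length."""
    rng = np.random.default_rng(seed)
    w, n_before = len(after), len(before)
    A = affinity_matrix(np.vstack([before, after]))
    # Baseline distribution: cost of cutting out arbitrary before windows.
    base = [
        cut_cost(A[:n_before, :n_before], np.arange(s, s + w))
        for s in rng.integers(0, n_before - w + 1, size=n_baseline)
    ]
    mu, sd = np.mean(base), np.std(base) + 1e-9
    z = (cut_cost(A, np.arange(n_before, n_before + w)) - mu) / sd
    return z < z_thresh, z

# Example with synthetic per-frame motion features (random stand-ins).
rng = np.random.default_rng(1)
before = rng.normal(size=(60, 8))           # before-signal frames
after = rng.normal(loc=3.0, size=(10, 8))   # changed, "contingent" behavior
print(detect_contingency(before, after))
```

A lower-than-typical separation cost means the after-signal frames look unlike the before-signal video, which is the signature of a behavior change, i.e., a likely contingent response.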