Organizational Unit:
School of Music

Publication Search Results

  • Item
    Machine Learning Driven Emotional Musical Prosody for Human-Robot Interaction
    (Georgia Institute of Technology, 2021-11-18) Savery, Richard
    This dissertation presents a method for non-anthropomorphic human-robot interaction using a newly developed concept entitled Emotional Musical Prosody (EMP). EMP consists of short expressive musical phrases capable of conveying emotions, which can be embedded in robots to accompany mechanical gestures. The main objective of EMP is to improve human engagement with, and trust in, robots while avoiding the uncanny valley. We contend that music - one of the most emotionally meaningful human experiences - can serve as an effective medium to support human-robot engagement and trust. EMP allows for the development of personable, emotion-driven agents, capable of giving subtle cues to collaborators while presenting a sense of autonomy. We present four research areas aimed at developing and understanding the potential role of EMP in human-robot interaction. The first research area focuses on collecting and labeling a new EMP dataset from vocalists, and using this dataset to generate prosodic emotional phrases through deep learning methods. Through extensive listening tests, the collected dataset and generated phrases were validated with a high level of accuracy by a large subject pool. The second research area focuses on understanding the effect of EMP in human-robot interaction with industrial and humanoid robots. Here, significant results were found for improved trust, perceived intelligence, and likeability of EMP-enabled robotic arms, but not for humanoid robots. We also found significant results for improved trust in a social robot, as well as perceived intelligence, creativity and likeability in a robotic musician. The third and fourth research areas shift to broader use cases and potential methods to use EMP in HRI. The third research area explores the effect of robotic EMP on different personality types, focusing on extraversion and neuroticism.
For robots, personality traits offer a unique way to implement custom responses, individualized to human collaborators. We discovered that humans prefer robots with emotional responses based on high extraversion and low neuroticism, with some correlation to the human collaborator’s own personality traits. The fourth and final research area focused on scaling up EMP to support interaction between groups of robots and humans. Here, we found that improvements in trust and likeability carried across from single robots to groups of industrial arms. Overall, the thesis suggests EMP is useful for improving trust and likeability for industrial robots, social robots, and robotic musicians, but not for humanoid robots. The thesis bears future implications for HRI designers, showing the extensive potential of careful audio design, and the wide range of outcomes audio can have on HRI.
  • Item
    Regressing dexterous finger flexions using machine learning and multi-channel single element ultrasound transducers
    (Georgia Institute of Technology, 2018-04-27) Hantrakul, Lamtharn
    Human-machine interfaces (HMIs) come in many shapes and sizes. The mouse and keyboard are typical and familiar HMIs. In applications such as virtual reality or music performance, a precise HMI for tracking finger movement is often required. Ultrasound, a safe and non-invasive imaging technique, has shown great promise as an alternative HMI that addresses the shortcomings of vision-based and glove-based sensors. This thesis develops a first-in-class system enabling real-time regression of individual and simultaneous finger flexions using single element ultrasound transducers. A comprehensive dataset of ultrasound signals is collected from a study of 10 users. A series of machine learning experiments using this dataset demonstrates promising results supporting the use of single element transducers as an HMI device.
  • Item
    Towards an embodied musical mind: Generative algorithms for robotic musicians
    (Georgia Institute of Technology, 2017-04-19) Bretan, Peter Mason
    Embodied cognition is a theory stating that the processes and functions comprising the human mind are influenced by a person's physical body. The theory of embodied musical cognition holds that a person's body largely influences his or her musical experiences and actions. This work presents multiple frameworks for computer music generation as it pertains to robotic musicianship such that the musical decisions result from a joint optimization between the robot's physical constraints and musical knowledge. First, a generative framework based on hand-designed higher-level musical concepts and the Viterbi beam search algorithm is described. The system allows for efficient and autonomous exploration of the relationship between music and physicality and the resulting music that is contingent on such a connection. It is evaluated objectively based on its ability to plan a series of sound-actuating robotic movements (path planning) that minimize risk of collision, the number of dropped notes, spurious movements, and energy expenditure. Second, a method for developing higher-level musical concepts (semantics) based on machine learning is presented. Using strategies based on neural networks and deep learning, we show that it is possible to learn perceptually meaningful higher-level representations of music. These learned musical "embeddings" are applied to an autonomous music generation system that utilizes unit selection. The embeddings and generative system are evaluated based on objective ranking tasks and a subjective listening study. Third, the method for learning musical semantics is extended to a robot such that its embodiment becomes integral to the learning process. The resulting embeddings simultaneously encode information describing both important musical features and the robot's physical constraints.
  • Item
    Enhancing stroke generation and expressivity in robotic drummers - A generative physics model approach
    (Georgia Institute of Technology, 2015-04-24) Edakkattil Gopinath, Deepak
    The goal of this master's thesis research is to enhance the stroke generation capabilities and musical expressivity of robotic drummers. The approach adopted is to understand the physics of human fingers-drumstick-drumhead interaction and try to replicate the same behavior in a robotic drumming system with the minimum number of degrees of freedom. The model that is developed is agnostic to the exact specifications of the robotic drummer that will attempt to emulate human-like drum strokes, and therefore can be used in any robotic drummer that uses actuators with complete control over the motor position angle. An initial approach based on exploiting the instability of a PID control system to generate multiple bounces, and the limitations of this approach, are also discussed in depth. In order to assess the success of the model and its implementation in the robotic platform, a subjective evaluation was conducted. The evaluation results showed that the observed data were statistically equivalent to the subjects resorting to a blind guess in order to distinguish between a human playing a multiple bounce stroke and a robot playing a similar kind of stroke.
  • Item
    Musical swarm robot simulation strategies
    (Georgia Institute of Technology, 2011-11-16) Albin, Aaron Thomas
    Swarm robotics for music is a relatively new way to explore algorithmic composition as well as new modes of human-robot interaction. This work outlines a strategy for making music with a robotic swarm constrained by acoustic sound, rhythmic music using sequencers, motion causing changes in the music, and finally human and swarm interaction. Two novel simulation programs are created in this thesis. The first is a multi-agent simulation designed to explore suitable parameters for motion-to-music mappings as well as parameters for real-time interaction. The second is a boid-based robotic swarm simulation that adheres to the constraints established, using parameters derived from the multi-agent simulation: orientation, number of neighbors, and speed. In addition, five interaction modes are created that vary along an axis of direct and indirect forms of human control over the swarm motion. The mappings and interaction modes of the swarm robot simulation are evaluated in a user study involving music technology students. The purpose of the study is to determine the legibility of the motion-to-music mappings and evaluate user preferences for the mappings and modes of interaction in problem-solving and in open-ended contexts. The findings suggest that typical users of a swarm robot system do not necessarily prefer more inherently legible mappings in open-ended contexts. Users prefer direct and intermediate modes of interaction in problem-solving scenarios, but favor intermediate modes of interaction in open-ended ones. The results from this study will be used in the design and development of a new swarm robotic system for music that can be used in both contexts.
  • Item
    N-gram modeling of tabla sequences using Variable-Length Hidden Markov Models for improvisation and composition
    (Georgia Institute of Technology, 2011-09-20) Sastry, Avinash
    This work presents a novel approach for the design of a predictive model of music that can be used to analyze and generate musical material that is highly context dependent. The system is based on an approach known as n-gram modeling, often used in language processing and speech recognition algorithms, implemented initially upon a framework of Variable-Length Markov Models (VLMMs) and then extended to Variable-Length Hidden Markov Models (VLHMMs). The system brings together principles such as escape probabilities and smoothing schemes, and uses multiple representations of the data stream to construct a multiple viewpoints system. This enables it to draw complex relationships between the different input n-grams and to use this information to provide a stronger prediction scheme. It is implemented as a MAX/MSP external in C++ and is intended to be a predictive framework that can be used to create generative music systems and educational and compositional tools for music. A formal quantitative evaluation scheme based on entropy of the predictions is used to evaluate the model in sequence prediction tasks on a database of tabla compositions. The results show good model performance for both the VLMM and the VLHMM while highlighting the high computational cost of higher-order VLHMMs.
  • Item
    A generative model of tonal tension and its application in dynamic realtime sonification
    (Georgia Institute of Technology, 2011-07-18) Nikolaidis, Ryan John
    This thesis presents the design and implementation of a generative model of tonal tension. It further describes the application of the generative model in realtime sonification. The thesis discusses related theoretical work in musical fields including generative system design, sonification, and perception and cognition. It reviews the related research, from historical to contemporary work, and shows how this work informs the design and application of the generative model of tonal tension. The thesis concludes by presenting a formal evaluation of the system. The evaluation consists of two independent subject-response studies assessing the effectiveness of the generative system in creating tonal tension and mapping it to visual parameters in sonification.