Organizational Unit:
School of Music


Publication Search Results

Now showing 1 - 10 of 105
  • Item
    Contours Unveiled: Music Composition for Generative Rhythm Application and Chamber Ensemble
    (Georgia Institute of Technology, 2024-07-27) McCall, Lauren Cagney
    Contours Unveiled is a dissertation project involving the development of a generative rhythm application and music composition for the Georgia Institute of Technology's Laptop Orchestra and Yarn/Wire. The generative application for Contours Unveiled uses the Euclidean rhythm algorithm. This project does not seek to emulate rhythms from around the world or provide an understanding of the cultural significance of historical rhythms but to use the Euclidean rhythm algorithm as a way to generate music. This dissertation covers the literature and previous music compositions and music technology projects created by the author that have prepared her for undertaking this project. Additionally, this dissertation shares the methodology she used to develop and deploy the project's application and the artistic design she used for the compositional aspect of this project. The author also conducted a user study during the development of the application, which helped inform the continual application and music composition design. Also, feedback from students who performed in the Georgia Institute of Technology's Laptop Orchestra provided further insight into editing the application and score. This dissertation concludes with the author's plans for future work on this project.
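The Euclidean rhythm algorithm mentioned in this abstract spreads a number of onsets as evenly as possible over a fixed number of steps. The abstract does not say which formulation the application uses, so the following is only a minimal sketch using a simple modular-arithmetic variant; its output may be a rotation of what Bjorklund's procedure produces. The function name `euclidean_rhythm` and its parameters are illustrative assumptions, not names from the dissertation.

```python
def euclidean_rhythm(onsets, steps):
    """Return a list of 0/1 values with `onsets` hits spread evenly over `steps`.

    Assumption: a Bresenham-style formulation, not necessarily the one
    used in the dissertation's application.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    if not 0 <= onsets <= steps:
        raise ValueError("onsets must be between 0 and steps")
    # Step i carries an onset whenever the running total i * onsets
    # wraps past a multiple of `steps`.
    return [1 if (i * onsets) % steps < onsets else 0 for i in range(steps)]


def as_text(pattern):
    """Render a pattern with 'x' for an onset and '.' for a rest."""
    return "".join("x" if hit else "." for hit in pattern)


if __name__ == "__main__":
    print(as_text(euclidean_rhythm(3, 8)))  # x..x..x.
```

Three onsets over eight steps gives the familiar 3 + 3 + 2 grouping, which is why patterns of this kind are a convenient seed for generative rhythm tools.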
  • Item
    Interpretable Quantitative Evaluation Metrics Of Generative Models In Symbolic Music
    (Georgia Institute of Technology, 2024-06-05) Sun, Qianyi
    Despite the emergence of innovative architectures claiming improved capabilities in modeling human-level creativity, state-of-the-art generative music systems still struggle with creating musical content that follows technical rules and expectations. The conventional subjective evaluation method for generative models can introduce bias and also lacks transparency, rigor, and reproducibility, emphasizing the need for more quantitative metrics. However, existing approaches to quantitative evaluation have either relied on overly broad criteria that capture neither higher-level music-theoretic properties nor perceptual properties, or been narrowly tailored to the design of a specific model, limiting their generalizability. To address this, this thesis proposes a reproducible and interpretable framework for evaluating the output of symbolic music generation models using musicologically and perceptually informed quantitative metrics. Specifically, I assessed the performance of two prominent models, FolkRNN and Jazz Transformer, by comparing the models’ training data against their generated results through systematic computational music analysis. Benchmark testing revealed this approach to surpass the discriminative capabilities of the widely cited, seminal quantitative metrics proposed by Yang and Lerch, offering a more ecologically valid assessment of model behavior and highlighting areas for targeted improvement. To further substantiate the perceptual validity of my metrics, I then reported results from a listening study employing a Turing Test disguised as a style-classification task. This experiment tested for the salient musical features that influence individuals’ decision-making in identifying stylistic provenance and provided insights into the perceptual dimensions of style imitation challenges in AI music generation.
Together, my findings hold the potential to advance the reliability and validity of AI-generated music by incorporating human perceptual attributes.
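To make the idea of comparing training data against generated output concrete, here is a small sketch of one very simple feature comparison: normalized pitch-class histograms and their overlap. This is not one of the thesis's metrics, which the abstract does not specify; the function names and the toy note lists are assumptions chosen purely for illustration.

```python
from collections import Counter


def pitch_class_histogram(midi_pitches):
    """Return a 12-bin histogram of pitch classes that sums to 1."""
    counts = Counter(pitch % 12 for pitch in midi_pitches)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one pitch")
    return [counts.get(pc, 0) / total for pc in range(12)]


def histogram_overlap(hist_a, hist_b):
    """Overlap of two normalized histograms: 1.0 if identical, 0.0 if disjoint."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


if __name__ == "__main__":
    # Hypothetical toy data: MIDI note numbers standing in for two corpora.
    training = [60, 62, 64, 65, 67, 69, 71, 72]   # C major scale
    generated = [60, 61, 64, 66, 67, 70, 71, 72]  # some out-of-scale notes
    overlap = histogram_overlap(
        pitch_class_histogram(training), pitch_class_histogram(generated)
    )
    print(f"pitch-class overlap: {overlap:.3f}")
```

A score near 1 only says the two sets use pitch classes in similar proportions; it says nothing about phrase structure or style, which is the kind of gap the abstract attributes to overly broad criteria.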
  • Item
    Rhythm Recreation Study To Inform Intelligent Pedagogy Systems
    (Georgia Institute of Technology, 2023-08-28) Alben, Noel
    Web-based intelligent pedagogy systems have great potential to provide interactive music lessons to those unable to access conventional, face-to-face music instruction from human experts. A key component of any effective pedagogy system is the expert domain knowledge used to generate, present, and evaluate the teachable content that makes up the "syllabus" of the system (Brusilovskiy, 1994). In this work, we investigate the application of computational musicology algorithms to devise the "syllabus" of intelligent rhythm pedagogy software. Many computational metrics that quantify and characterize rhythmic patterns have been proposed (Toussaint). We employ Cao et al.'s (2012) family theory of rhythms as a metric of rhythmic similarity and an entropy-based coded-element metric of rhythmic complexity (Thul, 2008). Both metrics have been shown to correlate with human judgments of rhythmic similarity and complexity. We hypothesize that a rhythmic syllabus that uses these metrics to determine the order in which rhythmic patterns are learned will be easier for musicians to progress through. We test this hypothesis in a rhythm reproduction study hosted on a custom-designed web-based experimental interface. Our experiment consists of six individual blocks. In each block, a participant listens to five unique rhythmic patterns, which they must then reproduce by clapping into their computer's microphone. Each rhythmic pattern is two measures long on an eighth-note grid, presented at 105 BPM, and looped four times. The order and content of rhythmic patterns within each block are determined using our chosen complexity and similarity metrics. A participant completes a block when they reproduce all the rhythmic patterns of the block within the performance constraints defined by automatic performance assessment built into the experimental interface.
Each of our six blocks represents key interactions: the order of the stimuli determined by our prescribed metrics, melodic information added to the rhythmic stimuli, and the presence of a visual representation of the rhythmic pattern. We also have control blocks where the patterns of each block are selected randomly without any theoretically informed metrics. The dependent variable used to measure the effectiveness of the syllabus is the number of trials taken to reproduce a given rhythmic stimulus accurately. Participant reproductions are stored to afford future analyses, and the designed interface helps efficiently automate the data collection, making it more accessible for future rhythm reproduction studies. We conducted the rhythm recreation study with 28 participants across the United States, who accessed the experiment through a web-based portal. The data gathered from our experiment implies that computational music theory algorithms can contribute to creating syllabi that align with human perception. However, these results deviate from our initial predictions. Furthermore, it appears that while incorporating visual stimuli aided in learning rhythmic patterns, the introduction of pitched onsets negatively affected reproduction performance.
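The stimulus format described above (two measures on an eighth-note grid at 105 BPM, looped four times) can be turned into onset times with a few lines of arithmetic. The sketch below assumes 4/4 time with the quarter note as the beat, which the abstract does not state, so each pattern has 16 grid slots; the function name `onset_times` and the example pattern are hypothetical.

```python
def onset_times(pattern, bpm=105.0, loops=4, slots_per_beat=2):
    """Return onset times in seconds for a looped binary grid pattern.

    Assumptions (not stated in the abstract): the beat is a quarter note,
    so an eighth-note grid has two slots per beat.
    """
    if bpm <= 0 or loops < 1 or slots_per_beat < 1:
        raise ValueError("bpm, loops and slots_per_beat must be positive")
    slot_seconds = 60.0 / bpm / slots_per_beat   # one eighth note
    loop_seconds = len(pattern) * slot_seconds   # one pass through the pattern
    times = []
    for loop in range(loops):
        for index, hit in enumerate(pattern):
            if hit:
                times.append(loop * loop_seconds + index * slot_seconds)
    return times


if __name__ == "__main__":
    # Hypothetical two-measure pattern: 16 eighth-note slots.
    pattern = [1, 0, 0, 1, 0, 0, 1, 0] * 2
    times = onset_times(pattern)
    total = len(pattern) * 4 * (60.0 / 105.0 / 2)
    print(f"{len(times)} onsets over {total:.2f} s")
```

Under these assumptions one eighth note lasts about 0.286 s, so a single stimulus presentation with four loops runs a little over 18 seconds.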
  • Item
    Mechatronics driven design approach for musical expressivity in robotic musicianship
    (Georgia Institute of Technology, 2023-01-18) Yang, Ning
    In this dissertation, we present mechatronics-driven design approaches that allow robotic musicians to perform with musical expressivity. We introduce three research projects in developing a robotic marimba player, a wearable drumming prosthesis, and a robotic guitarist. The first project, the development of a robotic marimba player, focused on the design of a novel striking system with brushless direct current motors. The new striking system allows the marimba robot to achieve a wider dynamic range, faster speed, and more complex marimba playing techniques. We objectively evaluated the dynamic range, speed, and four marimba techniques of the new striking system, showing that the new system has a wider dynamic range, a higher speed, and more capable techniques than the old striking system. We conducted a listening test to show that the robot was able to achieve human-level expressivity. We also conducted a Turing test to show that the subjects could not differentiate the robot from human players playing four marimba techniques. The second project shifted to a wearable drumming prosthesis. The purpose of this wearable robotic device is to help a drummer with an amputation regain his grip and control of drumsticks. We introduced a quasi-passive robotic drumming prosthesis that reads the drummer’s grip from forearm muscles through electromyography sensors and manipulates the drumsticks’ dynamic behavior with corresponding stiffness parameters. We demonstrated that the drummer was able to switch grips and change natural bouncing patterns among single, double, and triple rolls. We also validated that the drummer was able to utilize the robotic prosthesis to perform advanced drumming techniques, such as the paradiddle. The last research project focused on designing a new robotic guitarist.
We analyzed and modeled how humans played the guitar and created an expressive robotic guitarist with a right-arm module for strumming and picking and a left-hand apparatus for pressing the strings. We evaluated the dynamic range, speed, microtiming control, noise level, and guitar techniques on this new robotic platform. Through a subjective listening test, we also found that the robotic guitarist was able to perform at human-level expressivity. Through the development of these projects, we have demonstrated that, with human-modulating driven mechatronics design, we can significantly improve the physical capabilities of robotic musicians, supporting more musical expression through a wider dynamic range, more subtle microtiming control, and a more extensive range of music playing techniques.
  • Item
    Toward Natural Singing Via External Prosthesis
    (Georgia Institute of Technology, 2022-12-15) Irvin, Bryce
    The accessibility of expressive singing is limited by the physical mechanisms that produce speech and singing. For individuals without these physical mechanisms, singing is either difficult or impossible. Through this work, we propose the development of an external electronic prosthesis capable of inducing a natural singing voice in a performer without the need for traditional singing mechanisms. The novelty introduced by this prosthesis will serve as a new way for performers of any background and ability to express themselves and participate in social music activities. Specifically, we first aim to resolve issues with common prosthesis transducers. We then aim to discover methods for inducing the most natural singing voice in users, focusing on the nature of the excitation waveform used to drive the transducer of the prosthesis.
  • Item
    Using music to modulate emotional memory
    (Georgia Institute of Technology, 2021-12-14) Mehdizadeh, Sophia Kaltsouni
    Music is powerful in both affecting emotion and evoking memory. This thesis explores whether music might be able to modulate, or change, aspects of our emotional episodic memories. We present a behavioral, human-subjects experiment with a cognitive memory task targeting the reconsolidation mechanism. Memory reconsolidation allows for a previous experience to be relived and simultaneously reframed in memory. Moreover, reconsolidation of emotional, potentially maladaptive, autobiographical episodic memories has become a research focus in the development of new affective psychotherapy protocols. To this end, we propose that music may be a useful tool in driving and reshaping our memories and their associated emotions. This thesis additionally focuses on the roles that affect and preference may play in these memory processes. Through this research, we provide evidence supporting music’s ability to serve as a context for emotional autobiographical episodic memories. Overall, our results suggest that affective characteristics of the music and the emotions induced in the listener significantly influence memory creation and retrieval, and furthermore that the musical emotion may be as powerful as the musical structure in contextualizing and cueing memories. We also find support for individual differences and personal relevance of the musical context playing a determining role in these processes. This thesis establishes a foundation for subsequent neuroimaging work and future clinical research directions.
  • Item
    Machine Learning Driven Emotional Musical Prosody for Human-Robot Interaction
    (Georgia Institute of Technology, 2021-11-18) Savery, Richard
    This dissertation presents a method for non-anthropomorphic human-robot interaction using a newly developed concept entitled Emotional Musical Prosody (EMP). EMP consists of short expressive musical phrases capable of conveying emotions, which can be embedded in robots to accompany mechanical gestures. The main objective of EMP is to improve human engagement with, and trust in, robots while avoiding the uncanny valley. We contend that music - one of the most emotionally meaningful human experiences - can serve as an effective medium to support human-robot engagement and trust. EMP allows for the development of personable, emotion-driven agents, capable of giving subtle cues to collaborators while presenting a sense of autonomy. We present four research areas aimed at developing and understanding the potential role of EMP in human-robot interaction. The first research area focuses on collecting and labeling a new EMP dataset from vocalists, and using this dataset to generate prosodic emotional phrases through deep learning methods. Through extensive listening tests, the collected dataset and generated phrases were validated with a high level of accuracy by a large subject pool. The second research effort focuses on understanding the effect of EMP in human-robot interaction with industrial and humanoid robots. Here, significant results were found for improved trust, perceived intelligence, and likeability of EMP-enabled robotic arms, but not for humanoid robots. We also found significant results for improved trust in a social robot, as well as perceived intelligence, creativity and likeability in a robotic musician. The third and fourth research areas shift to broader use cases and potential methods to use EMP in HRI. The third research area explores the effect of robotic EMP on different personality types focusing on extraversion and neuroticism.
For robots, personality traits offer a unique way to implement custom responses, individualized to human collaborators. We discovered that humans prefer robots with emotional responses based on high extraversion and low neuroticism, with some correlation with the human collaborator’s own personality traits. The fourth and final research question focused on scaling up EMP to support interaction between groups of robots and humans. Here, we found that improvements in trust and likeability carried across from single robots to groups of industrial arms. Overall, the thesis suggests EMP is useful for improving trust and likeability for industrial robots, social robots, and robotic musicians, but not for humanoid robots. The thesis bears future implications for HRI designers, showing the extensive potential of careful audio design, and the wide range of outcomes audio can have on HRI.
  • Item
    Can You Hear My Heartbeat?: Hearing an Expressive Biosignal Elicits Empathy - Supplementary Data
    (Georgia Institute of Technology, 2021-05-07) Winters, R. Michael ; Leslie, Grace ; Walker, Bruce N.
    Interfaces designed to elicit empathy provide an opportunity for HCI with important pro-social outcomes. Recent research has demonstrated that perceiving expressive biosignals can facilitate emotional understanding and connection with others, but this work has been largely limited to visual approaches. We propose that hearing these signals will also elicit empathy, and test this hypothesis with sounding heartbeats. In a lab-based within-subjects study, participants (N = 27) completed an emotion recognition task in different heartbeat conditions. We found that hearing heartbeats changed participants’ emotional perspective and increased their reported ability to “feel what the other was feeling.” From these results, we argue that auditory heartbeats are well-suited as an empathic intervention, and might be particularly useful for certain groups and use-contexts because of their musical and non-visual nature. This work establishes a baseline for empathic auditory interfaces, and offers a method to evaluate the effects of future designs.
  • Item
    Composing and Decomposing Electroacoustic Sonifications: Towards a Functional-Aesthetic Sonification Design Framework
    (Georgia Institute of Technology, 2021-05-01) Tsuchiya, Takahiko
    The field of sonification invites musicians and scientists to create novel auditory interfaces. However, the opportunities for incorporating musical design ideas into general functional sonifications have been limited because of the transparency and communication issues with musical aesthetics. This research proposes a new design framework that facilitates the use of musical ideas as well as a transparent representation or conveyance of data, verified with two human-subjects tests. An online listening test analyzes the effect of the structural elements of sound, as well as of guided analytical listening, on the perceptibility of data. A design test examines the range of variety the framework affords and how the design process is affected by functional and aesthetic design goals. The results indicate that the framework elements, such as the synthetic models and mapping destinations, affect the perceptibility of data, with some contradictions between the designer's general strategies and the listener's responses. Neither the analytical listening nor the listener's musical background shows a strong statistical trend; instead, they imply complex relationships between types of interpretations and structural understanding. There are also several contrasting types in the design and listening processes which indicate different levels of structural transparency as well as the applicability of a wider variety of designs.
  • Item
    Learning to manipulate latent representations of deep generative models
    (Georgia Institute of Technology, 2021-01-14) Pati, Kumar Ashis
    Deep generative models have emerged as a tool of choice for the design of automatic music composition systems. While these models are capable of learning complex representations from data, a limitation of many of these models is that they allow little to no control over the generated music. Latent representation-based models, such as Variational Auto-Encoders, have the potential to alleviate this limitation as they are able to encode hidden attributes of the data in a low-dimensional latent space. However, the encoded attributes are often not interpretable and cannot be explicitly controlled. The work presented in this thesis seeks to address these challenges by learning to manipulate and design latent spaces in a way that allows control over musically meaningful attributes that are understandable by humans. This in turn can allow explicit control of such attributes during the generation process and help users realize their compositional goals. Specifically, three different approaches are proposed to investigate this problem. The first approach shows that we can learn to traverse latent spaces of generative models to perform complex interactive music composition tasks. The second approach uses a novel latent space regularization technique which can encode individual musical attributes along specific dimensions of the latent space. The third approach attempts to use attribute-informed non-linear transformations over an existing latent space such that the transformed latent space allows controllable generation of data. In addition, the problem of disentanglement learning in the context of symbolic music is investigated systematically by proposing a tailor-made dataset for the task and evaluating the performance of several different methods for unsupervised and supervised disentanglement learning. 
Together, the proposed methods will help address critical shortcomings of deep music generative models and pave the path towards intuitive interfaces which can be used by humans in real compositional settings.
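As an illustration of what regularizing a latent space so that one dimension tracks one musical attribute could look like, the sketch below computes a pairwise ordering loss over a batch: it is small when examples with a higher attribute value also have a higher value along the chosen latent dimension. This is an assumed formulation for illustration only; the abstract does not give the loss used in the thesis, and the function name, the `delta` scale, and the toy values are all hypothetical.

```python
import math


def attribute_regularization_loss(latent_values, attribute_values, delta=1.0):
    """Mean absolute error between tanh of latent differences and the sign
    of attribute differences, over all ordered pairs in a batch.

    Assumption: an illustrative loss, not confirmed as the thesis's method.
    """
    n = len(latent_values)
    if n != len(attribute_values):
        raise ValueError("inputs must have the same length")
    if n < 2:
        raise ValueError("need at least two examples to form pairs")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            latent_diff = latent_values[i] - latent_values[j]
            attr_diff = attribute_values[i] - attribute_values[j]
            attr_sign = (attr_diff > 0) - (attr_diff < 0)  # -1, 0 or 1
            total += abs(math.tanh(delta * latent_diff) - attr_sign)
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Hypothetical attribute, e.g. note density per measure.
    note_density = [1.0, 2.0, 3.0]
    aligned = attribute_regularization_loss([0.0, 1.0, 2.0], note_density)
    reversed_order = attribute_regularization_loss([2.0, 1.0, 0.0], note_density)
    print(f"aligned: {aligned:.3f}, reversed: {reversed_order:.3f}")
```

In a training setup, a term like this would be added to the usual auto-encoder objective for each attribute and its assigned dimension, so that moving along that dimension at generation time changes the attribute in a predictable direction.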