Organizational Unit:
School of Music


Publication Search Results

Now showing 1 - 10 of 10
  • Item
    Regressing dexterous finger flexions using machine learning and multi-channel single element ultrasound transducers
    (Georgia Institute of Technology, 2018-04-27) Hantrakul, Lamtharn
    Human Machine Interfaces, or "HMIs," come in many shapes and sizes. The mouse and keyboard is a typical and familiar HMI. In applications such as Virtual Reality or music performance, a precise HMI for tracking finger movement is often required. Ultrasound, a safe and non-invasive imaging technique, has shown great promise as an alternative HMI that addresses the shortcomings of vision-based and glove-based sensors. This thesis develops a first-in-class system enabling real-time regression of individual and simultaneous finger flexions using single element ultrasound transducers. A comprehensive dataset of ultrasound signals is collected from a study of 10 users. A series of machine learning experiments using this dataset demonstrate promising results supporting the use of single element transducers as an HMI device.
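As a rough sketch of the kind of regression described above (not the thesis's actual pipeline), the example below maps simulated per-channel energy features from four hypothetical single-element transducer channels to a continuous flexion angle using closed-form ridge regression; the channel count, features, and data are all invented for illustration.

```python
import numpy as np

# Hypothetical sketch: map windowed features from 4 single-element
# ultrasound channels to a continuous finger-flexion angle with
# ridge regression. The feature (per-channel RMS energy) and the
# synthetic data are assumptions, not the thesis's pipeline.
rng = np.random.default_rng(0)
n_frames, n_channels, n_samples = 200, 4, 64

# Simulated RF frames and a flexion angle correlated with echo energy.
frames = rng.normal(size=(n_frames, n_channels, n_samples))
X = np.sqrt((frames ** 2).mean(axis=2))            # per-channel RMS energy
w_true = np.array([0.8, -0.3, 0.5, 0.2])
y = X @ w_true + 0.01 * rng.normal(size=n_frames)  # flexion angle (a.u.)

# Ridge regression: w = (X^T X + lambda I)^-1 X^T y
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(n_channels), X.T @ y)
rmse = float(np.sqrt(((X @ w - y) ** 2).mean()))
```

The closed-form solve stands in for whatever learner the study actually used; the point is only the signal-features-to-angle regression structure.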
  • Item
    Enhancing stroke generation and expressivity in robotic drummers - A generative physics model approach
    (Georgia Institute of Technology, 2015-04-24) Edakkattil Gopinath, Deepak
    The goal of this master's thesis research is to enhance the stroke generation capabilities and musical expressivity of robotic drummers. The approach adopted is to understand the physics of the human finger-drumstick-drumhead interaction and replicate the same behavior in a robotic drumming system with the minimum number of degrees of freedom. The model developed is agnostic to the exact specifications of the robotic drummer that will attempt to emulate human-like drum strokes, and can therefore be used in any robotic drummer that uses actuators with complete control over the motor position angle. Initial approaches based on exploiting the instability of a PID control system to generate multiple bounces, and the limitations of this approach, are also discussed in depth. In order to assess the success of the model and its implementation in the robotic platform, a subjective evaluation was conducted. The evaluation results showed that the observed data was statistically equivalent to subjects resorting to a blind guess when distinguishing between a human playing a multiple-bounce stroke and a robot playing a similar kind of stroke.
  • Item
    Supervised feature learning via sparse coding for music information retrieval
    (Georgia Institute of Technology, 2015-04-24) O'Brien, Cian John
    This thesis explores the ideas of feature learning and sparse coding for Music Information Retrieval (MIR). Sparse coding is an algorithm which aims to learn new feature representations from data automatically. In contrast to previous work using sparse coding in an MIR context, the concept of supervised sparse coding is also investigated, which makes use of the ground-truth labels explicitly during the learning process. Here, sparse coding and supervised coding are applied to two MIR problems: classification of musical genre and recognition of the emotional content of music. A variation of Label Consistent K-SVD is used to add supervision during the dictionary learning process. In the case of Music Genre Recognition (MGR), an additional discriminative term is added to encourage tracks from the same genre to have similar sparse codes. For Music Emotion Recognition (MER), a linear regression term is added to learn an optimal classifier and dictionary pair. The results indicate that while sparse coding performs well for MGR, the additional supervision fails to improve the performance. In the case of MER, supervised coding significantly outperforms both standard sparse coding and commonly used designed features, namely MFCC and pitch chroma.
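To make the sparse coding idea concrete, here is a minimal illustrative sketch, using plain ISTA with a random dictionary rather than the thesis's Label Consistent K-SVD: a feature vector built from a few dictionary atoms is encoded as a sparse coefficient vector.

```python
import numpy as np

# Minimal sparse-coding sketch (standard ISTA, not Label Consistent
# K-SVD): recover a sparse code for one signal against a fixed
# random dictionary. Sizes and the signal are invented.
rng = np.random.default_rng(1)
n_features, n_atoms = 20, 50
D = rng.normal(size=(n_features, n_atoms))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms

# Signal built from 3 atoms, so a sparse code should fit it well.
true_code = np.zeros(n_atoms)
true_code[[3, 17, 40]] = [1.0, -0.5, 0.8]
x = D @ true_code

lam = 0.05                                     # sparsity weight
step = 1.0 / np.linalg.norm(D, 2) ** 2         # safe gradient step
z = np.zeros(n_atoms)
for _ in range(200):
    grad = D.T @ (D @ z - x)                   # least-squares gradient
    z = z - step * grad
    z = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold

reconstruction_error = float(np.linalg.norm(D @ z - x))
sparsity = int((np.abs(z) > 1e-3).sum())       # active atoms in the code
```

Supervised variants such as the one the abstract describes add label-dependent terms to the objective; this sketch shows only the unsupervised core.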
  • Item
    Analog synthesizers in the classroom: How creative play, musical composition, and project-based learning can enhance STEM standard literacy and self-efficacy
    (Georgia Institute of Technology, 2015-04-24) Howe, Christopher David
    The state of STEM education in America's high schools is currently in flux, with billions poured annually into the NSF to increase national STEM literacy. Hands-on, project-based learning interventions in the STEM classroom are ubiquitous but tend to focus on robotics or competition-based curricula. These curricula do not address musical creativity or cultural relevancy to reach under-represented or disinterested groups. By utilizing an analog synthesizer to teach STEM learning standards, this research aims to engage students who may otherwise lack confidence in the field. By incorporating the Maker Movement, a STEAM architecture, and culturally relevant musical examples, this study's goal is to build both self-efficacy and literacy in STEM within under-represented groups through hands-on exercises with a Moog analog synthesizer, specifically the Moog Werkstatt. A quasi-experimental one-group pre-test/post-test design was crafted to determine study validity, and was implemented in three separate studies. Several age demographics were selected across a variety of classroom models and teaching styles; the purpose of this wide net was to explore where a tool like the Werkstatt and its accompanying curriculum would have the biggest impact. Results show that the curriculum and technique are largely ineffective in an inverted music elective classroom. However, in the STEM classroom, literacy and confidence were built across genders, with females showing greater increases in engineering confidence and music technology interest than their male counterparts.
  • Item
    Audience participation using mobile phones as musical instruments
    (Georgia Institute of Technology, 2012-05-21) Lee, Sang Won
    This research develops a music piece for audience participation using mobile phones as musical instruments in a concert setting. Inspired by the ubiquity of smart phones, I attempted to accomplish audience engagement in a music performance by crafting an accessible musical instrument with which the audience can be part of the performance. The research begins by reviewing related work in two areas, mobile music and audience participation in music performances; builds a charted map of the two areas and their intersection to seek an innovation; and defines requisites for successful audience participation, in which audience members can take part in music making as musicians with their mobile phones. To make audience participation accessible, the concept of a networked multi-user instrument is applied to the system. With the lessons learnt, I developed echobo, a mobile musical instrument application for iOS devices (iPhone, iPad and iPod Touch). With this system, audience members can download the app at the concert, play the instrument instantly, interact with other audience members, and contribute to the music through the sound generated by their mobile phones. A music piece for echobo and clarinet was presented in a series of performances, and the application was found to work reliably and to accomplish audience engagement. The post-survey results indicate that the system was accessible, and helped the audience connect to the music and the other musicians.
  • Item
    Musical swarm robot simulation strategies
    (Georgia Institute of Technology, 2011-11-16) Albin, Aaron Thomas
    Swarm robotics for music is a relatively new way to explore algorithmic composition as well as new modes of human-robot interaction. This work outlines a strategy for making music with a robotic swarm under four constraints: acoustic sound, rhythmic music using sequencers, motion causing changes in the music, and human-swarm interaction. Two novel simulation programs are created in this thesis: the first is a multi-agent simulation designed to explore suitable parameters for motion-to-music mappings as well as parameters for real-time interaction. The second is a boid-based robotic swarm simulation that adheres to the constraints established, using parameters derived from the multi-agent simulation: orientation, number of neighbors, and speed. In addition, five interaction modes are created that vary along an axis from direct to indirect forms of human control over the swarm motion. The mappings and interaction modes of the swarm robot simulation are evaluated in a user study involving music technology students. The purpose of the study is to determine the legibility of the motion-to-music mappings and evaluate user preferences for the mappings and modes of interaction in problem-solving and in open-ended contexts. The findings suggest that typical users of a swarm robot system do not necessarily prefer more inherently legible mappings in open-ended contexts. Users prefer direct and intermediate modes of interaction in problem-solving scenarios, but favor intermediate modes of interaction in open-ended ones. The results from this study will be used in the design and development of a new swarm robotic system for music that can be used in both contexts.
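A toy version of the motion-to-music mapping idea, using two of the parameters the abstract names (number of neighbors and speed), might look like the following; the agents, radius, and mapping constants are invented for illustration, not taken from the thesis.

```python
import math
import random

# Toy swarm: random 2-D agents whose motion features drive musical
# parameters. The 0.3 neighbor radius and the speed-to-loudness
# scaling are illustrative assumptions.
random.seed(0)
agents = [{"x": random.random(), "y": random.random(),
           "vx": random.uniform(-1, 1), "vy": random.uniform(-1, 1)}
          for _ in range(10)]

def neighbors(a, radius=0.3):
    # Count other agents within `radius` of agent `a`.
    return sum(1 for b in agents if b is not a
               and math.hypot(a["x"] - b["x"], a["y"] - b["y"]) < radius)

def to_music(a):
    speed = math.hypot(a["vx"], a["vy"])
    velocity = min(127, int(speed * 90))  # MIDI-style loudness from speed
    density = 1 + neighbors(a)            # notes per beat from crowding
    return velocity, density

events = [to_music(a) for a in agents]    # one musical event per agent
```

Legibility of such a mapping, in the sense the study evaluates, is a property of how clearly listeners can hear which motion feature drives which musical parameter.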
  • Item
    N-gram modeling of tabla sequences using Variable-Length Hidden Markov Models for improvisation and composition
    (Georgia Institute of Technology, 2011-09-20) Sastry, Avinash
    This work presents a novel approach to the design of a predictive model of music that can be used to analyze and generate musical material that is highly context dependent. The system is based on an approach known as n-gram modeling, often used in language processing and speech recognition algorithms, implemented initially upon a framework of Variable-Length Markov Models (VLMMs) and then extended to Variable-Length Hidden Markov Models (VLHMMs). The system brings together principles such as escape probabilities and smoothing schemes, and uses multiple representations of the data stream to construct a multiple-viewpoints system that enables it to draw complex relationships between the different input n-grams and use this information to provide a stronger prediction scheme. It is implemented as a Max/MSP external in C++ and is intended to be a predictive framework that can be used to create generative music systems and educational and compositional tools for music. A formal quantitative evaluation scheme based on the entropy of the predictions is used to evaluate the model in sequence prediction tasks on a database of tabla compositions. The results show good model performance for both the VLMM and the VLHMM, while highlighting the expensive computational cost of higher-order VLHMMs.
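The backoff behavior at the heart of a variable-length n-gram model can be sketched as follows; this is a simplified illustration with plain maximum-likelihood counts, without the escape probabilities and smoothing schemes the thesis combines, and the tabla syllables are invented.

```python
from collections import defaultdict

# Sketch of variable-length n-gram prediction: train counts for all
# context lengths up to max_order, then predict by backing off from
# the longest matching context to shorter ones.
class VLMM:
    def __init__(self, max_order=3):
        self.max_order = max_order
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, seq):
        for order in range(self.max_order + 1):
            for i in range(order, len(seq)):
                ctx = tuple(seq[i - order:i])
                self.counts[ctx][seq[i]] += 1

    def predict(self, context):
        # Longest-context-first backoff; () always matches after training.
        for order in range(min(self.max_order, len(context)), -1, -1):
            ctx = tuple(context[len(context) - order:])
            if ctx in self.counts:
                dist = self.counts[ctx]
                total = sum(dist.values())
                return {sym: c / total for sym, c in dist.items()}
        return {}

m = VLMM(max_order=2)
m.train(["na", "dhin", "dhin", "na", "dhin", "dhin", "na", "tin"])
probs = m.predict(["dhin", "dhin"])  # longest context seen twice, both -> "na"
```

A hidden-Markov extension of this idea, as in the VLHMMs the abstract describes, would predict over hidden states rather than raw symbols.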
  • Item
    Computational modeling of improvisation in Turkish folk music using Variable-Length Markov Models
    (Georgia Institute of Technology, 2011-08-31) Senturk, Sertan
    The thesis describes a new database of uzun havas, a non-metered structured improvisation form in Turkish folk music, and a system that uses Variable-Length Markov Models (VLMMs) to predict the melody in the uzun hava form. The database consists of 77 songs encompassing 10849 notes, and it is used to train multiple viewpoints, where each event in a musical sequence is represented by parallel descriptors such as Durations and Notes. The thesis also introduces pitch-related viewpoints that are specifically aimed at modeling the unique melodic properties of makam music. The predictability of the system is quantitatively evaluated by an entropy-based scheme. In the experiments, results from pitch-related viewpoints mapping the 12-tone scale of Western classical theory and the 17-tone scale of Turkish folk music are compared. It is shown that VLMMs are highly predictive of the note progressions in the transcriptions of uzun havas. This suggests that VLMMs may be applied to makam-based and non-metered musical forms, in addition to Western musical styles. To the best of the author's knowledge, the work presents the first symbolic, machine-readable database and the first application of computational modeling in Turkish folk music.
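The entropy-based evaluation idea used here (and in the tabla work above) amounts to scoring a model by the average negative log-probability it assigns to the notes that actually occur; a minimal illustration, with invented note symbols and predicted distributions:

```python
import math

# Each pair is (model's predicted distribution over next notes,
# note actually observed). The symbols and probabilities are invented.
predictions = [
    ({"A": 0.5, "B": 0.25, "C": 0.25}, "A"),
    ({"A": 0.25, "B": 0.5, "C": 0.25}, "B"),
    ({"A": 0.125, "B": 0.125, "C": 0.75}, "C"),
]

# Cross-entropy in bits: mean of -log2 P(observed note | context).
# Lower means the model found the sequence more predictable.
cross_entropy = -sum(math.log2(dist[obs])
                     for dist, obs in predictions) / len(predictions)
```

Comparing this score across viewpoint configurations (e.g. 12-tone versus 17-tone pitch mappings) is what such an evaluation scheme enables.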
  • Item
    A generative model of tonal tension and its application in dynamic realtime sonification
    (Georgia Institute of Technology, 2011-07-18) Nikolaidis, Ryan John
    This thesis presents the design and implementation of a generative model of tonal tension, and further describes the application of the generative model in realtime sonification. The thesis discusses related theoretical work in musical fields including generative system design, sonification, and perception and cognition, reviewing related research from historical to contemporary work and contextualizing it to inform the design and application of the generative model of tonal tension. The thesis concludes by presenting a formal evaluation of the system, consisting of two independent subject-response studies assessing the effectiveness of the generative system at creating tonal tension and mapping it to visual parameters in sonification.
  • Item
    Towards expressive melodic accompaniment using parametric modeling of continuous musical elements in a multi-attribute prediction suffix trie framework
    (Georgia Institute of Technology, 2010-11-22) Mallikarjuna, Trishul
    Elements of continuous variation such as tremolo, vibrato, and portamento add expressive dimensions of their own to melodic music in styles such as Indian Classical Music. There is published work on parametrically modeling some of these elements individually and on applying the modeled parameters to automatically generated musical notes in the context of machine musicianship, using simple rule-based mappings. There have also been many systems developed for generative musical accompaniment using probabilistic models of discrete musical elements such as MIDI notes and durations, many of them inspired by computational research in linguistics. However, there does not appear to have been a combined approach that parametrically models expressive elements within a probabilistic framework. This document presents a real-time computational framework that uses a multi-attribute trie / n-gram structure to model parameters such as the frequency, depth, and/or lag of expressive variations such as vibrato and portamento, along with conventionally modeled elements such as musical notes, their durations, and metric positions in melodic audio input. This work proposes storing the parameters of expressive elements as metadata in the individual nodes of the traditional trie structure, along with the distribution of their probabilities of occurrence. During automatic generation of music, the expressive parameters learned in the training phase are applied to the associated re-synthesized musical notes. The model is aimed at providing automatic melodic accompaniment in a performance scenario.
The parametric modeling of continuous expressive elements in this form is hypothesized to capture deeper temporal relationships among musical elements, and is thereby expected to bring about a more expressive and more musical outcome in such a performance than has been possible with other works of machine musicianship that use only static mappings or randomized choice. A system was developed on the Max/MSP platform with this framework, which takes a pitched audio input such as the human singing voice and produces a pitch track that may be applied to a synthesized sound of continuous timbre. The system was trained and tested with several vocal recordings of North Indian Classical Music, and a subjective evaluation of the resulting audio was made using an anonymous online survey. The results of the survey show the output tracks generated by the system to be as musical and expressive, if not more so, than the case where the pitch track extracted from the original audio was directly rendered as output, and also show the output with expressive elements to be perceptibly more expressive than the version without expressive parameters. The results further suggest that more experimentation may be required to establish the efficacy of the framework relative to using randomly selected parameter values for the expressive elements. This thesis presents the scope, context, implementation details, and results of the work, suggesting future improvements.
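The core data-structure idea, storing expressive parameters as metadata in the nodes of a prediction trie alongside continuation counts, can be sketched as follows; the class names, the vibrato-rate field, and the suffix-insertion scheme are illustrative assumptions, not the thesis's implementation.

```python
# Sketch of a prediction trie whose nodes carry expressive metadata
# in addition to occurrence counts. All names and values are invented.
class TrieNode:
    def __init__(self):
        self.children = {}       # next note symbol -> TrieNode
        self.count = 0           # how often this context occurred
        self.vibrato_rates = []  # expressive metadata seen at this context

class ExpressiveTrie:
    def __init__(self, depth=3):
        self.root = TrieNode()
        self.depth = depth

    def add(self, notes, vibrato_rate):
        # Insert every suffix of the context up to `depth`, recording
        # the expressive parameter at each terminal node.
        for start in range(max(0, len(notes) - self.depth), len(notes)):
            node = self.root
            for note in notes[start:]:
                node = node.children.setdefault(note, TrieNode())
            node.count += 1
            node.vibrato_rates.append(vibrato_rate)

t = ExpressiveTrie()
t.add(["C4", "D4", "E4"], vibrato_rate=5.5)
t.add(["D4", "E4"], vibrato_rate=6.5)

# The D4 -> E4 context was seen in both phrases; at generation time
# its stored rates could parameterize the vibrato of the next note.
node = t.root.children["D4"].children["E4"]
mean_vibrato = sum(node.vibrato_rates) / len(node.vibrato_rates)
```

At generation time, a system like the one described would sample or average the stored parameters for the matched context and apply them to the re-synthesized note.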