Series: Web Audio Conference
Series Type: Event Series

Publication Search Results

Now showing 1 - 10 of 30
  • Item
    reNotate: The Crowdsourcing and Gamification of Symbolic Music Encoding
    (Georgia Institute of Technology, 2016-04) Taylor, Benjamin ; Shanahan, Daniel ; Wolf, Matthew ; Allison, Jesse ; Baker, David John
    Musicologists and music theorists have, for quite some time, hoped to be able to use computational methods to examine large corpora of music. As far back as the 1940s, an IBM card-sorter was used to implement pattern-finding in traditional British folk songs (Bronson 1949, 1959). Alan Lomax famously implemented statistical methods in his Cantometrics project (Lomax, 1968), which sought to collate a large corpus of folk music from across many cultures. In the 1980s and 90s, a number of encoding projects were instituted in an attempt to make music notation searchable on a large scale. The Essen Folksong Collection (Schaffrath, 1995) collected ethnographic transcriptions, whereas projects at the Center for Computer Assisted Research in the Humanities (CCARH) focused on scores in the Western Art Music tradition (Bach chorales, Mozart sonatas, instrumental themes, etc.). Recently, scholars have focused on improving Optical Music Recognition in the hope of facilitating the acquisition of large numbers of musical scores (Fujinaga et al., 2014), but non-notated music, such as improvisational jazz, is often overlooked. While there have been many advances in music information retrieval in recent years, parameters that would facilitate in-depth musicological analysis are still out of reach (for example, stream segregation to examine specific melodic lines, or the analysis of harmony at a resolution that would allow for an analysis of specific chord voicings). Our project seeks to implement methods similar to those used in CAPTCHA and reCAPTCHA technology to crowdsource the symbolic encoding of musical information through a web-based gaming interface. The introductory level asks participants to tap along with an audio recording's tempo, giving us an approximate BPM, while the second level asks participants to tap along with note onsets. The third level asks them to match the contour of a three-note segment, and the final stage asks for specific note matching within that contour. A social-gaming interface allows users to compete against one another. It is our hope that this work can be generalized to many types of musical genres, and that a web-based framework might facilitate the encoding of musicological and music-theoretic datasets that might be underrepresented by current MIR work.
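    The tap-tempo stage lends itself to a small illustration. The sketch below (an assumption for illustration, not the authors' code) collects tap timestamps in the browser and averages the intervals between them to estimate BPM, as the introductory level does.

      // Illustrative sketch (not the reNotate code): estimate an approximate BPM
      // from tap timestamps, as in the introductory tap-along level.
      const tapTimes = [];

      function registerTap() {
        tapTimes.push(performance.now());
        if (tapTimes.length < 2) return null;
        let total = 0;
        for (let i = 1; i < tapTimes.length; i++) {
          total += tapTimes[i] - tapTimes[i - 1]; // interval between consecutive taps (ms)
        }
        const avgIntervalMs = total / (tapTimes.length - 1);
        return 60000 / avgIntervalMs; // beats per minute
      }

      document.addEventListener('keydown', () => {
        const bpm = registerTap();
        if (bpm) console.log(`Approximate tempo: ${bpm.toFixed(1)} BPM`);
      });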
  • Item
    Live Looping Electronic Music Performance with MIDI Hardware
    (Georgia Institute of Technology, 2016-04) McKegg, Matt
    I have spent much of the last three years building a Web Audio-based desktop application for live electronic music performance called Loop Drop. It was inspired by my own frustration with existing tools for live performance. I wanted a tool that would give me, as a performer, the level of control and expression I desired but still feel like playing a musical instrument rather than programming a computer. My application was built using web technologies such as JavaScript and HTML5, leveraging my existing experience as a web developer and providing an excellent workflow for quick prototyping and user interface design. In combination with Electron, a single developer can build a desktop audio application very efficiently. Loop Drop uses Web MIDI to interface with hardware such as the Novation Launchpad. The software allows the creation of sounds using synthesis and sampling, and arranges these into “chunks” which may be placed in any configuration across the MIDI controller’s button grid. These sounds may be triggered directly, or played quantised to the current tempo at a given rate using “beat repeat”. Everything the performer plays is collected in a buffer that may at any time be turned into a loop. This allows the performer to avoid recording anxiety, a common problem with most live looping systems: they can jam out ideas, then, once happy with the sequence, press the loop button to lock it in. In my performance, I will use Loop Drop in conjunction with multiple Novation Launchpad MIDI controllers to improvise 15 minutes of electronic music using sounds that I have organised ahead of time. The user interface will be visible to the audience as a projection. I will also demonstrate the power of hosting Web Audio in Electron by interfacing with an external LED array connected over Serial Peripheral Interface Bus (SPI) to be used as an audio visualiser light show.
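    As a minimal illustration of the Web MIDI wiring described above (a sketch, not Loop Drop's actual code), the following listens for note-on messages from a connected controller such as a Launchpad and triggers a short Web Audio sound for each pad press.

      // Minimal Web MIDI sketch (not Loop Drop's actual code): play a short
      // synthesized blip whenever a pad on a connected controller is pressed.
      const audioCtx = new AudioContext();

      function playBlip(note) {
        const osc = audioCtx.createOscillator();
        const gain = audioCtx.createGain();
        osc.frequency.value = 440 * Math.pow(2, (note - 69) / 12); // MIDI note number to Hz
        gain.gain.setValueAtTime(0.5, audioCtx.currentTime);
        gain.gain.exponentialRampToValueAtTime(0.001, audioCtx.currentTime + 0.3);
        osc.connect(gain).connect(audioCtx.destination);
        osc.start();
        osc.stop(audioCtx.currentTime + 0.3);
      }

      navigator.requestMIDIAccess().then((midi) => {
        for (const input of midi.inputs.values()) {
          input.onmidimessage = (msg) => {
            const [status, note, velocity] = msg.data;
            const isNoteOn = (status & 0xf0) === 0x90 && velocity > 0;
            if (isNoteOn) playBlip(note);
          };
        }
      });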
  • Item
    rMIXr: how we learned to stop worrying and love the graph
    (Georgia Institute of Technology, 2016-04) Fields, Ben ; Phippen, Sam
    In this talk we present a case study in the use of the Web Audio API; specifically, our use of it to create a rapidly developed prototype application. The app, called rMIXr (http://rmixr.com), is a simple digital audio workstation (DAW) for fan remix contests. We created rMIXr in 48 hours at the Midem Hack Day in June 2015. We will give a brief demo of the app and show multi-channel sync, as well as various effects and cutting/time-slicing.
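    One common way to achieve the kind of multi-channel sync demonstrated (an illustrative sketch, not rMIXr's source) is to decode each stem into an AudioBuffer and start every AudioBufferSourceNode at the same AudioContext time.

      // Illustrative sketch (not rMIXr's source): start several decoded stems at
      // the same AudioContext time so they remain sample-accurately in sync.
      async function playStemsInSync(audioCtx, urls) {
        const buffers = await Promise.all(
          urls.map(async (url) => {
            const data = await (await fetch(url)).arrayBuffer();
            return audioCtx.decodeAudioData(data);
          })
        );
        const startAt = audioCtx.currentTime + 0.1; // small safety margin for scheduling
        for (const buffer of buffers) {
          const source = audioCtx.createBufferSource();
          source.buffer = buffer;
          source.connect(audioCtx.destination);
          source.start(startAt);
        }
      }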
  • Item
    A Novel Approach to Streaming and Client Side Rendering of Multichannel Audio with Synchronised Metadata
    (Georgia Institute of Technology, 2016-04) Paradis, Matthew ; Pike, Chris ; Day, Richard ; Melchior, Frank
    Object-based audio broadcasting is an approach that combines audio with metadata describing how the audio should be rendered. This metadata can include spatial positioning, mixing parameters, and descriptors that define the type of audio represented by each object. In this talk we show an approach to streaming multichannel audio and synchronised metadata to the browser. Audio is rendered in the browser to multiple formats based on the information contained in the synchronised metadata channel, which allows adaptive mixing and rendering of content as well as user interaction. Based on the MPEG-DASH standard, this approach allows an arbitrary number of audio channels to be presented as discrete inputs to the Web Audio API (subject to any channel limit imposed by the browser). Binaural, 5.1, and stereo renders can be generated and selected for output by the user in real time without any change to the source media stream. Channels marked as interactive can have their properties exposed for the user to adjust according to their preferences. The audio and metadata originate from a single BWF file compliant with ITU-R BS.2076 (Audio Definition Model), with the audio encoded using AAC (as per the MPEG-DASH standard) and the metadata delivered to the browser in JSON format. This approach provides a flexible framework for prototyping and presenting new audio experiences to online audiences, and a platform for delivering object-based audio to online users.
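    The client-side rendering step might be sketched as follows (an assumed illustration, not the authors' implementation): split the decoded multichannel stream into per-object channels and apply gain and position taken from the synchronised JSON metadata.

      // Assumed sketch (not the authors' implementation): route each channel of a
      // multichannel stream through gain and pan nodes driven by JSON metadata.
      function renderObjects(audioCtx, mediaElement, objectMetadata) {
        // objectMetadata is a hypothetical shape, e.g.
        // [{ channel: 0, gain: 0.8, pan: -0.5 }, { channel: 1, gain: 1.0, pan: 0.5 }]
        const source = audioCtx.createMediaElementSource(mediaElement);
        const splitter = audioCtx.createChannelSplitter(objectMetadata.length);
        source.connect(splitter);

        return objectMetadata.map((obj) => {
          const gain = audioCtx.createGain();
          const panner = audioCtx.createStereoPanner();
          gain.gain.value = obj.gain;
          panner.pan.value = obj.pan;
          splitter.connect(gain, obj.channel); // take this object's channel only
          gain.connect(panner).connect(audioCtx.destination);
          return { gain, panner }; // kept around so interactive objects can be adjusted
        });
      }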
  • Item
    Constellation: A Musical Exploration of Phone-Based Audience Interaction Roles
    (Georgia Institute of Technology, 2016-04) Madhavan, Nihar ; Snyder, Jeff
    Constellation designs various relationships between audience and performers by using mobile devices to empower communication during a performance. We direct audience members to a website, which changes throughout the piece and controls the interaction among the performers and audience. We explore several paradigms of interaction, using them as movements of a larger piece. (1) We first explore audience members producing a soundscape through their own actions, using a mobile visual interface that encourages motion and produces sounds such as bells and controlled noise. (2) We then allow the audience to control onstage performers, through an interface by which the audience can vote on projected notes that performers attempt to follow; this can easily be adjusted to use synthesized sounds without performers. (3) A final paradigm of interaction is performers directly controlling the audience, using instrumentation that echoes out through the phones of the audience. We move through three different sections of the piece, exploring these different interactions and blending them together musically. Constellation was designed and built for performance at an April 2015 concert with the Princeton Laptop Orchestra (PLOrk), with source material from the medieval piece “Stella Splendens”. Constellation primarily uses socket.io, node.js, Web Audio, and Full Tilt.
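    A tiny sketch of the kind of client wiring this implies (an assumption, not the piece's actual code): an audience member's phone receives a cue over socket.io and responds with a bell-like tone via Web Audio.

      // Assumed sketch (not the piece's actual code): a phone client receives a
      // cue from the server over socket.io and plays a bell-like tone.
      const socket = io(); // connects to the performance's node.js server
      const audioCtx = new AudioContext();

      socket.on('playBell', ({ frequency }) => { // 'playBell' is a hypothetical event name
        const osc = audioCtx.createOscillator();
        const gain = audioCtx.createGain();
        osc.frequency.value = frequency;
        gain.gain.setValueAtTime(0.4, audioCtx.currentTime);
        gain.gain.exponentialRampToValueAtTime(0.001, audioCtx.currentTime + 1.5);
        osc.connect(gain).connect(audioCtx.destination);
        osc.start();
        osc.stop(audioCtx.currentTime + 1.5);
      });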
  • Item
    Cross-Town Traffic 2.0
    (Georgia Institute of Technology, 2016-04) Walker, William ; Belet, Brian
    Cross-Town Traffic 2.0 is an ensemble music performance environment for any number of audience performers, a principal performer, and a conductor. The performers use their own mobile devices running a performance interface based on the Web Audio API. The conductor leads the performers through a fully composed musical structure. Sixteen previously recorded audio files (eight Hammond B3 samples, performed and recorded by Walker; and eight viola samples, performed and recorded by Belet) are arranged into four groups, with the audience performers arranged into four corresponding performance sections. As the performers follow cues from the conductor, the ensuing performance immerses the humans in the room in the midst of cellphone speakers and the flow of the musical structure. Individual performers can shape their own audio contribution within the confines of the larger composed structure, providing an element of playful participation. The resulting distributed cellphone audio challenges the performance roles of the humans in the room, as opposed to the number and quality of loudspeakers in the space. Using mobile web audio offers very low barriers to audience participation, in contrast to logging into an app store, searching for the appropriate native app, and installing and launching it, all before the performance begins.
  • Item
    Live Coding With EarSketch
    (Georgia Institute of Technology, 2016-04) Freeman, Jason
    EarSketch combines a Python/JavaScript API, a digital audio workstation (DAW) visualization, an audio loop library, and an educational curriculum into a web-based music programming environment. While it was originally designed as a classroom educational tool for music technology and computer science, it has recently been expanded to support live coding in concert performance. This live coding performance explores the artistic potential of algorithmic manipulation of audio loops in a multi-track DAW paradigm, and the potential of DAW-driven visualizations to demystify live coding and algorithms for a concert audience.
  • Item
    meSing.js: A JavaScript Singing Synthesis Library
    (Georgia Institute of Technology, 2016-04) Su, David
    meSing.js is a JavaScript singing synthesis library that uses the Web Audio API's DSP capabilities in conjunction with the meSpeak.js speech synthesis library to provide a vocal synthesizer for the web. First, the lyrics and their corresponding MIDI notes are parsed and fed to meSpeak.js; the resulting text-to-speech output is then converted into a series of AudioBufferSourceNodes, which are subsequently processed and adjusted for pitch, rhythm, and expression. Pitch-shifting techniques currently implemented include: feeding the synthesized audio through a multiband vocoder (based on Chris Wilson's 2012 demo), directly adjusting the audio playback rate, and manipulating the "pitch" parameter of the meSpeak.js synthesizer. Rhythmic adjustments occur directly at the PCM level, by slicing and concatenating the Float32Arrays containing audio channel data, and by using the Web Audio API's clock to schedule vocoder events. The demo showcases an example use of meSing.js: a songwriting tool that provides both lyrical and melodic suggestions, using the singing synthesis to rapidly prototype the vocal line. The step-sequencer-like input grid layout is derived from musicologist Kyle Adams' approach to interpreting and analyzing hip hop, and is particularly well suited to lyric and rhyme analysis. Multiple approaches were explored while developing meSing.js, each with its own performance and usability benefits and drawbacks; a discussion of these approaches can provide insight into creating libraries atop the Web Audio API.
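    The playback-rate technique mentioned above can be sketched briefly (an illustration, not meSing.js itself): playing a synthesized speech buffer at a rate of 2^(n/12) transposes it by n semitones, and each note can be scheduled on the Web Audio clock.

      // Illustrative sketch (not meSing.js itself): pitch-shift a synthesized
      // speech buffer via playbackRate and schedule it on the Web Audio clock.
      function scheduleNote(audioCtx, speechBuffer, semitoneShift, startTime) {
        const source = audioCtx.createBufferSource();
        source.buffer = speechBuffer;
        // A rate of 2^(n/12) transposes by n semitones (it also shortens or
        // lengthens the syllable, which is why other techniques are needed too).
        source.playbackRate.value = Math.pow(2, semitoneShift / 12);
        source.connect(audioCtx.destination);
        source.start(startTime);
        return source;
      }

      // Example: three syllables outlining a major triad, one every half second.
      // [0, 4, 7].forEach((shift, i) =>
      //   scheduleNote(audioCtx, syllableBuffer, shift, audioCtx.currentTime + i * 0.5));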
  • Item
    Hooking up Web Audio to WebGL Typography
    (Georgia Institute of Technology, 2016-04) Lee, Sang Won ; Essl, Georg
    This demo introduces programmable text rendering that enables temporal typography in web browsers. Textual interaction is seen as not only a dynamic but also an interactive process, facilitating both scripted and live musical expression in contexts such as audio-visual performance with keyboards and live coding visualization. We transform plain text into a highly audiovisual medium and a visually expressive musical interface by modulating textual properties with real-time Web Audio signals. The technical realization of the concept uses the Web Audio API, WebGL, and GLSL shaders. We show a number of examples that illustrate the concept in scenarios ranging from simple textual visualization to live coding environments and an interactive writing platform.
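    One way the audio-to-visual link could work (a hedged sketch under assumed names, not the authors' implementation) is to read a loudness value from an AnalyserNode every animation frame and pass it to the text-rendering shader as a uniform.

      // Hedged sketch (not the authors' implementation): feed an RMS loudness
      // value from an AnalyserNode into a WebGL shader uniform every frame.
      // audioCtx, sourceNode, gl, and program are assumed to be set up elsewhere;
      // 'u_amplitude' is a hypothetical uniform in the text-rendering shader.
      function linkAudioToShader(audioCtx, sourceNode, gl, program) {
        const analyser = audioCtx.createAnalyser();
        sourceNode.connect(analyser);
        const samples = new Float32Array(analyser.fftSize);
        const amplitudeLoc = gl.getUniformLocation(program, 'u_amplitude');

        function draw() {
          analyser.getFloatTimeDomainData(samples);
          let sum = 0;
          for (const s of samples) sum += s * s;
          gl.useProgram(program);
          gl.uniform1f(amplitudeLoc, Math.sqrt(sum / samples.length)); // RMS loudness
          // ...issue the glyph draw calls here...
          requestAnimationFrame(draw);
        }
        requestAnimationFrame(draw);
      }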
  • Item
    Alternatives to Lookahead Audio Scheduling
    (Georgia Institute of Technology, 2016-04) Sullivan, Joe
    The scheduling of Web Audio events occurs in the UI thread, which is optimized to respond to user input and to provide visual feedback. The setTimeout and setInterval interfaces provide only an imprecise method of scheduling, and in background tabs the UI thread virtually ceases to run. Lookahead scheduling (à la “A Tale of Two Clocks”) is an established audio scheduling strategy, but it relies on the UI thread running continually. This talk surveys alternative scheduling strategies, including all-at-once scheduling and the pre-rendering of audio using the OfflineAudioContext (as described in “A Tale of No Clocks”), which tie the burden on the UI thread closely to user interactions. I discuss the general pattern that pre-rendering implies through a demonstration of a proof-of-concept implementation, and explore the range of applications suited to pre-rendering, from the smallest loop-based web tools (e.g. metronomes) to large-scale DAW projects, where pre-rendering has the added benefit of reducing computational demand during playback.
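    As a small example in the spirit of the talk (not the speaker's code), one bar of a metronome can be pre-rendered with an OfflineAudioContext and then looped in the real-time context, so that no further UI-thread scheduling is needed during playback.

      // Sketch in the spirit of the talk (not the speaker's code): pre-render one
      // bar of a metronome offline, then loop the rendered buffer in real time.
      async function startPreRenderedMetronome(audioCtx, bpm = 120, beats = 4) {
        const secondsPerBeat = 60 / bpm;
        const offline = new OfflineAudioContext(
          1, Math.round(audioCtx.sampleRate * secondsPerBeat * beats), audioCtx.sampleRate);

        for (let i = 0; i < beats; i++) {
          const osc = offline.createOscillator();
          const gain = offline.createGain();
          const t = i * secondsPerBeat;
          osc.frequency.value = i === 0 ? 880 : 440; // accent the downbeat
          gain.gain.setValueAtTime(0.5, t);
          gain.gain.exponentialRampToValueAtTime(0.001, t + 0.05);
          osc.connect(gain).connect(offline.destination);
          osc.start(t);
          osc.stop(t + 0.05);
        }

        const renderedBar = await offline.startRendering();
        const source = audioCtx.createBufferSource();
        source.buffer = renderedBar;
        source.loop = true;
        source.connect(audioCtx.destination);
        source.start();
        return source; // call .stop() on this node to end the metronome
      }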