Instrument Timbres and Pitch Estimation in Polyphonic Music

Loeffler, Dominik B.
Lee, Chin-Hui
In the past decade, the availability of digitally encoded, downloadable music has increased dramatically, driven mainly by the release of the now-famous MP3 compression format (Fraunhofer-Gesellschaft, 1994). Online sales of music in the US tripled in 2005, according to a recent news article (*), while the number of files exchanged on P2P platforms is much higher but hard to estimate. The existing and coming flood of digital music creates a need for sophisticated content-based information retrieval. Query-by-Humming is a prototypical technique aimed at locating pieces of music by melody; automatic annotation algorithms seek to enable finer search criteria, such as instruments, genre, or meter. Score transcription systems strive for an abstract, compressed form of a piece of music that composers and musicians can understand. Much research remains to be done to achieve these goals.

This thesis connects essential knowledge about music and human auditory perception with signal processing algorithms to solve the specific problem of pitch estimation. The designed algorithm obtains an estimate of the magnitude spectrum via the short-time Fourier transform (STFT) and models the harmonic structure of each pitch contained in the magnitude spectrum with Gaussian density mixtures, whose parameters are subsequently estimated via an Expectation-Maximization (EM) algorithm. Heuristics for EM initialization are formulated mathematically. The system is implemented in MATLAB and features a GUI that allows visual (spectrogram) and numerical (console) verification of results. The algorithm is tested on recordings ranging from a single instrument to three superposed instruments. Its advantages and limitations are discussed, and a brief outlook on potential future research is given.

(*) "Online and Wireless Music Sales Tripled in 2005"; Associated Press; January 19, 2006
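
The pipeline described in the abstract (magnitude spectrum, harmonic Gaussian mixture, EM) can be illustrated with a minimal sketch. This is not the thesis implementation, which is written in MATLAB and handles several simultaneous pitches. The sketch below assumes a single pitch, a single Hann-windowed DFT frame standing in for one STFT column, a fixed Gaussian width `sigma`, and a hand-picked initial guess `f0_init` in place of the thesis's initialization heuristics. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def magnitude_frame(x, sr, f_max):
    """Hann-windowed DFT magnitudes of one frame (a single STFT column)."""
    n = len(x)
    xw = [v * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i, v in enumerate(x)]
    freqs, mags = [], []
    for k in range(1, int(f_max * n / sr) + 1):
        step = -2j * math.pi * k / n
        freqs.append(k * sr / n)
        mags.append(abs(sum(v * cmath.exp(step * i) for i, v in enumerate(xw))))
    return freqs, mags


def em_harmonic_f0(freqs, mags, f0_init, n_harm=4, sigma=20.0, n_iter=20):
    """EM for one pitch: Gaussian h is centred at h * f0, with shared fixed sigma.

    The magnitude spectrum is treated as a weighted histogram over frequency.
    """
    f0 = f0_init
    weights = [1.0 / n_harm] * n_harm
    for _ in range(n_iter):
        num = den = 0.0
        mass = [0.0] * n_harm
        for f, m in zip(freqs, mags):
            # E-step: responsibility of each harmonic for this frequency bin
            lik = [weights[h] * math.exp(-0.5 * ((f - (h + 1) * f0) / sigma) ** 2)
                   for h in range(n_harm)]
            z = sum(lik)
            if z == 0.0:  # bin too far from every harmonic (underflow); skip it
                continue
            for h in range(n_harm):
                r = m * lik[h] / z
                mass[h] += r
                num += r * (h + 1) * f
                den += r * (h + 1) ** 2
        if den == 0.0:
            break
        # M-step: weighted least-squares update of f0, then mixture weights
        f0 = num / den
        total = sum(mass)
        weights = [mh / total for mh in mass]
    return f0, weights


def demo():
    """Synthesize a 220 Hz tone with four harmonics and estimate its pitch."""
    sr, n, true_f0 = 8000, 1024, 220.0
    amps = [1.0, 0.6, 0.4, 0.25]  # illustrative harmonic amplitudes
    signal = [sum(a * math.sin(2 * math.pi * (h + 1) * true_f0 * t / sr)
                  for h, a in enumerate(amps)) for t in range(n)]
    freqs, mags = magnitude_frame(signal, sr, f_max=1100.0)
    return em_harmonic_f0(freqs, mags, f0_init=200.0)


f0_est, harmonic_weights = demo()
print(f"estimated F0: {f0_est:.1f} Hz")
```

The estimated mixture weights give a rough picture of the relative harmonic strengths, which is one way timbre enters such a model. Extending the sketch to polyphonic input would mean fitting one harmonic mixture per candidate pitch, as the abstract describes; that step is not shown here.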
839430 bytes