Organizational Unit:

School of Music

Permanent Link

https://hdl.handle.net/1853/70774

Parent Organization

Organizational Unit

College of Design

ArchiveSpace Name Record

https://archivesspace.library.gatech.edu/agents/agent_corporate_entity/134/edit

Publication Search Results

Now showing 1 - 2 of 2

Addressing the data challenge in automatic drum transcription with labeled and unlabeled data

(Georgia Institute of Technology, 2018-07-23) Wu, Chih-Wei

Automatic Drum Transcription (ADT) is a sub-task of automatic music transcription that involves the conversion of drum-related audio events into musical notation. While noticeable progress has been made in the past by combining pattern recognition methods with audio signal processing techniques, many systems are still impeded by the lack of a meaningful amount of labeled data to support the data-driven algorithms. To address this data challenge in ADT, this work presents three approaches. First, a dataset for ADT tasks is created using a semi-automatic process that minimizes the workload of human annotators. Second, an ADT system that requires minimal training data is designed to account for the presence of other instruments (e.g., non-percussive or pitched instruments). Third, the possibility of improving generic ADT systems with a large amount of unlabeled data from online resources is explored. The main contributions of this work include the introduction of a new ADT dataset, methods for realizing ADT systems under the constraint of data insufficiency, and a scheme that lets data-driven methods benefit from abundant online resources. The latter might also have an impact on other audio- and music-related tasks traditionally impeded by small amounts of labeled data.

Supervised feature learning via sparse coding for music information retrieval

(Georgia Institute of Technology, 2015-04-24) O'Brien, Cian John

This thesis explores the ideas of feature learning and sparse coding for Music Information Retrieval (MIR). Sparse coding is an algorithm that aims to learn new feature representations from data automatically. In contrast to previous work that uses sparse coding in an MIR context, the concept of supervised sparse coding is also investigated, which makes explicit use of the ground-truth labels during the learning process. Here, sparse coding and supervised coding are applied to two MIR problems: classification of musical genre and recognition of the emotional content of music. A variation of Label Consistent K-SVD is used to add supervision during the dictionary learning process. In the case of Music Genre Recognition (MGR), an additional discriminative term is added to encourage tracks from the same genre to have similar sparse codes. For Music Emotion Recognition (MER), a linear regression term is added to learn an optimal classifier and dictionary pair. The results indicate that while sparse coding performs well for MGR, the additional supervision fails to improve the performance. In the case of MER, supervised coding significantly outperforms both standard sparse coding and commonly used designed features, namely MFCC and pitch chroma.
