Title:
ATTENTION-BASED CONVOLUTIONAL NEURAL NETWORK MODEL AND ITS COMBINATION WITH FEW-SHOT LEARNING FOR AUDIO CLASSIFICATION
Author(s)
Wang, You
Advisor(s)
Anderson, David V.
Abstract
Environmental sound and acoustic scene classification are crucial tasks in audio signal
processing and audio pattern recognition. In recent years, deep learning methods such as
convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their
combinations have achieved great success in these tasks. However, numerous challenges
remain in this domain. For example, in most cases the sound events of interest are present
through only a portion of the audio clip, and the clip may also suffer from background
noise. Furthermore, in many application scenarios the amount of labelled training data
is very limited; few-shot learning methods, especially prototypical networks, have
achieved great success in such settings. However, metric learning methods such as
prototypical networks often suffer from poor feature embeddings of support samples or
from outliers, and may not perform well on noisy data. Therefore, the proposed work
seeks to overcome these limitations by introducing a multi-channel temporal
attention-based CNN model and then incorporating a hybrid attention module into the
framework of prototypical networks. Additionally, a Π-model is integrated into our
model to improve performance on noisy data, and a new time-frequency feature is
explored. Various experiments show that our proposed framework is capable of addressing
the above-mentioned issues and provides promising results.
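As background for the outlier sensitivity mentioned in the abstract, the following is a minimal sketch (not the dissertation's implementation) of the prototypical-network classification step: each class prototype is the mean of that class's support embeddings, so a single poorly embedded support sample or outlier shifts the prototype and can misclassify queries. Function names and the use of NumPy here are illustrative assumptions.

```python
import numpy as np

def compute_prototypes(support_embeddings, support_labels, n_classes):
    # Class prototype = mean of that class's support embeddings.
    # An outlier in the support set shifts this mean, which is why
    # prototypical networks are sensitive to poor support embeddings.
    return np.stack([
        support_embeddings[support_labels == c].mean(axis=0)
        for c in range(n_classes)
    ])

def classify_queries(query_embeddings, prototypes):
    # Assign each query to the nearest prototype by squared
    # Euclidean distance in the embedding space.
    dists = ((query_embeddings[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)
```

In the full framework, the embeddings would come from a trained CNN backbone; an attention module over the support set can down-weight outlier samples before the mean is taken.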
Date Issued
2022-07-30
Resource Type
Text
Resource Subtype
Dissertation