Adaptation of hybrid deep neural network-hidden Markov model speech recognition system using a sub-space approach

Rizwan, Muhammad

Files

RIZWAN-DISSERTATION-2017.pdf (1.13 MB)

Author(s)

Rizwan, Muhammad

Advisor(s)

Anderson, David V.

Associated Organization(s)

School of Electrical and Computer Engineering

College of Engineering

Collections

Theses and Dissertations

Permanent Link

http://hdl.handle.net/1853/60171

Abstract

The performance of an automatic speech recognition (ASR) system can be improved by adapting it to a particular speaker or group of speakers. In ASR, training and testing data often do not follow the same statistics, and this mismatch leads to a gap in performance. Speaker adaptation techniques can reduce the difference between training and testing statistics, but they require adaptation data from the target speaker to optimize system performance. In many cases, only a limited amount of adaptation data is available for the target speaker. This thesis proposes multiple methods for adapting a speech recognition system using a limited amount of data (a few words). The first method classifies the accent of a speaker to identify variability in speaking style. Results indicated that using multiple words from a speaker can be efficient and can provide better accent classification accuracy. Next, an adaptive phoneme classification method is proposed, based on the similarity of the target speaker to speakers in the training data. The last-hidden-layer activations of a deep neural network (DNN) are found to be more useful for identifying the phoneme classes of frames than traditional raw Mel-frequency cepstral coefficients used as features. Finally, speaker adaptation of ASR is presented, in which the speech features are augmented with speaker features. Universal background sparse coding can provide useful speaker information for this speaker adaptation. These methods may open new research opportunities in ASR adaptation.

Date Issued

2017-08-02

Resource Type

Text

Resource Subtype

Dissertation

Georgia Tech Library
