Title:
A Perturbation Approach to Differential Privacy for Deep Learning based Speech Processing
Author(s)
Yang, Chao-Han Huck
Advisor(s)
Lee, Chin-Hui
Abstract
In the pursuit of high-performance speech applications, deep neural networks (DNNs) trained on speech data have become the cornerstone. As new data regulations, such as GDPR and CCPA, demand enhanced privacy protections, it is crucial to defend against security threats arising from query-based malicious attacks. Differential privacy (DP), which provides a mathematical definition for evaluating privacy loss under such attacks, becomes invaluable.
This dissertation develops a perturbation-based machine learning framework that preserves speech model performance while satisfying DP constraints. We devise mechanisms for vector-space distortion by introducing bounded noise into speech training data, and subsequently scrutinize the perturbations through statistical measurements and model estimations.
Utilizing both Laplace and Gaussian noise, we derive theoretical guarantees that bound the privacy budget under a max-divergence measure, using ensemble learning. We exploit the Lipschitz continuity properties of DNNs (e.g., w2v2-RNN-T) to estimate model robustness under varying vector-space perturbations. To substantiate our theoretical underpinnings, we evaluate speech recognition in both isolated speech command and large-vocabulary continuous speech settings.
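The perturbation idea described above can be illustrated with a minimal sketch: clip each feature vector to bound its sensitivity, then add noise calibrated to a privacy budget epsilon. The function name, clipping norm, and delta value below are illustrative assumptions, not the dissertation's actual mechanism or parameterization.

```python
import numpy as np

def perturb_features(x, epsilon, clip_norm=1.0, mechanism="laplace", rng=None):
    """Clip a feature vector to bound sensitivity, then add calibrated noise.

    Illustrative sketch only; the dissertation's mechanisms and calibration
    may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # Clip to an L2 ball so per-example sensitivity is bounded by clip_norm.
    norm = np.linalg.norm(x)
    if norm > clip_norm:
        x = x * (clip_norm / norm)
    if mechanism == "laplace":
        # Laplace mechanism: scale = sensitivity / epsilon (pure eps-DP).
        noise = rng.laplace(0.0, clip_norm / epsilon, size=x.shape)
    else:
        # Gaussian mechanism: sigma calibrated for (eps, delta)-DP;
        # delta = 1e-5 is an assumed default here.
        delta = 1e-5
        sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
        noise = rng.normal(0.0, sigma, size=x.shape)
    return x + noise
```

A smaller epsilon yields larger noise and stronger privacy; the clipping step is what makes the sensitivity, and hence the noise scale, well defined.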
We extend our proposed framework to popular speech and acoustic modeling tasks, probing the balance between model performance and DP budgets. Two training frameworks are introduced to counteract model degradation: (i) teacher-ensemble learning and (ii) sub-sampling-based training. Our findings show that, despite cutting-edge speaker protection techniques, speaker identity information may still be compromised in the absence of DP-based data extraction or training methods. Finally, we integrate decentralized and federated DNN training pipelines into our privacy-preserving speech processing mechanism, examining potential solutions for on-device and cloud-based speech applications catering to end-users.
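The teacher-ensemble direction mentioned above can be sketched in the style of PATE-like noisy vote aggregation: each teacher model labels a query, and the student receives only the argmax of Laplace-noised vote counts. This is a generic illustration under assumed hyperparameters, not the dissertation's exact aggregation rule.

```python
import numpy as np

def noisy_teacher_vote(teacher_predictions, num_classes, epsilon, rng=None):
    """Aggregate teacher labels via Laplace-noised vote counts (PATE-style sketch).

    teacher_predictions: array of integer class labels, one per teacher.
    Assumed illustration; the dissertation's aggregation may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.bincount(teacher_predictions, minlength=num_classes).astype(float)
    # Removing one teacher changes one count by at most 1, so we add
    # Laplace noise with scale 1/epsilon for an illustrative eps budget.
    counts += rng.laplace(0.0, 1.0 / epsilon, size=num_classes)
    return int(np.argmax(counts))
```

Because each teacher is trained on a disjoint data partition, a single training example influences only one vote, which is what keeps the aggregation's sensitivity small.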
Date Issued
2023-04-17
Resource Type
Text
Resource Subtype
Dissertation