Title:
UnMask: Adversarial Detection and Defense in Deep Learning Through Building-Block Knowledge Extraction
Author(s)
Freitas, Scott
Chen, Shang-Tse
Chau, Duen Horng
Abstract
Deep learning models are being integrated into a wide range of
high-impact, security-critical systems, from self-driving cars to biomedical
diagnosis. However, recent research has demonstrated that
many of these deep learning architectures are highly vulnerable
to adversarial attacks—highlighting the vital need for defensive
techniques to detect and mitigate these attacks before they occur.
To combat these adversarial attacks, we developed UnMask,
a knowledge-based adversarial detection and defense framework.
The core idea behind UnMask is to protect these models by verifying
that an image’s predicted class (e.g., “bird”) contains the expected
building blocks (e.g., beak, wings, eyes). For example, if an image is
classified as “bird” but the extracted building blocks are wheel, seat,
and frame, the model may be under attack. UnMask detects such
attacks and defends the model by rectifying the misclassification,
re-classifying the image based on its extracted building blocks. Our
extensive evaluation shows that UnMask (1) detects up to 92.9%
of attacks, with a false positive rate of 9.67%, and (2) defends the
model by correctly classifying up to 92.24% of adversarial images
produced by the current strongest attack, Projected Gradient Descent,
in the gray-box setting. Our proposed method is architecture
agnostic and fast. To enable reproducibility of our research, we
have anonymously open-sourced our code and our large, newly curated
dataset (~5GB) on GitHub (https://github.com/unmaskd/UnMask).
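
For illustration only, the mismatch check described in the abstract could look roughly like the Python sketch below. This is a toy example under stated assumptions, not the framework's actual implementation: the class-to-building-block map, the Jaccard similarity measure, and the 0.5 threshold are hypothetical, and the step that extracts building blocks from an image is omitted.

# Toy sketch of the building-block check described above.
# Assumptions: hypothetical class-to-feature map, Jaccard similarity,
# and a 0.5 threshold; feature extraction itself is not shown.

# Hypothetical map from each class to its expected building blocks.
EXPECTED_FEATURES = {
    "bird":    {"beak", "wings", "eyes", "tail"},
    "bicycle": {"wheel", "seat", "frame", "handlebar"},
}

def jaccard(a, b):
    # Set similarity: 0.0 = disjoint, 1.0 = identical.
    return len(a & b) / len(a | b) if (a | b) else 0.0

def unmask_check(predicted_class, extracted_features, threshold=0.5):
    # Flag a likely attack when the extracted building blocks do not
    # match the predicted class; re-classify from the building blocks.
    extracted = set(extracted_features)
    suspected_attack = jaccard(EXPECTED_FEATURES[predicted_class], extracted) < threshold
    best_class = max(EXPECTED_FEATURES,
                     key=lambda c: jaccard(EXPECTED_FEATURES[c], extracted))
    return suspected_attack, best_class

# Example: an image predicted as "bird" whose extracted parts look like a bicycle.
print(unmask_check("bird", {"wheel", "seat", "frame"}))  # -> (True, 'bicycle')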
Date Issued
2019
Resource Type
Text
Resource Subtype
Technical Report