Title:
Developing Robust Models, Algorithms, Databases and Tools With Applications to Cybersecurity and Healthcare

dc.contributor.advisor Chau, Duen Horng
dc.contributor.author Freitas, Scott
dc.contributor.committeeMember Kumar, Srijan
dc.contributor.committeeMember Yang, Diyi
dc.contributor.committeeMember Tong, Hanghang
dc.contributor.committeeMember Prakash, B. Aditya
dc.contributor.department Computational Science and Engineering
dc.date.accessioned 2022-01-14T16:11:25Z
dc.date.available 2022-01-14T16:11:25Z
dc.date.created 2021-12
dc.date.issued 2021-12-13
dc.date.submitted December 2021
dc.date.updated 2022-01-14T16:11:26Z
dc.description.abstract As society and technology becomes increasingly interconnected, so does the threat landscape. Once isolated threats now pose serious concerns to highly interdependent systems, highlighting the fundamental need for robust machine learning. This dissertation contributes novel tools, algorithms, databases, and models—through the lens of robust machine learning—in a research effort to solve large-scale societal problems affecting millions of people in the areas of cybersecurity and healthcare. (1) Tools: We develop TIGER, the first comprehensive graph robustness toolbox; and our ROBUSTNESS SURVEY identifies critical yet missing areas of graph robustness research. (2) Algorithms: Our survey and toolbox reveal existing work has overlooked lateral attacks on computer authentication networks. We develop D2M, the first algorithmic framework to quantify and mitigate network vulnerability to lateral attacks by modeling lateral attack movement from a graph theoretic perspective. (3) Databases: To prevent lateral attacks altogether, we develop MALNET-GRAPH, the world’s largest cybersecurity graph database—containing over 1.2M graphs across 696 classes—and show the first large-scale results demonstrating the effectiveness of malware detection through a graph medium. We extend MALNET-GRAPH by constructing the largest binary-image cybersecurity database—containing 1.2M images, 133×more images than the only other public database—enabling new discoveries in malware detection and classification research restricted to a few industry labs (MALNET-IMAGE). (4) Models: To protect systems from adversarial attacks, we develop UNMASK, the first model that flags semantic incoherence in computer vision systems, which detects up to 96.75% of attacks, and defends the model by correctly classifying up to 93% of attacks. Inspired by UNMASK’s ability to protect computer visions systems from adversarial attack, we develop REST, which creates noise robust models through a novel combination of adversarial training, spectral regularization, and sparsity regularization. In the presence of noise, our method improves state-of-the-art sleep stage scoring by 71%—allowing us to diagnose sleep disorders earlier on and in the home environment—while using 19× less parameters and 15×less MFLOPS. Our work has made significant impact to industry and society: the UNMASK framework laid the foundation for a multi-million dollar DARPA GARD award; the TIGER toolbox for graph robustness analysis is a part of the Nvidia Data Science Teaching Kit, available to educators around the world; we released MALNET, the world’s largest graph classification database with 1.2M graphs; and the D2M framework has had major impact to Microsoft products, inspiring changes to the product’s approach to lateral attack detection.
dc.description.degree Ph.D.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/66145
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject machine learning
dc.subject data mining
dc.subject graph mining
dc.subject graphs
dc.subject adversarial machine learning
dc.subject cybersecurity
dc.subject healthcare
dc.title Developing Robust Models, Algorithms, Databases and Tools With Applications to Cybersecurity and Healthcare
dc.type Text
dc.type.genre Dissertation
dspace.entity.type Publication
local.contributor.advisor Chau, Duen Horng
local.contributor.corporatename College of Computing
local.contributor.corporatename School of Computational Science and Engineering
relation.isAdvisorOfPublication fb5e00ae-9fb7-475d-8eac-50c48a46ea23
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication 01ab2ef1-c6da-49c9-be98-fbd1d840d2b1
thesis.degree.level Doctoral
Files
Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
Name:
FREITAS-DISSERTATION-2021.pdf
Size:
12.39 MB
Format:
Adobe Portable Document Format
Description:
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
LICENSE.txt
Size:
3.87 KB
Format:
Plain Text
Description: