Title:
Improving the Understanding of Malware using Machine Learning
Author(s)
Downing, Evan
Advisor(s)
Lee, Wenke
Abstract
When a security organization receives a sample (a binary, a script, etc.) from a customer, its goal is to determine whether the sample is malicious or benign. Because samples can arrive in large volumes, automated triage and analysis are required to keep up. Broadly speaking, these automated solutions are composed of statistical models and heuristic rulesets, which rely on distinctive attributes of malicious samples observed in the past. In response, attackers evolve their samples over time to evade analysis and detection. To evade static analysis, malware binaries can obfuscate themselves by removing system calls and strings from plain view. This prevents reverse engineers from statically identifying the binary functions of interest that they would want to trigger during dynamic analysis. To evade detection based on dynamic analysis, malware can randomize its artifacts (such as filenames, process names, etc.), which makes it difficult to automatically mine behaviors that generalize to future variations.
To address these challenges, this thesis proposes a framework to identify malicious functions in static malware binaries for analysis, and behavior combinations in dynamic analysis reports for detection. The framework takes as input the sample binaries submitted to the organization for analysis. First, DeepReflect statically localizes malicious functions within the unpacked malware binaries, allowing analysts to target specific regions for further dynamic analysis. DeepReflect increases the Area Under the Curve (AUC) for malicious function detection by 6-10% compared to four state-of-the-art approaches on a dataset of 36k unique, unpacked malware binaries. After the samples are executed in a controlled sandbox, BCRAFTY uses each sample's dynamic analysis report to extract and generalize behavior combinations that detect similar malware samples in the future. Compared to using analyst-defined behaviors alone, BCRAFTY increases the malware detection True Positive Rate (TPR) by 7.5% while keeping the False Positive Rate (FPR) near 0.3%.
Date Issued
2023-12-06
Resource Type
Text
Resource Subtype
Dissertation