Towards Performance-Aware Allocation for Accelerated Machine Learning on GPU-SSD Systems
Author(s)
Gundawar, Ayush
Abstract
The exponential growth of data-intensive machine learning workloads has exposed significant limitations in conventional GPU-accelerated systems, especially when processing datasets exceeding GPU DRAM capacity. We propose MQMS, an augmented in-storage GPU architecture and simulator that is aware of internal SSD states and operations, enabling intelligent scheduling and address allocation to overcome performance bottlenecks caused by CPU-mediated data access patterns. MQMS introduces dynamic address allocation to maximize internal parallelism and fine-grained address mapping to efficiently handle small I/O requests without incurring read-modify-write overheads. Through extensive evaluations on workloads ranging from large language model inference to classical machine learning algorithms, MQMS demonstrates orders-of-magnitude improvements in I/O request throughput, device response time, and simulation end time compared to existing simulators.
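The dynamic address allocation mentioned above can be illustrated with a minimal sketch: striping consecutive writes round-robin across SSD channels and dies so they can be serviced in parallel. All names and parameters here are illustrative assumptions, not the actual MQMS allocator, which is described in the thesis itself.

```python
# Hypothetical sketch of dynamic address allocation: incoming writes are
# striped round-robin across channels and dies so that consecutive
# requests land on distinct channels and can proceed in parallel.
from itertools import count

NUM_CHANNELS = 8        # illustrative geometry, not MQMS's actual config
DIES_PER_CHANNEL = 4

def make_allocator():
    """Return a function mapping each new write to a (channel, die) pair."""
    counter = count()
    def allocate():
        n = next(counter)
        channel = n % NUM_CHANNELS
        die = (n // NUM_CHANNELS) % DIES_PER_CHANNEL
        return channel, die
    return allocate

allocate = make_allocator()
placements = [allocate() for _ in range(16)]
# The first NUM_CHANNELS writes each occupy a different channel.
assert len({ch for ch, _ in placements[:NUM_CHANNELS]}) == NUM_CHANNELS
```

A static allocator would instead fill one die sequentially, serializing requests on a single channel; spreading allocations in this way is what lets the device exploit its internal parallelism.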
Resource Type
Text
Resource Subtype
Undergraduate Research Option Thesis