Early-bird training and hierarchical hardware-aware model compression towards green and ubiquitous artificial intelligence

Author(s)
You, Haoran
Associated Organization(s)
Organizational Unit
School of Computer Science
School established in 2007
Abstract
Artificial intelligence (AI) has made remarkable breakthroughs across a range of applications, including image perception, augmented and virtual reality (AR/VR), and AI-generated content (AIGC). However, critical research gaps remain between powerful yet large-scale AI models and the computational constraints of both edge and cloud platforms: (1) the efficiency gap, which hinders the acceleration of AI training and the ability to iterate quickly; (2) the scalability gap, which challenges the efficient scaling and deployment of AI models on cloud GPUs; and (3) the accessibility gap, which limits the feasibility of running even small-scale AI models on resource-constrained edge devices such as smartphones, AR/VR headsets, and IoT sensors. In this thesis, I will introduce three key strategies to enable efficient, scalable, and accessible AI across cloud and edge. First, I will introduce Early-Bird Tickets, a method that identifies efficient subnetworks within a large AI model during the early stages of training, achieving a 5x to 10x improvement in training efficiency. Second, I will present ShiftAddNet, a hardware-aware AI algorithm that replaces costly multiplications with hardware-efficient shift and add operators, improving scalability and efficiency for large-scale vision and language models. Third, I will advocate for a holistic AI system co-design approach that optimizes algorithms and hardware together. As an example, I will showcase ViTCoD, the first Vision Transformer (ViT) algorithm-hardware co-design framework, which leverages the unique characteristics of ViTs to boost system performance. By combining these approaches, this thesis enables efficient, scalable, and accessible AI training and deployment, closing the three gaps towards ubiquitous AI across cloud and edge platforms.
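The idea of replacing multiplications with shift and add operators can be illustrated with a small sketch. This is not the thesis's implementation: it assumes, for illustration only, that each weight is rounded to a signed power of two and that inputs are integers, so each product reduces to a bit shift followed by an accumulation. The function names (`to_power_of_two`, `shift_mul`, `shift_add_dot`) are hypothetical.

```python
import math


def to_power_of_two(w):
    """Round a nonzero weight to sign * 2**exp and return (sign, exp).

    Assumption for this sketch: weights are approximated by the nearest
    power of two in the log domain.
    """
    if w == 0:
        raise ValueError("zero has no power-of-two representation")
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    return sign, exp


def shift_mul(x, sign, exp):
    """Compute x * sign * 2**exp using only a bit shift and a negation.

    A negative exponent becomes a right shift, which floors the result
    for integer inputs.
    """
    shifted = x << exp if exp >= 0 else x >> -exp
    return shifted if sign > 0 else -shifted


def shift_add_dot(xs, weights):
    """Dot product in which every multiplication is a shift, then an add."""
    total = 0
    for x, w in zip(xs, weights):
        sign, exp = to_power_of_two(w)
        total += shift_mul(x, sign, exp)
    return total


if __name__ == "__main__":
    # Weights 4.0, -2.0 and 0.5 are exact powers of two, so this matches
    # the ordinary dot product 3*4 - 5*2 + 8*0.5.
    print(shift_add_dot([3, 5, 8], [4.0, -2.0, 0.5]))
```

Weights that are not exact powers of two are approximated in this sketch (for example, 3.0 becomes 4), which is where an accuracy-versus-efficiency trade-off would arise.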
Date
2025-04-29
Resource Type
Text
Resource Subtype
Dissertation