Title:
Energy-aware DNN Quantization for Processing-In-Memory Architecture

Authors
Kang, Beomseok
Advisors
Mukhopadhyay, Saibal
Yu, Shimeng
Krishna, Tushar
Abstract
With the increasing computational cost of deep neural networks (DNNs), many efforts to develop energy-efficient intelligent systems have been proposed, from dedicated hardware platforms to model compression algorithms. Recently, hardware-aware quantization algorithms have further improved the energy efficiency of DNNs by considering hardware architectures and algorithms together. In this work, a genetic algorithm-based energy-aware DNN quantization framework for Processing-In-Memory (PIM) architectures, named EGQ, is presented. The key contribution of the research is the design of a fitness function that reduces the number of analog-to-digital converter (ADC) accesses, one of the main energy overheads in PIM. EGQ automatically optimizes layer-wise weight and activation bitwidths with negligible accuracy loss while accounting for the dynamic energy in PIM. The research demonstrates the effectiveness of EGQ on several DNN models: VGG-19, ResNet-18, ResNet-50, MobileNet-V2, and SqueezeNet. The area, dynamic energy, and energy efficiency of the compressed models are also analyzed across various memory technologies. EGQ achieves 15%-103% higher energy efficiency than other PIM-aware quantization algorithms within 2% accuracy loss.
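The core idea in the abstract, a genetic algorithm searching layer-wise bitwidths under a fitness function that penalizes ADC accesses, can be sketched minimally. Everything below is an illustrative assumption: the per-layer MAC counts, the linear ADC-access proxy, and the accuracy placeholder are not the thesis's actual energy or accuracy models.

```python
import random

# Hypothetical per-layer MAC counts for a small 4-layer model (assumption).
LAYER_MACS = [1e6, 4e6, 4e6, 2e6]
BIT_CHOICES = [2, 4, 6, 8]  # candidate bitwidths per layer

def adc_access_proxy(bitwidths):
    # In bit-serial PIM, ADC accesses grow with the number of bit-serial
    # cycles per MAC; this linear proxy is an assumption, not EGQ's model.
    return sum(b * macs for b, macs in zip(bitwidths, LAYER_MACS))

def accuracy_proxy(bitwidths):
    # Placeholder: penalize aggressively low bitwidths. A real framework
    # would evaluate the quantized network on a validation set.
    return 1.0 - sum((8 - b) ** 2 for b in bitwidths) / (36 * len(bitwidths))

def fitness(bitwidths, lam=0.5):
    # Trade off (proxy) accuracy against normalized ADC-access cost.
    max_cost = adc_access_proxy([max(BIT_CHOICES)] * len(bitwidths))
    return accuracy_proxy(bitwidths) - lam * adc_access_proxy(bitwidths) / max_cost

def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(BIT_CHOICES) for _ in LAYER_MACS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(LAYER_MACS))
            child = a[:cut] + b[cut:]             # one-point crossover
            if rng.random() < 0.2:                # random mutation
                child[rng.randrange(len(child))] = rng.choice(BIT_CHOICES)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print("best layer-wise bitwidths:", best)
```

Because the elitist selection keeps the best individuals every generation, the top fitness is non-decreasing, so the search converges toward bitwidth assignments that keep the accuracy proxy high while cutting the ADC-access cost.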
Date Issued
2022-05-13
Resource Type
Text
Resource Subtype
Thesis