Energy-aware DNN Quantization for Processing-In-Memory Architecture
Author(s)
Kang, Beomseok
Abstract
With the increasing computational cost of deep neural networks (DNNs), many efforts to develop energy-efficient intelligent systems have been proposed, ranging from dedicated hardware platforms to model compression algorithms. Recently, hardware-aware quantization algorithms have further improved the energy efficiency of DNNs by considering hardware architectures and algorithms together. In this work, a genetic algorithm-based energy-aware DNN quantization framework for Processing-In-Memory (PIM) architectures, named EGQ, is presented. The key contribution of this research is the design of a fitness function that reduces the number of analog-to-digital converter (ADC) accesses, one of the main energy overheads in PIM. EGQ automatically optimizes layer-wise weight and activation bitwidths with negligible accuracy loss while accounting for the dynamic energy in PIM. The effectiveness of EGQ is demonstrated on several DNN models: VGG-19, ResNet-18, ResNet-50, MobileNet-V2, and SqueezeNet. The area, dynamic energy, and energy efficiency of the compressed models are also analyzed across various memory technologies. EGQ shows 15%-103% higher energy efficiency than other PIM-aware quantization algorithms at a 2% accuracy loss.
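The abstract's core idea — a genetic algorithm searching layer-wise weight/activation bitwidths under a fitness function that trades accuracy against ADC-access energy — can be illustrated with a minimal sketch. This is not the thesis implementation: the `accuracy_proxy` and `adc_energy` functions below are toy stand-ins (assumptions) for a real quantized-model evaluation and a PIM energy model.

```python
import random

LAYERS = 4                 # toy network depth (assumption)
BIT_CHOICES = [2, 4, 6, 8] # candidate bitwidths per layer

def adc_energy(bits):
    # Toy proxy: ADC accesses in a PIM crossbar scale with the
    # weight x activation bitwidth product per layer (assumption).
    return sum(w * a for w, a in bits)

def accuracy_proxy(bits):
    # Toy proxy: accuracy improves with higher bitwidths (assumption);
    # the real framework would evaluate the quantized DNN.
    return sum(w + a for w, a in bits) / (2 * 8 * LAYERS)

def fitness(bits, alpha=0.5):
    # Reward accuracy, penalize dynamic ADC energy.
    return accuracy_proxy(bits) - alpha * adc_energy(bits) / (8 * 8 * LAYERS)

def random_individual():
    # One individual = a (weight_bits, activation_bits) pair per layer.
    return [(random.choice(BIT_CHOICES), random.choice(BIT_CHOICES))
            for _ in range(LAYERS)]

def mutate(bits, p=0.2):
    return [(random.choice(BIT_CHOICES), random.choice(BIT_CHOICES))
            if random.random() < p else (w, a)
            for w, a in bits]

def crossover(a, b):
    cut = random.randrange(1, LAYERS)
    return a[:cut] + b[cut:]

def evolve(generations=30, pop_size=20, seed=0):
    random.seed(seed)
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Raising `alpha` biases the search toward lower-energy (lower-bitwidth) configurations, mirroring how the actual fitness function would weight ADC energy against accuracy loss.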
Date
2022-05-13
Resource Type
Text
Resource Subtype
Thesis