Title:
Efficient pipelined ReRAM-based processing-in-memory architecture for convolutional neural network inference
Authors
Ko, Sho
Advisors
Yu, Shimeng
Abstract
This work presents an analog ReRAM-based PIM (processing-in-memory) architecture for fast and efficient CNN (convolutional neural network) inference. The overall architecture follows a basic hardware hierarchy of nodes, tiles, cores, and subarrays. On top of that, we design intra-layer pipelining, inter-layer pipelining, and batch pipelining to further exploit parallelism in the architecture and increase overall throughput when inferring on a stream of input images. Our simulator also optimizes the performance of the NoC (network-on-chip) using SMART (single-cycle multi-hop asynchronous repeated traversal) flow control. Finally, we experiment with weight replication for different CNN layers and report the throughput, energy efficiency, and speedup of VGG (A-E) on the large-scale ImageNet dataset.
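The interaction between inter-layer pipelining and weight replication described above can be illustrated with a minimal sketch. This is not the thesis's simulator: the function names, the latency model (steady-state throughput bounded by the slowest pipeline stage, with replication dividing a layer's effective latency), and the per-layer cycle counts are all illustrative assumptions.

```python
# Hypothetical sketch of a layer-pipelined PIM throughput model.
# All names and numbers below are illustrative assumptions, not the
# thesis's actual simulator or measured values.

def stage_latency(cycles, replication):
    """Effective per-image latency of one CNN layer whose weights are
    replicated `replication` times across crossbar subarrays.
    Replicas process different sliding windows in parallel, so the
    effective latency scales down by roughly the replication factor."""
    return cycles / replication

def pipeline_throughput(layer_cycles, replications):
    """Steady-state throughput (images per cycle) of an inter-layer
    pipeline: bounded by the slowest (bottleneck) stage."""
    bottleneck = max(stage_latency(c, r)
                     for c, r in zip(layer_cycles, replications))
    return 1.0 / bottleneck

# Made-up per-layer cycle counts for a small three-layer CNN.
layers = [4000, 2000, 1000]

# Without replication, the first (slowest) layer bottlenecks the pipeline.
base = pipeline_throughput(layers, [1, 1, 1])

# Replicating the slower layers' weights balances the stage latencies.
balanced = pipeline_throughput(layers, [4, 2, 1])

print(base, balanced)  # the balanced pipeline is 4x faster here
```

Under this toy model, balancing stage latencies via replication raises throughput until another resource (crossbar area, NoC bandwidth) becomes the limit, which is why the replication factor is tuned per layer.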
Date Issued
2020-04-07
Resource Type
Text
Resource Subtype
Thesis