Title:
Efficient pipelined ReRAM-based processing-in-memory architecture for convolutional neural network inference

Author(s)
Ko, Sho
Advisor(s)
Yu, Shimeng
Abstract
This work presents an analog ReRAM-based PIM (processing-in-memory) architecture for fast and efficient CNN (convolutional neural network) inference. The architecture is organized hierarchically into nodes, tiles, cores, and subarrays. On top of this hierarchy, we design intra-layer pipelining, inter-layer pipelining, and batch pipelining to further exploit parallelism and increase overall throughput when inferring a stream of input images. Our simulator also optimizes NoC (network-on-chip) performance using SMART (single-cycle multi-hop asynchronous repeated traversal) flow control. Finally, we experiment with weight replication for different CNN layers and report the throughput, energy efficiency, and speedup of VGG (A-E) on the large-scale ImageNet dataset.
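The abstract's weight-replication idea can be illustrated with a toy throughput model (not from the thesis itself; the latency numbers and function below are hypothetical): in a layer-pipelined PIM accelerator, steady-state throughput is limited by the slowest pipeline stage, and replicating a layer's weights across r crossbar copies lets that stage process r inputs in parallel, dividing its effective latency by r.

```python
def pipeline_throughput(latencies, replication):
    """Steady-state throughput (inputs per time unit) of a layer pipeline.

    Each stage's latency is divided by its weight-replication factor;
    throughput is the reciprocal of the slowest effective stage.
    """
    effective = [t / r for t, r in zip(latencies, replication)]
    return 1.0 / max(effective)

# Hypothetical per-layer latencies (arbitrary time units) for three layers.
lat = [8.0, 4.0, 2.0]

base = pipeline_throughput(lat, [1, 1, 1])      # bottleneck stage: 8 units
balanced = pipeline_throughput(lat, [4, 2, 1])  # every stage now ~2 units
print(base, balanced)  # replication balances stages for a 4x speedup here
```

This is why replication factors are chosen per layer: early convolutional layers, which see many more activations per image, get more copies so that all stages finish in roughly equal time.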
Date Issued
2020-04-07
Resource Type
Text
Resource Subtype
Thesis