Data- and communication-centric approaches to model and design flexible deep neural network accelerators

Kwon, Hyouk Jun
Krishna, Tushar
Deep neural network (DNN) accelerators, specialized hardware for DNN inference, enable energy-efficient and low-latency inference. To maximize the efficiency (energy efficiency, latency, and throughput) of a DNN accelerator, designers optimize both the accelerator itself and the mapping of target DNN models onto it. However, designing DNN accelerators for recent DNN models, which contain layer operations of diverse types and sizes, is challenging: optimizing the accelerator and mapping for the average case of the layers in the target DNN workloads often leads to uniformly inefficient design points. Therefore, this thesis proposes flexible-mapping DNN accelerators that can run multiple mappings to adapt to the diverse DNN layers in DNN workloads. The thesis first quantifies the costs and benefits of mappings using a data-centric approach. Based on the observation that no single mapping is ideal for all layers, it then explores two approaches to designing flexible-mapping accelerators: reconfigurability and heterogeneity. Reconfigurable accelerators take a communication-centric approach, implementing a flexible network-on-chip (NoC) that allows the accelerator to be configured at runtime for any mapping style. Heterogeneous accelerators employ multiple sub-accelerators with fixed but diverse mapping styles within a single accelerator chip, providing coarser-grained flexibility at lower area and power cost than reconfigurability. Case studies show that both approaches provide Pareto-optimal design points with different strengths.
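The observation that no single mapping is ideal for all layers can be illustrated with a toy analytical cost model. The sketch below is an assumption-laden simplification, not the thesis's actual cost model: it counts memory accesses for one convolution layer under two classic fixed mapping styles (weight-stationary vs. output-stationary), assuming the stationary tensor is fetched once and all other operands are re-fetched per multiply-accumulate.

```python
# Toy analytical cost model (illustrative sketch, not the thesis's model):
# estimate operand accesses for a conv layer under two fixed mapping styles.
# K: output channels, C: input channels, R/S: filter height/width,
# P/Q: output height/width.

def conv_macs(K, C, R, S, P, Q):
    """Total multiply-accumulate operations in a standard convolution."""
    return K * C * R * S * P * Q

def accesses_weight_stationary(K, C, R, S, P, Q):
    # Assumption: each weight is fetched once and fully reused across all
    # P*Q output positions; inputs and partial sums move on every MAC.
    weights = K * C * R * S
    activations = conv_macs(K, C, R, S, P, Q)
    partial_sums = conv_macs(K, C, R, S, P, Q)
    return weights + activations + partial_sums

def accesses_output_stationary(K, C, R, S, P, Q):
    # Assumption: each output accumulates locally and is written once;
    # weights and inputs are re-fetched for every MAC.
    outputs = K * P * Q
    weights = conv_macs(K, C, R, S, P, Q)
    activations = conv_macs(K, C, R, S, P, Q)
    return outputs + weights + activations

# An early large-filter layer vs. a late 1x1 (pointwise) layer: the two
# shapes favor different mapping styles, so a fixed-mapping accelerator
# is inefficient on one of them.
early = (64, 3, 7, 7, 112, 112)   # few channels, large output map
late = (512, 512, 1, 1, 7, 7)     # many channels, small output map
for name, shape in [("early", early), ("late", late)]:
    ws = accesses_weight_stationary(*shape)
    os_ = accesses_output_stationary(*shape)
    print(name, "-> weight-stationary" if ws < os_ else "-> output-stationary")
# prints: early -> weight-stationary
#         late -> output-stationary
```

Under this model, weight-stationary wins when a weight is reused more often than an output is accumulated (C*R*S < P*Q), and output-stationary wins in the opposite regime, which is the per-layer divergence that motivates flexible-mapping accelerators.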