Hardware Dynamical System for Solving Optimization Problems

Author(s)
Chang, Muya
Editor(s)
Associated Organization(s)
Series
Supplementary to:
Abstract
Optimization problems form the basis of a wide range of computationally challenging tasks in signal processing, machine learning, resource planning, and other fields. Among these, convex optimization, and least-squares optimization in particular, accounts for the vast majority, and recent advances in iterative algorithms for solving such problems at large dimensions have gained traction. Multi-core designs with systolic or semi-systolic architectures can be a key enabler for implementing discrete dynamical systems and realizing massively scalable architectures that run such optimization algorithms. In the first part of the thesis, we propose a platform architecture implemented in programmable FPGA hardware to solve a template problem in distributed optimization, namely signal reconstruction from non-uniform sampling. This is a quintessential problem with widespread applications in signal processing, computational imaging, and related areas. We expect this architectural exploration to open up promising opportunities for solving distributed optimizations, which are becoming increasingly important in real-world applications. The complete system design, its mapping and optimization onto an FPGA architecture, and an analysis of convergence and scalability are presented. The limitations of the FPGA platform motivated the move to an ASIC design in the next project. In the second part of the thesis, we present OPTIMO, a 65 nm, 16-bit, fully programmable spatial-array processor with 49 cores and a hierarchical multicast network for solving distributed optimizations via the alternating direction method of multipliers (ADMM). ADMM is a projection-based method for solving generic constrained optimization problems. In essence, it decomposes the decision vector into subvectors, updates them sequentially by minimizing an augmented Lagrangian function, and then updates the Lagrange multiplier.
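The ADMM iteration described above can be sketched in a few lines. The following is a minimal illustration, not the OPTIMO implementation: it assumes a toy problem chosen here for clarity (minimize 0.5*||x - a||^2 + lam*||z||_1 subject to x = z), uses the scaled form of the multiplier, and the names `admm_l1_denoise`, `soft_threshold`, `rho`, and `iters` are hypothetical.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |.|: shrink v toward zero by kappa."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1_denoise(a, lam, rho=1.0, iters=200):
    """Two-block ADMM for: min 0.5*||x - a||^2 + lam*||z||_1  s.t.  x = z.

    Toy problem assumed for illustration only. The decision vector is
    split into subvectors x and z; u is the scaled Lagrange multiplier.
    """
    n = len(a)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n
    for _ in range(iters):
        # x-update: minimize the augmented Lagrangian over x (closed form).
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: minimize the augmented Lagrangian over z (soft threshold).
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Multiplier update: accumulate the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


if __name__ == "__main__":
    print(admm_l1_denoise([3.0, -0.5, 1.2], lam=1.0))
```

For this toy problem the minimizer is known in closed form (soft-thresholding `a` by `lam`), which makes it a convenient check that the three update steps are wired correctly. Each coordinate is updated independently, which hints at why this family of methods maps naturally onto an array of cores.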
The ADMM algorithm has typically been used for solving problems in which the decision variable is decomposed into two or more subvectors. We demonstrate six template algorithms and their applications, and we measure a peak energy efficiency of 279 GOPS/W. In the last part, we turn to another side of optimization, combinatorial optimization, and present AC-SAT, an analog circuit design in conventional CMOS technology for solving a representative NP-complete problem, the Boolean satisfiability (SAT) problem. AC-SAT is based on a deterministic continuous-time dynamical system (CTDS) and finds SAT solutions in polynomial analog time, at the expense of auxiliary variables that grow exponentially when needed. The overall design is programmable and modular, and it has been validated in multiple stages: high-level simulation on general-purpose CPUs, low-level simulation with the Simulation Program with Integrated Circuit Emphasis (SPICE), and measurement on the fabricated chip. The system can solve problems with up to 50 variables and 212 clauses. From the measurement results, we demonstrate the relationship between optimization hardness and transient chaos, and we show that the architecture is highly scalable and configurable.
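To make the CTDS idea concrete, here is a minimal software sketch. The abstract does not state the equations, so this assumes the commonly cited formulation: spin variables s_i in [-1, 1], clause terms K_m = 2^-k * prod_j (1 - c_mj * s_j), gradient descent on V = sum_m a_m * K_m^2, and auxiliary variables obeying da_m/dt = a_m * K_m. Forward-Euler integration, clipping to [-1, 1], and the names `ctds_sat`, `clause_terms`, `partial_product`, `dt`, and `max_steps` are assumptions of this sketch, not details of the AC-SAT chip.

```python
def satisfies(clauses, assignment):
    """True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def partial_product(clause, s, skip=None):
    """2^-k * prod_j (1 - c_j * s_j), optionally skipping variable `skip`."""
    prod = 1.0
    for lit in clause:
        if abs(lit) == skip:
            continue
        c = 1.0 if lit > 0 else -1.0
        prod *= 1.0 - c * s[abs(lit) - 1]
    return prod / 2 ** len(clause)


def clause_terms(clauses, s):
    """K_m for each clause; K_m is 0 exactly when clause m is satisfied at s = +/-1."""
    return [partial_product(clause, s) for clause in clauses]


def ctds_sat(clauses, n_vars, s0, dt=0.05, max_steps=20000):
    """Forward-Euler integration of the assumed CTDS; literals are DIMACS-style ints."""
    s = list(s0)
    a = [1.0] * len(clauses)  # auxiliary variables, one per clause
    for _ in range(max_steps):
        assignment = [x > 0 for x in s]
        if satisfies(clauses, assignment):
            return assignment
        K = clause_terms(clauses, s)
        ds = [0.0] * n_vars
        for m, clause in enumerate(clauses):
            for lit in clause:
                c = 1.0 if lit > 0 else -1.0
                k_mi = partial_product(clause, s, skip=abs(lit))
                # ds_i/dt = sum_m 2 * a_m * c_mi * K_mi * K_m
                ds[abs(lit) - 1] += 2.0 * a[m] * c * k_mi * K[m]
        # Clip to [-1, 1] because Euler steps can overshoot the hypercube.
        s = [max(-1.0, min(1.0, s[i] + dt * ds[i])) for i in range(n_vars)]
        # da_m/dt = a_m * K_m: unsatisfied clauses gain weight over time.
        a = [a[m] * (1.0 + dt * K[m]) for m in range(len(clauses))]
    return None


if __name__ == "__main__":
    formula = [(1, 2), (-1, 3), (-2, -3)]
    print(ctds_sat(formula, 3, [-0.5, -0.5, -0.5]))
```

The growing `a` values are the auxiliary variables mentioned above: they reweight clauses that stay unsatisfied so that the trajectory is pushed away from states that are not solutions. A fixed-step Euler loop is only a rough stand-in for continuous-time analog dynamics and may need a smaller `dt` on harder instances.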
Sponsor
Date
2020-12-06
Extent
Resource Type
Text
Resource Subtype
Dissertation
Rights Statement
Rights URI