# Lita Yang

### From Murmann Mixed-Signal Group

BSEE, California Institute of Technology, 2012

**Email**: yanglita AT stanford DOT edu

**Research**: A Design Space Exploration of Implementing Deep Convolutional Neural Networks in Hardware

Machine learning algorithms have yielded state-of-the-art performance in applications such as object recognition, autonomous systems, and data mining. Unfortunately, practical use of these feature extraction and classification algorithms is limited by time-consuming, power-hungry computations and the high dimensionality of the data. To overcome this, we propose implementing them in dedicated hardware, which can be orders of magnitude more efficient in terms of speed, energy, and area.

The current top-performing machine learning and computer vision algorithms are deep Convolutional Neural Networks (CNNs), which achieve a unique combination of high efficiency and broad application scope. Because CNNs are inherently stochastic, we can exploit both the computational efficiency of analog/mixed-signal circuits and the noise tolerance of these networks. Unfortunately, CNNs require billions of parameters and network connections, so researchers have resorted either to small-scale ASICs with little to no potential for scaling or to inefficient FPGA/GPU implementations, because building a custom IC would be too time-consuming. Our goal is to provide a realistic assessment of the hardware design space (digital vs. analog) for the critical components of this algorithm, while keeping in mind how these components will fit into the full-scale system.

We have targeted the weighted sum function, a major computational primitive in feed-forward CNNs, as our primary design objective. Along with designing a highly efficient analog weighted sum, we want to generate digital weighted sum architectures to 1) provide a fair comparison between the two hardware design choices, and 2) create highly efficient computational units for the portions of the algorithm that require higher precision. With this assessment, we plan to develop a mixed-signal framework for interfacing with a full-scale digital system, which will help us better understand how to reduce the energy-intensive memory requirements of the network and optimize wiring for high energy efficiency.
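
For readers unfamiliar with the primitive, the weighted sum described above can be sketched in a few lines of Python. This is only an illustrative software model, not the circuit or architecture under study: the function names, the fixed-point `quantize` helper, and the additive Gaussian noise used as a crude stand-in for analog non-idealities are all assumptions made for this example.

```python
import random


def weighted_sum(inputs, weights):
    """Ideal weighted sum: y = sum_i w_i * x_i."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs))


def quantize(value, bits, full_scale=1.0):
    """Round to a signed fixed-point grid of `bits` bits spanning
    [-full_scale, full_scale). Hypothetical model of limited digital precision."""
    step = 2.0 * full_scale / (2 ** bits)
    code = round(value / step)
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    code = max(lowest, min(highest, code))  # saturate instead of wrapping
    return code * step


def noisy_weighted_sum(inputs, weights, noise_sigma, rng):
    """Weighted sum plus additive Gaussian output noise.
    Assumption: a single noise term stands in for all analog error sources."""
    return weighted_sum(inputs, weights) + rng.gauss(0.0, noise_sigma)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    x = [1.0, 2.0, 3.0]
    w = [0.5, 0.25, 0.125]
    print("ideal:", weighted_sum(x, w))
    print("4-bit weights:", weighted_sum(x, [quantize(v, 4) for v in w]))
    print("noisy:", noisy_weighted_sum(x, w, noise_sigma=0.05, rng=rng))
```

Sweeping `bits` and `noise_sigma` in a model like this is one simple way to get a feel for how much precision a given layer needs before committing to a digital or analog implementation.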