From Murmann Mixed-Signal Group
BSEE, Stanford University, 2010
MSEE, Stanford University, 2011
Admitted to Ph.D. Candidacy: 2011-2012
Smart CMOS Image Sensor for Mobile Computer Vision
Four decades of Moore's Law have increased computing power to the point that the bottleneck is often no longer the technology itself but how users interact with it. As mobile devices incorporate ever-increasing numbers of sensors, they become better able to address this bottleneck by providing a richer human-computer interaction (HCI) platform. But while recent advances in computer vision make it a promising candidate for seamless sensing on mobile phones and heads-up displays, existing mobile vision implementations consume too much power to be practical. The aim of this project is to explore the sources of power inefficiency in current mobile vision implementations in order to design a low-power smart CMOS image sensor for object detection.
A typical computer vision pipeline consists of a digital camera, made up of an image sensor and an image signal processor, followed by a digital processor. The digital camera captures raw images, processes them to improve their visual quality, and compresses them to JPEG format. The digital processor then extracts features from the captured JPEG images and feeds them into a trained classifier. Since low-level feature extraction can make up a significant fraction of the total computation, a mixed-signal hardware implementation of this step, placed before the digital processor, could help reduce power consumption. However, such an approach is valid only if it does not greatly reduce the performance of the overall pipeline. While low-power "Vision Sensors" remain an active area of research, they often achieve power savings by capturing images of lower quality than typical digital camera JPEGs, and their performance in the context of a computer vision pipeline is generally unreported. To address this gap, we have developed software to simulate the object detection performance of arbitrary hardware implementations of image sensors with mixed-signal preprocessing. We are currently using this simulator to design a custom readout circuit for a low-power smart CMOS image sensor.
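To make the trade-off concrete, the following is a minimal sketch of the kind of experiment such a simulator enables. It is not the project's actual software. It models a reduced-precision readout by requantizing 12-bit raw pixel codes to fewer bits (with optional Gaussian noise), then computes a simple horizontal-gradient feature on the result. The function names `simulate_readout` and `horizontal_gradient`, the noise model, and the toy image are all assumptions made for illustration.

```python
import random


def simulate_readout(raw, in_bits=12, out_bits=6, noise_sigma=0.0, seed=0):
    """Hypothetical readout model: add Gaussian noise, clip, and requantize.

    raw is a list of rows of integer pixel codes in [0, 2**in_bits - 1].
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    full_scale = (1 << in_bits) - 1
    levels = (1 << out_bits) - 1
    out = []
    for row in raw:
        new_row = []
        for code in row:
            noisy = code + rng.gauss(0.0, noise_sigma)
            noisy = min(max(noisy, 0.0), full_scale)  # clip to the input range
            new_row.append(round(noisy / full_scale * levels))
        out.append(new_row)
    return out


def horizontal_gradient(img):
    """Illustrative low-level feature: difference between adjacent pixels."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in img]


# Toy 2x4 "RAW" image with 12-bit codes (made up for this example).
raw = [[0, 1024, 2048, 4095],
       [4095, 2048, 1024, 0]]

coarse = simulate_readout(raw, out_bits=4)  # noiseless 4-bit readout
features = horizontal_gradient(coarse)
print(coarse)
print(features)
```

In a full evaluation, features computed from the degraded images would be fed to the trained classifier, and detection accuracy would be compared against the full-precision baseline for each candidate hardware configuration.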
To date, we have created a database of over 4,000 annotated RAW images, modeled after the PASCAL VOC database. Unlike processed JPEG images, RAW images are composed of high-resolution (12-bit) raw photosensor outputs, which approximate analog scene illumination levels. We have also developed software tools for custom database creation. We plan to open-source this database and the corresponding software tools later this year (2015).
We would like to acknowledge undergraduate researcher David Ta for his contributions to this project.
Email: alexoz AT stanford DOT edu