
Evaluating and Implementing Deep Learning Processor Alternatives for Vision

This blog post was originally published at Vision Systems Design's website. It is reprinted here with the permission of PennWell.

Convolutional neural networks (CNNs) and other deep learning techniques, as I've recently noted, are quickly becoming legitimate options for implementing object recognition and other computer vision capabilities. As CNNs grow in popularity, so does the range of processor options being used to deploy these algorithms. Three talks presented at the recent Embedded Vision Summit showcase several of these processor options, and provide numerous ideas for creating efficient CNN implementations. I'll introduce the three presentation videos here. For those who want a hands-on technical introduction to deep learning for computer vision, see the information about an upcoming live tutorial at the end of this column.

In "Fast Deployment of Low-power Deep Learning on CEVA Vision Processors," Yair Siegel of CEVA discusses how to implement deep learning on specialized DSP architectures. Image recognition capabilities enabled by deep learning are benefitting an increasing number of applications, Siegel notes, including automotive safety, surveillance and drones. This application expansion is driving a shift towards running neural networks not only on servers but also inside embedded devices. Numerous challenges exist, however, in squeezing deep learning into resource-limited devices. Siegel's presentation details how to transition a neural network from the research lab into a production embedded implementation based on a CEVA vision processor core, making use of the company's neural network software framework. Siegel explains how CEVA's framework integrates with existing deep learning development environments like Caffe, as well as how it can be used to create low-power embedded systems with neural network capabilities. Here's a preview:

Next is "Accelerating Deep Learning Using Altera FPGAs," from Bill Jenkins of Intel. While large strides have recently been made in the development of high-performance systems for neural networks based on multi-core processors, Jenkins suggests that significant challenges remain in power, cost, and performance scaling. Field-programmable gate arrays (FPGAs) are a natural choice for implementing neural networks, he believes, because they combine computing, logic, and memory resources in a single device. Intel's Programmable Solutions Group (formerly Altera) has developed a scalable CNN reference design using the OpenCL programming language in combination with the company's OpenCL SDK. Using OpenCL kernels to construct the CNN enables straightforward scaling between smaller and larger devices; designs can be sized using different numbers of kernels at each layer, for example. It also enables smooth design transitions from one FPGA generation to another, in the process benefitting from architectural advancements such as floating-point engines and clock frequency increases. Here's a preview:

Finally, there's "Efficient Convolutional Neural Network Inference on Mobile GPUs," presented by Paul Brasnett of Imagination Technologies. Brasnett notes that graphics processors (GPUs) have already become established as a key tool for the initial training of deep learning algorithms. Deploying trained deep learning algorithms on embedded systems is a key enabler of their commercial success, and his company believes that mobile GPUs are proving to be equally capable processors for such tasks, with the added benefit that they are needed in many designs to handle graphics functions. Brasnett explores key primitives for CNN inference on mobile GPUs, along with various implementation strategies. He works through alternative options and trade-offs, and provides performance analysis using the company's PowerVR graphics architecture as a case study. Here's a preview:

To get a deeper understanding of deep learning techniques for vision, attend the hands-on tutorial, "Deep Learning for Vision Using CNNs and Caffe," on September 22, 2016, in Cambridge, Massachusetts. This full-day tutorial is focused on convolutional neural networks for vision and the Caffe framework for creating, training, and deploying them. Presented by the primary Caffe developers from the U.C. Berkeley Vision and Learning Center, it takes participants from an introduction to the theory behind convolutional neural networks to their actual implementation, and includes hands-on labs using Caffe. Go to the event page for details and to register.

Brian Dipert
Editor-in-Chief, Embedded Vision Alliance