October 2013 Embedded Vision Summit East Presentations

Engineers: Learn to Design Machines That See!
October 2, 2013
Westford Regency Inn and Conference Center
219 Littleton Road
Westford, MA 01886

Below are abstracts for some of the presentations at the now-concluded October 2013 Embedded Vision Summit East.

"Targeting Computer Vision Algorithms to Embedded Hardware" by Mario Bergeron, Avnet
This session explores how computer vision algorithms can be targeted to embedded hardware, including FPGAs. Topics include efficient movement of high pixel-rate video data, the use of hardware accelerators, and use cases showing what is currently possible with existing technology for high-performance image processing.

"What Can You Do With Embedded Vision?" by Jeff Bier, Embedded Vision Alliance
For decades, computer vision was a niche technology, because computer vision equipment was big, expensive, and complex to use. Recently, however, products like the Microsoft Kinect and vision-based automotive safety systems have demonstrated that computer vision can now be deployed even in cost-sensitive applications, and in ways that are easy for non-specialists to use. We use the term "embedded vision" to refer to the incorporation of visual intelligence into a wide range of systems, creating "machines that see and understand." This presentation is intended for those new to embedded vision, and those seeking ideas for new embedded vision applications and technologies. Jeff Bier, Founder of the Embedded Vision Alliance and President of BDTI, will present some of the most imaginative and compelling new products incorporating embedded vision, and will highlight how advances in enabling technologies, such as processors and sensors, are fueling accelerated innovation in embedded vision products.

"Porting Applications to High-Performance Imaging DSPs" by Gary Brown, Cadence (Tensilica)
We discuss the challenges of porting and tuning applications to a high-performance imaging digital signal processor. An imaging-specific processor can make a huge difference in the performance of computationally intensive algorithms, but you can't expect simply to port those algorithms quickly; you also need to tune them. We will highlight the example of a bilateral filter, where an initial 123X speedup was increased to 300X after further tuning of the code. We will also discuss software development environments.
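
As a rough illustration of the kind of per-pixel work involved, here is a minimal bilateral-filter call using OpenCV; it is not the Tensilica-optimized kernel from the talk, and the file name and filter parameters are assumptions.

```python
# Minimal sketch: applying a bilateral filter with OpenCV (illustration only).
import cv2

img = cv2.imread("frame.png")  # hypothetical input image

# Each output pixel is a weighted average over a neighborhood, where the weights
# depend on both spatial distance (sigmaSpace) and intensity difference
# (sigmaColor). This double dependency is what makes the filter expensive and
# worth hand-tuning on an imaging DSP.
smoothed = cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)
cv2.imwrite("smoothed.png", smoothed)
```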

"Embedded Lucas-Kanade Tracking: How It Works, How to Implement It, and How to Use It" by Goksel Dedeoglu, Texas Instruments
This tutorial is intended for technical audiences interested in learning about the Lucas-Kanade (LK) tracker, also known as the Kanade-Lucas-Tomasi (KLT) tracker. Invented in the early 1980s, this method has been widely used to estimate pixel motion between two consecutive frames. We present how the LK tracker works, discuss its advantages and limitations, and explain how to make it more robust and useful. Using DSP-optimized functions from TI's Vision Library (VLIB), we will also show how to detect feature points in real time and track them from one frame to the next using the LK algorithm.
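
For readers who want a concrete starting point, the sketch below shows the same detect-then-track pattern using OpenCV rather than TI's VLIB; the image file names and parameter values are assumptions.

```python
# Detect corner features in one frame and track them into the next with
# pyramidal Lucas-Kanade (OpenCV, not VLIB).
import cv2

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# 1. Detect good feature points (corners) in the first frame.
points = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01,
                                 minDistance=7)

# 2. Track them into the next frame with pyramidal Lucas-Kanade.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, points, None,
                                                 winSize=(21, 21), maxLevel=3)

# Keep only the points that were tracked successfully.
good_new = next_pts[status.flatten() == 1]
print("Tracked", len(good_new), "of", len(points), "features")
```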

"Embedded 3D Stereo Vision: How it Works, How to Implement It, and How to Use It" by Goksel Dedeoglu, Texas Instruments
This tutorial is intended for technical audiences interested in learning about stereo vision for 3D depth perception. Starting with a brief description and comparison of depth-sensing modalities, we present how a stereo vision camera works and discuss its advantages and limitations. We then demonstrate the TI Stereo Module (TISMO), a DSP-optimized software solution for embedded applications, and show how stereo depth information can help in various computer vision problems, including motion detection for video security and obstacle detection for automotive and industrial safety.
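
As a minimal point of reference, the snippet below computes a disparity map with OpenCV's block matcher; it is not TISMO, and the rectified image paths and matcher settings are assumptions.

```python
# Compute a dense disparity map from a rectified stereo pair (OpenCV StereoBM).
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # rectified left image
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)  # rectified right image

# numDisparities must be a multiple of 16; blockSize is the matching window size.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right)  # fixed-point disparities (scaled by 16)

# Depth is inversely proportional to disparity: Z = f * B / d,
# where f is the focal length in pixels and B is the camera baseline.
vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity.png", vis)
```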

"Using Synthetic Image Generation to Reduce the Cost of Vision Algorithm Development" by Clark Dorman, Next Century Corporation
One of the greatest challenges in developing computer vision applications is the development and maintenance of high-quality training and testing data. Annotated data that covers the range of object variations, poses, and environmental situations is needed to ensure that a system will perform successfully in operational situations. However, obtaining sufficient data is time-consuming and expensive. The Synthetic Image Generation Harness for Training and Testing (SIGHTT) project creates annotated images by combining rendered 3D objects with real backgrounds. This talk will discuss the generation of synthetic data and its use, in combination with live data, to alleviate the data problem.
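
The core compositing idea can be illustrated with a toy snippet like the one below, which alpha-blends a rendered RGBA object onto a real background and records the resulting bounding box as an annotation. It is only an illustration, not the SIGHTT pipeline, and all file names and coordinates are assumptions.

```python
# Toy compositing step: paste a rendered object (with alpha mask) onto a real
# background and record its bounding box as a "free" annotation.
import cv2
import numpy as np

background = cv2.imread("background.jpg")                       # real photo
rendered = cv2.imread("object_rgba.png", cv2.IMREAD_UNCHANGED)  # RGBA render

rgb = rendered[:, :, :3].astype(np.float32)
alpha = rendered[:, :, 3:].astype(np.float32) / 255.0

x, y = 120, 80                       # arbitrary placement in the background
h, w = rgb.shape[:2]
roi = background[y:y+h, x:x+w].astype(np.float32)

# Alpha-blend the rendered object into the background region.
background[y:y+h, x:x+w] = (alpha * rgb + (1 - alpha) * roi).astype(np.uint8)

# The annotation is known exactly, because we chose where to place the object.
annotation = {"label": "object", "bbox": [x, y, w, h]}
cv2.imwrite("synthetic.jpg", background)
print(annotation)
```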

"Tools for 'Democratizing' Computer Vision" by Jayan Eledath, SRI International
Computer vision has matured to the point where it is beginning to be widely deployed in a range of real-world applications. This has led to significant growth in the number of vision algorithm and application developers and their communities. However, this growth has resulted in a vast and cluttered landscape of algorithms, many of which have limited capabilities. What is needed is an efficient means of assessing the performance of these algorithms across imaging domains, and of identifying the best algorithms for specific applications. Under the DARPA Visual Media Reasoning program, SRI has developed automated performance characterization (APC) tools for just this purpose. This talk will describe our framework and system for answering the following questions for detection algorithms:

  1. How well will an algorithm perform on a given image at a chosen parameter setting?
  2. What parameters should be used for a particular algorithm and image?

We will also describe how detection probabilities can be modeled as a function of both algorithm parameters and image characteristics. Finally, we will show a live demonstration of the APC tool.
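
One simple way such a model could be built (the talk does not specify SRI's actual approach, so this is purely a hypothetical sketch with made-up features and data) is to fit a classifier that maps algorithm parameters and image characteristics to a detection probability:

```python
# Hypothetical sketch: model detection probability as a function of an algorithm
# parameter and two image characteristics using logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [detector threshold, image contrast, image blur]; label: detected or not.
X = np.array([[0.3, 0.8, 0.1],
              [0.7, 0.4, 0.5],
              [0.5, 0.9, 0.2],
              [0.9, 0.2, 0.7]])
y = np.array([1, 0, 1, 0])

model = LogisticRegression().fit(X, y)

# Predicted probability of detection for a new image/parameter combination.
p_detect = model.predict_proba([[0.4, 0.6, 0.3]])[0, 1]
print(f"Estimated detection probability: {p_detect:.2f}")
```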

"Finding Objects Using Canny Edge Detection" by Eric Gregori, BDTI
This presentation dives into the Canny edge detection algorithm: how it works and how to use it. Topics include a walk-through of the Canny algorithm, connecting edges into contours, and classifying contours with mathematical models. Finally, these techniques are applied to a system for counting the dots on dice.
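
A minimal version of that pipeline, written with OpenCV, might look like the sketch below; the thresholds and the area filter used to isolate the dots are assumptions.

```python
# Canny edges -> contours -> count the small contours (the dots on a die).
import cv2

gray = cv2.imread("dice.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Canny: gradient computation, non-maximum suppression, hysteresis thresholding.
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)

# Link edge pixels into contours, then keep only contours of plausible dot size.
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
pips = [c for c in contours if 20 < cv2.contourArea(c) < 500]
print("Dots counted:", len(pips))
```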

"Feature Detection: How It Works, When to Use It, and a Sample Implementation" by Marco Jacobs, videantis
Feature detection and tracking are key components of many computer vision applications. In this talk, we give an overview of commonly used feature detectors, and explain in detail how the Harris feature detector works. We then present a pyramidal implementation of the Lucas-Kanade algorithm to track these features across a series of images. Finally, we show how we have optimized and parallelized the OpenCV versions of these algorithms, resulting in a real-time, power efficient embedded implementation on a videantis unified video/vision processor.
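
The Harris step on its own can be sketched with OpenCV as shown below (this is not the videantis implementation); the detected corners are the kind of features that would then be handed to the pyramidal Lucas-Kanade tracker.

```python
# Harris corner response: R = det(M) - k * trace(M)^2, computed per pixel.
import cv2
import numpy as np

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# blockSize: neighborhood for the gradient covariance matrix M;
# ksize: Sobel aperture; k: Harris sensitivity parameter.
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

# Keep locations whose response is a sizeable fraction of the maximum.
corners = np.argwhere(response > 0.01 * response.max())
print("Harris corners found:", len(corners))
```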

"Algorithms for Object Detection and Tracking" by Tim Jones, MathWorks
In this presentation we introduce several popular approaches to object detection and tracking for embedded vision, including Kanade-Lucas-Tomasi (KLT) point tracking and background subtraction using Gaussian Mixture Models. We also explore handling occlusions while tracking an object using Kalman filtering, assigning object detections to tracks using the Hungarian algorithm, and creating a multiple object tracking system. These techniques are useful in a range of applications including embedded surveillance cameras, auto-focus, driver drowsiness detection and privacy filtering.
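
As a small illustration of one of those building blocks, the sketch below runs Gaussian-mixture background subtraction with OpenCV's MOG2 model and turns the foreground mask into detections; the video path and thresholds are assumptions, and track assignment and Kalman smoothing are only noted in comments.

```python
# Gaussian-mixture background subtraction producing per-frame detections.
import cv2

cap = cv2.VideoCapture("surveillance.mp4")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=False)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Per pixel, a mixture of Gaussians models the background; pixels that fit
    # none of the background modes are marked as foreground (moving objects).
    fg_mask = subtractor.apply(frame)
    blobs, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)
    detections = [cv2.boundingRect(c) for c in blobs if cv2.contourArea(c) > 200]
    # These detections would then be assigned to tracks (e.g. with the Hungarian
    # algorithm) and each track smoothed with a Kalman filter.
cap.release()
```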

"Using FPGAs to Accelerate 3D Vision Processing: A System Developer’s View" by Ken Lee, VanGogh Imaging
Embedded vision system designers must consider many factors in choosing a processor. This is especially true for 3D vision systems, which require complex algorithms, lots of processing power, and large amounts of memory. When we started developing our first 3D vision system for object recognition and measurement at VanGogh Imaging three years ago, it was based on a PC platform. Over the last two years, we have converted our software to run on an ARM-based system running Linux and Android. The conversion to the embedded system reduced cost, but at the expense of performance, despite significant effort to reduce algorithmic and data structure complexity. Now, to improve performance, we are implementing the same design on an FPGA-SoC (Zynq from Xilinx). Our analysis indicates that this approach will allow us to increase performance dramatically at minimal additional cost. Using design examples, we will present our new implementation approach, how it yields performance improvements, and the lessons learned during the PC-to-ARM and ARM-to-SoC conversions.

"Using Heterogeneous Computing for Mobile and Embedded Vision" by Rick Maule, Qualcomm
A single vision application typically incorporates multiple algorithms requiring many different types of computation. This diversity makes it difficult for a single type of processing engine, such as the CPU, to execute vision applications with maximum efficiency. In this presentation, we explore the potential for implementing vision applications on heterogeneous processor chips – that is, chips consisting of multiple processing engines of different architecture types, such as the CPU, GPU, and DSP. Using mobile devices as an example, we investigate the opportunities, challenges, trade-offs and techniques associated with mapping vision applications onto heterogeneous processing engines, particularly for applications requiring maximum energy and thermal efficiency.

"Designing a Multi-Core Architecture Tailored for Pedestrian Detection Algorithms" by Tom Michiels, Synopsys
Pedestrian detection is an important function in a wide range of applications, including automotive safety systems, mobile applications, and industrial automation. A popular algorithm for pedestrian detection is HOG (Histogram of Oriented Gradients), and several variants of the algorithm have been proposed. The complexity and diversity of these algorithms demand a programmable implementation. We will compare the suitability of different embedded processor architectures for implementing the HOG algorithm, with a focus on performance and power consumption, covering MCUs, DSPs, and embedded GPUs. Looking beyond standard processor architectures, we will discuss the advantages that can be achieved by tailoring an architecture to the specific requirements of the application, which in the case of HOG results in a heterogeneous multi-core solution. In contrast to choosing among standard processors, building a tailored multi-core solution is a design task. We will explain the methodology we applied to partition the algorithm and map it to heterogeneous special-purpose processor cores, and discuss how to parallelize the design process among a team of algorithm, hardware, and software engineers; the role of a virtual prototype; and the need for an FPGA-based prototyping system.
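
For reference, the functional behavior of the HOG pedestrian detector can be reproduced in a few lines with OpenCV's stock people detector; the talk's contribution is in mapping this workload onto tailored hardware, not in the snippet below, whose image path and detection parameters are assumptions.

```python
# HOG + linear SVM pedestrian detection using OpenCV's default people detector.
import cv2

img = cv2.imread("street.jpg")

hog = cv2.HOGDescriptor()  # default 64x128 person detection window
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Slide a window over an image pyramid; each window is described by histograms
# of oriented gradients and scored by a linear SVM.
rects, weights = hog.detectMultiScale(img, winStride=(8, 8), scale=1.05)
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", img)
```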

"Better Image Understanding through Better Sensor Understanding" by Michael Tusch, Apical
One of the main barriers to the widespread use of embedded vision is reliability. For example, systems that detect people only some of the time, or that produce frequent false detections, are of limited use. Why is it that algorithms which work well in the lab don't work so well in real-world conditions? Cameras perform a great deal of image processing in order to make video look natural and realistic, and this often leads to unpredictable variations in the input data to embedded vision algorithms. We show that an understanding of specific image sensor characteristics, coupled with information gleaned from image processing methods, has a very significant impact on the accuracy of modern embedded vision algorithms in difficult visual environments.

"Implementing Real-Time Hyerspectral Imaging" by Kalyanramu Vemishetty, National Instruments
Hyperspectral imaging enables vision systems to use many spectral bands rather than just the typical red, green, and blue bands. This can be very useful in a variety of applications, such as separating agricultural produce from contaminants. Fast Fourier transforms (FFTs) are often used in implementing hyperspectral imaging, but it can be challenging to attain the necessary performance in cost-, power-, and size-constrained systems such as self-contained smart cameras for industrial use. In this presentation, we introduce hyperspectral imaging for embedded vision applications, explain how FFTs are used in hyperspectral imaging, and explore two different streaming architectures for implementing the column FFT required for hyperspectral imaging in an FPGA, balancing hardware resource utilization against processing throughput.
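
Functionally, the column FFT amounts to an independent transform per spatial pixel, as in the NumPy sketch below; the array shapes are assumptions, and the talk's focus is on achieving this throughput in FPGA hardware.

```python
# Per-column FFT over one line of a hyperspectral data cube.
import numpy as np

# Rows = spectral samples, columns = spatial pixels (shapes are assumed).
frame = np.random.rand(256, 1024).astype(np.float32)

# An FFT along axis 0 transforms every column independently, converting each
# pixel's samples into its spectral content.
spectra = np.fft.fft(frame, axis=0)
magnitude = np.abs(spectra)
print(magnitude.shape)  # (256, 1024): spectral bins x spatial pixels
```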

"Efficiently Computing Disparity Maps for Low-Cost 3D Stereo Vision" by Tom Wilson, CogniVue
The ability to detect and determine the position of objects in 3D is important for many vision applications, such as gesture recognition, automotive safety, and augmented reality. Various sensor technologies can be used to provide 3D images, with stereo vision being the best-established approach. Disparity map generation is a particularly important part of stereo vision processing, and it is very demanding in terms of processor performance, creating challenges for cost- and power-constrained systems. This presentation will describe disparity map generation and why it is challenging. It will also present approaches to reduce the computational load of disparity map generation and present an example embedded implementation.
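
To see where the computational load comes from, consider a deliberately naive sum-of-absolute-differences (SAD) disparity search: every pixel evaluates every candidate disparity over a full matching window. The sketch below is for illustration only and is far from an optimized embedded implementation.

```python
# Naive SAD disparity search: O(height x width x disparities x window area).
import numpy as np

def sad_disparity(left, right, max_disp=64, win=4):
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.uint8)
    for y in range(win, h - win):
        for x in range(max_disp + win, w - win):
            patch = left[y-win:y+win+1, x-win:x+win+1].astype(np.int32)
            costs = [np.abs(patch - right[y-win:y+win+1,
                                          x-d-win:x-d+win+1].astype(np.int32)).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))  # pick the lowest-cost disparity
    return disp
```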

"Efficient Super-Resolution Algorithms and Implementation Techniques for Constrained Applications" by Ilan Yona, CEVA
Image quality is a critical challenge in many applications, including smartphones, especially when using low-quality sensors or when using digital zoom to enlarge part of the image. Super-resolution is a set of techniques that can address this challenge by combining multiple images to produce a single, higher-quality image. However, super-resolution can be extremely computationally demanding, so when implementing it on a constrained platform (such as a smartphone), the algorithm must be carefully chosen, balancing image quality, speed, and power consumption. We tested a variety of known super-resolution algorithms and found that they were not efficient for cost- and power-constrained systems. We then developed a new algorithm that produces good-quality images and is suitable for constrained systems. In this talk, we explain how super-resolution works, introduce the previously known algorithms, and present our new algorithm and a sample implementation of it.
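
The multi-frame idea can be illustrated with a toy "shift-and-add" sketch: register several low-resolution frames and accumulate them on an upscaled grid. This is not CEVA's algorithm, and the frame paths, scale factor, and registration method are assumptions.

```python
# Toy shift-and-add super-resolution: align low-res frames, accumulate upscaled.
import cv2
import numpy as np

frames = [cv2.imread(f"lr_{i}.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
          for i in range(4)]
scale = 2
ref = frames[0]
acc = np.zeros((ref.shape[0] * scale, ref.shape[1] * scale), dtype=np.float32)

for frame in frames:
    # Estimate the sub-pixel translation of this frame relative to the reference.
    (dx, dy), _ = cv2.phaseCorrelate(ref, frame)
    # Warp onto the high-resolution grid, compensating for the estimated shift.
    M = np.float32([[scale, 0, -dx * scale], [0, scale, -dy * scale]])
    acc += cv2.warpAffine(frame, M, (acc.shape[1], acc.shape[0]))

sr = (acc / len(frames)).clip(0, 255).astype(np.uint8)
cv2.imwrite("super_res.png", sr)
```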

"Vision-Based Automotive Driver Assistance Systems: Challenges and Approaches" by Paul Zoratti, Xilinx
In recent years, the automotive industry has made remarkable advances in the research and development of driver assistance (DA) systems that enrich the driving experience, enhance roadway safety, and provide drivers with unprecedented information about the vehicle environment. Government legislation, strong consumer interest in safety features, and innovations in remote sensor technology and processing algorithms are placing the industry on the cusp of a dramatic increase in DA system deployment. Cost-effective, high-performance electronic devices that provide the computational power required by complex processing algorithms also play a major role in this rise. In addition to requiring high computational horsepower, advanced DA systems will process and fuse data from multiple on-board remote sensors and integrate information from future vehicle-to-infrastructure and vehicle-to-vehicle communications systems, requiring electronic devices that provide multiple parallel processing pipelines. Finally, these DA processing devices will need to be flexible enough to adapt to evolving algorithms and unique vehicle platform requirements, and scalable enough to adjust the system bill of materials for various sets of DA feature bundles. This presentation describes the objectives and results of a project to realize a multi-camera, multi-feature DA system processor on an all programmable SoC architecture.

Additional Information

If you have other questions about the Embedded Vision Summit, please contact us at summit@embedded-vision.com.

Please plan to join us May 22-24, 2018 in Santa Clara, California.