
AMD's Upcoming Developer Summit: Vision Is Central To It

Embedded Vision Alliance member AMD's yearly Developer Summit (APU13, for short) will take place next month, November 11-13 to be exact, in San Jose, California. Alliance founder Jeff Bier will be one of the featured speakers at the conference, delivering a presentation entitled "Creating Smarter Applications and Systems Through Visual Intelligence." Here's an abstract:

Computer vision enables systems and applications to learn more about their environments, which in turn enables them to be more responsive and intelligent. For many years, computer vision was an expensive technology relegated to a few niche applications like factory automation. Now, thanks largely to improvements in processors, such as heterogeneous architectures, vision is proliferating into a rapidly expanding range of applications, including automotive safety, gaming, consumer electronics, education, healthcare, and retail point-of-sale marketing. In this presentation, we use application examples to analyze the essential characteristics of typical vision applications, including algorithms, data types, and computation and bandwidth requirements, and illustrate how such applications can be efficiently mapped onto heterogeneous processors to create cost-effective and energy-efficient "machines that see."

Conference organizers have also assembled the following list of additional planned presentations that have vision-related themes:


eyeSight's Gesture Recognition Technology - Introducing the Developer SDK (Shmuel Gideon, eyeSight)

eyeSight's gesture recognition service is now reaching an install base of millions of devices, offering gesture recognition capabilities bundled out-of-the-box with Windows 8 from leading manufacturers.

eyeSight is launching a developer SDK which will enable developers to hook into the full capabilities of the gesture service and achieve a deeper level of natural interaction with their applications. There is no need for special hardware or peripherals, and no need to become a machine vision specialist. If the gesture service is there, you can call upon its power to learn where the user is positioned, where he is looking, whether a finger is pointing at the screen (and where), or whether a hand is waving toward the device or making some other motion.

We wish to enable developers to build new gesture-powered applications for various use cases on Windows-based devices:

  • Everyday PC apps and casual games
  • Drive-time apps for the tablet
  • Digital signage applications for stores
  • Touch-free manuals and cooking guides
  • Touch-free medical consoles for clinical applications
  • Educational games
  • and more

The presentation will cover the capabilities of eyeSight's gesture recognition service, and show API and demo examples of what developers can do with it.
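The abstract above describes an event-driven service that applications subscribe to rather than a vision library they must master. A minimal sketch of how an app might consume such a service follows; note that all names here (GestureService, GestureEvent, on_gesture) are hypothetical illustrations, not eyeSight's actual API, which the talk will introduce.

```python
# Hypothetical sketch of a callback-style gesture SDK. All names are
# illustrative stand-ins, NOT eyeSight's real interface.

class GestureEvent:
    def __init__(self, kind, position=None):
        self.kind = kind          # e.g. "wave", "point", "swipe_left"
        self.position = position  # normalized (x, y) for pointing gestures

class GestureService:
    """Toy stand-in for a system-wide gesture recognition service."""
    def __init__(self):
        self._handlers = {}

    def on_gesture(self, kind, handler):
        # An app registers interest in a gesture; the vision pipeline
        # stays entirely hidden behind the service.
        self._handlers.setdefault(kind, []).append(handler)

    def emit(self, event):
        # In a real service, events would originate from camera frames.
        for handler in self._handlers.get(event.kind, []):
            handler(event)

service = GestureService()
service.on_gesture("wave", lambda e: print("user waved at the device"))
service.on_gesture("point", lambda e: print(f"pointing at {e.position}"))

service.emit(GestureEvent("wave"))
service.emit(GestureEvent("point", position=(0.42, 0.63)))
```

The design point is that the application never touches pixels: it only names the gestures it cares about, which is what lets non-vision-specialists build gesture-powered apps.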


3D Reconstruction using Matterport (Matt Bell, Matterport)

Matterport allows users to quickly and automatically make 3D models of physical places and share them online.  In order to make this possible, Matterport's 3D camera needs to quickly process, align, and compress large amounts of 3D data locally so that users can get feedback as they scan.
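Matterport's exact pipeline is not described here, but a representative step for keeping on-device 3D processing and compression tractable, as the abstract implies, is voxel-grid downsampling: bucket incoming points into cubes of a fixed side length and keep one centroid per occupied cube. A sketch under that assumption:

```python
# Illustrative voxel-grid downsampling of a point cloud; a common way
# to compress scan data for interactive feedback. This is a generic
# technique, not Matterport's actual (non-public) implementation.
import numpy as np

def voxel_downsample(points, voxel=0.05):
    """points: (N, 3) array; returns one centroid per occupied voxel."""
    keys = np.floor(points / voxel).astype(np.int64)
    # Group points by voxel key and average each group.
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True,
                                   return_counts=True)
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)          # sum points per voxel
    return sums / counts[:, None]             # centroid per voxel

cloud = np.random.rand(10000, 3)              # synthetic 1 m^3 scan
reduced = voxel_downsample(cloud, voxel=0.1)
print(len(cloud), "->", len(reduced))         # far fewer points remain
```

Downsampling like this bounds the data volume per scan sweep, which is what makes local alignment and user feedback feasible on modest hardware.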


Setting New Standards for Open Source 3D Processing (Radu Rusu, Open Perception)

This is the story of PCL, the Point Cloud Library project. Started by a young PhD researcher as both a hobby and a vehicle for experimenting with 3D point clouds, PCL surpassed all expectations, and in less than three years became a global phenomenon in the international open source community, attracting more than 500 developers and many thousands of users, and winning the Grand Prize at the Open Source Software (OSS) World Challenge in 2011, after just one year of existence.

This is one of those "lessons learned" talks, where we dissect and look back at how PCL got created, and how it evolved from a humble set of C++ classes to an international movement, with a consortium of large and powerful commercial companies supporting it. We'll discuss current efforts in standardization with respect to 3D processing, as well as show impressive 3D point cloud perception demos, which have scaled from solitary desktops to cloud processing in just a few years.


Hardware accelerated real-time facial motion capture (Emiliano Gambaretto, Mixamo)

Mixamo democratizes 3D content creation, making it painless and fun, as opposed to traditional approaches that require specialized knowledge and hardware. To this end, Mixamo recently developed a facial capture and animation tool that helps game developers create 3D facial animations in real time by using a webcam and acting out the expressions they would like applied to their character.

This technology is based on recent advances in both Computer Vision research and consumer hardware. In the past two decades the Computer Vision community has intensively studied problems related to the detection, recognition, and tracking of human facial expressions. Such studies were in part promoted by the entertainment industry's growing interest in 3D animation and innovative human-machine interfaces. In the meantime, the broad diffusion of acceleration hardware, combined with the introduction of computation APIs, enabled the real-time solution of complex Computer Vision problems.

Mixamo's approach to markerless facial motion capture is based on a mixture of global and local models of human facial expressions and shape, achieved through Machine Learning methods. Global models aim to represent the whole face as a deformable object that can be tracked using standard Computer Vision approaches. Local models act as rigid patch experts, locating specific facial features in the input image. Taken separately, these approaches have their own specific strengths and shortcomings, but when combined they allow for great accuracy and robustness.

The goal of this presentation is to describe how these approaches can be extensively parallelized on GPUs and APUs, enabling a real-time experience on consumer hardware.
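The "local patch expert" idea above can be sketched concretely: slide a small feature template over a search window and score every offset with normalized cross-correlation (NCC), keeping the best match. This is a generic illustration of the technique, not Mixamo's implementation; a global face model would then regularize the per-feature responses.

```python
# Generic patch-expert localization via normalized cross-correlation.
# Illustrative only; Mixamo's actual method is not public.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-shape patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom else 0.0

def locate(patch, window):
    """Return (row, col) in `window` where `patch` matches best."""
    ph, pw = patch.shape
    best, best_rc = -2.0, (0, 0)
    for r in range(window.shape[0] - ph + 1):
        for c in range(window.shape[1] - pw + 1):
            score = ncc(patch, window[r:r + ph, c:c + pw])
            if score > best:
                best, best_rc = score, (r, c)
    return best_rc

rng = np.random.default_rng(0)
window = rng.random((32, 32))           # stand-in for an image region
patch = window[10:18, 14:22].copy()     # e.g. an "eye corner" template
print(locate(patch, window))            # recovers (10, 14)
```

Each candidate offset is scored independently, which is exactly why this kind of search parallelizes so well onto GPU and APU hardware, the theme of the talk.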


Accelerating OpenVL for Heterogeneous Platforms (Gregor Miller, University of British Columbia)

OpenVL is a high-level, task-based abstraction for computer vision which does not require extensive knowledge of or experience with vision methods, unlike most frameworks, which present their APIs as lists of specific techniques. OpenVL requires developers to have enough knowledge of a task to accurately describe it using our API; the description is analyzed and an appropriate method is invoked to provide a solution. We present our methodology for accelerating OpenVL on heterogeneous platforms using OpenCL for fundamental operations such as segmentation and correspondence. The accelerated methods are combined with CPU-only implementations to offer greater functionality; thanks to the effectiveness of the abstraction, all of the methods are hidden from the developer, leading to an efficient computer vision API that is friendly to mainstream developers. An evaluation on AMD OpenCL-compatible APUs and GPUs is presented to demonstrate the advantage of an abstraction which provides mainstream developers with performance gains and energy reductions by utilizing resources efficiently.
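The task-based style described above, where the caller states what is wanted and the library picks a concrete method (and, ideally, the best CPU/OpenCL path), can be sketched as follows. All names here are illustrative stand-ins, not OpenVL's real interface.

```python
# Sketch of a declarative, task-based vision API in the spirit of the
# abstract. Method names are illustrative, not OpenVL's actual API.

ALGORITHMS = {
    # (task, dominant property) -> concrete method the library would run
    ("segmentation", "color"): "kmeans_color_clustering",
    ("segmentation", "texture"): "graph_cut",
    ("correspondence", "dense"): "block_matching",
    ("correspondence", "sparse"): "feature_matching",
}

def solve(task, **properties):
    """Map a high-level task description to a concrete method name.

    A full implementation would also choose a backend (CPU vs. OpenCL)
    here, invisibly to the caller.
    """
    for value in properties.values():
        method = ALGORITHMS.get((task, value))
        if method:
            return method
    raise ValueError(f"no method for {task} with {properties}")

print(solve("segmentation", grouped_by="color"))   # kmeans_color_clustering
print(solve("correspondence", density="dense"))    # block_matching
```

The point of the pattern is that both the algorithm choice and the acceleration path live behind the task description, so mainstream developers never select either one by hand.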


OpenCV-CL (Harris Gasparakis, AMD)

We present an OpenCV 3.0 preview, emphasizing new architectural aspects such as tight, transparent, yet efficient OpenCL integration and support: if a device supporting OpenCL is present in the system, OpenCV code is executed on the OpenCL device (unless explicitly prohibited). The same code base can therefore be executed via the best path (CPU vs. OpenCL) supported by the underlying system. The simplified role of HSA is discussed in detail. OpenCV algorithms currently implemented using HSA extensions are discussed, and the HSA advantage highlighted.
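The transparent-acceleration pattern the abstract describes, one entry point whose backend is chosen at runtime, invisibly to the caller, can be sketched like this. The `have_opencl` stand-in plays the role of a runtime capability check such as OpenCV's `cv::ocl::haveOpenCL()`; the rest is an illustrative toy, not OpenCV code.

```python
# Sketch of runtime backend dispatch behind a single API entry point,
# mirroring the "same code base, best path" idea from the talk.
import numpy as np

def have_opencl():
    """Stand-in for a runtime device check (cf. cv::ocl::haveOpenCL())."""
    return False   # pretend no OpenCL device in this environment

def box_filter_cpu(img, k=3):
    """Naive CPU mean filter (valid region only)."""
    h, w = img.shape
    out = np.empty((h - k + 1, w - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = img[r:r + k, c:c + k].mean()
    return out

def box_filter(img, k=3):
    """One entry point; the backend is chosen here, not by the caller."""
    if have_opencl():
        # On a real system an OpenCL kernel would be enqueued here.
        raise NotImplementedError("OpenCL path not modeled in this sketch")
    return box_filter_cpu(img, k)

img = np.arange(25, dtype=float).reshape(5, 5)
print(box_filter(img).shape)   # (3, 3) on either backend; caller unchanged
```

This is the property that makes the integration "transparent": application code calls one function, and whether it ran on the CPU or an OpenCL device is an implementation detail.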


For more information on the AMD Developer Summit, including an online registration form, please see the event page.