"Efficient Deployment of Quantized ML Models at the Edge Using Snapdragon SoCs," a Presentation from Qualcomm

Felix Baum, Director of Product Management for AI Software at Qualcomm, presents the "Efficient Deployment of Quantized ML Models at the Edge Using Snapdragon SoCs" tutorial at the May 2019 Embedded Vision Summit.

Increasingly, machine learning models are being deployed at the edge, and these models are getting bigger. As a result, we are hitting the constraints of edge devices: bandwidth, performance and power. One way to reduce ML computation demands and increase power efficiency is quantization—a set of techniques that reduce the number of bits needed to represent model weights and activations, and hence reduce the bandwidth, computation and storage requirements of inference.
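To make the idea concrete, here is a minimal sketch of one common quantization scheme—affine (asymmetric) 8-bit quantization—in which a range of 32-bit floats is mapped onto 8-bit integers via a scale and zero point. This is an illustrative example only; the function names and the specific scheme are assumptions for the sketch, not Qualcomm's implementation.

```python
import numpy as np

def quantize_int8(x):
    """Affine (asymmetric) quantization of a float32 tensor to uint8.

    Maps [min(x), max(x)] onto [0, 255] using a scale and zero point,
    so each value is stored in 8 bits instead of 32.
    """
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0  # avoid zero scale for constant tensors
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized tensor."""
    return (q.astype(np.float32) - zero_point) * scale

# Example: quantize a small weight tensor and measure the round-trip error.
weights = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zero_point = quantize_int8(weights)
approx = dequantize(q, scale, zero_point)
```

The round-trip error of each value is bounded by the scale (the width of one quantization step), which is why well-conditioned models can tolerate 8-bit storage with little accuracy loss.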

Qualcomm Snapdragon SoCs provide a robust hardware platform for deploying ML applications in embedded and mobile devices. Many Snapdragon SoCs incorporate the Qualcomm Artificial Intelligence Engine, which combines hardware and software components to accelerate on-device ML.

In this talk, Baum explores the performance and accuracy offered by the accelerator cores within the AI Engine. He also highlights the tools and techniques Qualcomm offers for developers targeting these cores, utilizing intelligent quantization to deliver optimal performance with low power consumption while maintaining algorithm accuracy.