"Performance Analysis for Optimizing Embedded Deep Learning Inference Software," a Presentation from Arm

Gian Marco Iodice, Staff Compute Performance Software Engineer at Arm, presents the "Performance Analysis for Optimizing Embedded Deep Learning Inference Software" tutorial at the May 2019 Embedded Vision Summit.

Deep learning on embedded devices is currently enjoying significant success in a number of vision applications, particularly on smartphones, where increasingly prevalent AI cameras are able to enhance every captured moment. However, the considerable number of deep learning network architectures proposed every year poses real challenges for software developers who need to implement these demanding algorithms very efficiently.

In this presentation, Iodice describes a structured approach to performance analysis of deep learning software implementations. He examines the fundamentals of performance analysis for deep learning, covering metrics and methodologies. He then shows how a top-down approach can be used to detect and fix performance bottlenecks, yielding efficient deep neural network software implementations. He also illustrates typical software optimizations that make the best use of available computational resources.