
"Achieving 15 TOPS/s Equivalent Performance in Less Than 10 W Using Neural Network Pruning," a Presentation from Xilinx

Nick Ni, Director of Product Marketing for AI and Edge Computing at Xilinx, presents the "Achieving 15 TOPS/s Equivalent Performance in Less Than 10 W Using Neural Network Pruning on Xilinx Zynq" tutorial at the May 2018 Embedded Vision Summit.

Machine learning algorithms, such as convolutional neural networks (CNNs), are fast becoming a critical part of image perception in embedded vision applications in the automotive, drone, surveillance and industrial vision markets. Applications include multi-object detection, semantic segmentation and image classification. However, when these networks are scaled to modern image resolutions such as HD and 4K, the computational requirements for real-time systems can easily exceed 10 TOPS/s and consume hundreds of watts, which is simply unacceptable for most edge applications.

In this talk, Ni describes a network/weight pruning methodology that achieves a performance gain of over 10 times on Zynq UltraScale+ SoCs with very small accuracy loss. Network inference running on Zynq UltraScale+ has achieved performance equivalent to running the original, unpruned SSD network at 20 TOPS/s, while consuming less than 10 W.
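
As a rough illustration of the idea behind weight pruning and "equivalent" throughput, the sketch below zeroes the smallest-magnitude weights in a list and then computes the throughput the unpruned network would need to match the pruned one. It is a generic magnitude-pruning sketch, not the methodology presented in the talk; the function names, the sample weights and the figures in the usage lines are all hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    Generic unstructured magnitude pruning (an assumption for illustration,
    not the method from the talk). `sparsity` is the fraction to remove.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute weight value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def equivalent_tops(hardware_tops, dense_ops, pruned_ops):
    """Throughput the dense network would need to match the pruned one.

    If the hardware sustains `hardware_tops` on a network that needs only
    `pruned_ops` operations per frame instead of `dense_ops`, the same frame
    rate on the dense network would require this many TOPS.
    """
    if pruned_ops <= 0:
        raise ValueError("pruned_ops must be positive")
    return hardware_tops * dense_ops / pruned_ops


# Hypothetical sample values, chosen only to exercise the functions.
weights = [0.5, -0.05, 0.8, 0.01, -0.3, 0.02, 0.9, -0.04, 0.1, 0.03]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)
print(equivalent_tops(hardware_tops=3.0, dense_ops=4.0, pruned_ops=1.0))
```

In practice, pruning is usually followed by retraining to recover accuracy, and the speedup actually obtained depends on whether the hardware can skip the removed operations; neither step is modeled here.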