# Architecting Always-On, Context-Aware, On-Device AI Using Flexible Low-Power FPGAs

Deepak Boppana – Senior Director, Product & Segment Marketing

Gordon Hands – Director, Solutions Marketing



# Rapidly Emerging Edge Computing Trend

Driven by Latency, Privacy, and Bandwidth Limitations



Unit growth for edge devices with AI will explode, increasing at over 110% CAGR over the next five years – Semico Research

## Always-on, On-device AI Applications

Human Presence Detection Example













# Always-on, On-device AI Applications

Other Examples













## Always-on, On-device AI Requirements

Unmet Need for Ultra-Low Power, Scalable, and Flexible Inferencing















#### **CUSTOM DESIGN SERVICES**

Mobile

**Smart Home** 

**Smart City** 

**Smart Factory** 

**Smart Car** 



#### **REFERENCE DESIGNS / DEMOS**

Face Detection

Hand Gesture Detection

Key Phrase Detection

Human Presence Detection

Face Tracking

**Object Counting** 

Speed Sign Detection



#### **SOFTWARE TOOLS**

LATTICE RADIANT DESIGN SOFTWARE

**Neural Network Compiler** 

Caffe







#### **IP CORES**

**Neural Network Accelerators** 

**CNN Compact Accelerator** 



**CNN Accelerator** 



#### HARDWARE PLATFORMS

Mobile Development Platform – iCE40 UltraPlus FPGA



Video Interface Platform – ECP5 FPGA

1 mW, 5.5 mm<sup>2</sup>, 1/16 bits

1 W, 100 mm<sup>2</sup>, 1/8/16 bits



# Flexible and Scalable Inferencing at the Edge

From under 1 mW to 1 W with Lattice sensAI



# Stand-alone, Integrated FPGA Solution



- Always-on, integrated solutions on ECP5 or iCE40 UltraPlus FPGA
- Low latency and secure implementation
- Small form factor packages from 5.5 mm<sup>2</sup> to 100 mm<sup>2</sup>



# FPGA as Activity Gate to ASIC/ASSP



- iCE40 UltraPlus FPGA for always-on detection of key-phrases or objects
- Wakes up a high-performance ASIC/ASSP for further analytics only when required
- Reduces overall system power consumption
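The power saving from gating can be estimated with a simple duty-cycle calculation. The sketch below is illustrative only: the ASIC/ASSP active and sleep power and the wake fraction are hypothetical placeholders, and only the roughly 1 mW always-on figure corresponds to the low end quoted in this deck.

```python
def gated_average_power_mw(gate_mw, asic_active_mw, asic_sleep_mw, wake_fraction):
    """Average system power when an always-on gate wakes a larger processor.

    wake_fraction is the share of time the ASIC/ASSP spends awake (0.0 to 1.0).
    """
    if not 0.0 <= wake_fraction <= 1.0:
        raise ValueError("wake_fraction must be between 0 and 1")
    asic_mw = wake_fraction * asic_active_mw + (1.0 - wake_fraction) * asic_sleep_mw
    return gate_mw + asic_mw


# Hypothetical numbers for illustration (not taken from the presentation).
always_on = gated_average_power_mw(
    gate_mw=0.0, asic_active_mw=1000.0, asic_sleep_mw=0.0, wake_fraction=1.0
)
gated = gated_average_power_mw(
    gate_mw=1.0, asic_active_mw=1000.0, asic_sleep_mw=0.5, wake_fraction=0.02
)

print(f"ASIC/ASSP always awake: {always_on:.1f} mW")
print(f"FPGA-gated ASIC/ASSP:   {gated:.2f} mW")
```

The saving depends almost entirely on how rarely the gate fires, so the detector's false-positive rate matters as much as its own power draw.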



# FPGA as a Co-Processor to MCU



- Scalable performance/power with ECP5-based neural network acceleration
- ECP5-based I/O flexibility to interface seamlessly with on-board legacy devices, including sensors
- Low-end MCU for flexible system control



# Delivering Edge CNN Acceleration in Lattice FPGA



















FPGA Bitstream





## **CNN Accelerator IP Architecture**



## **CNN Compact Accelerator IP Architecture**



# Translating Trained Neural Network Into Lattice CNN Accelerator Instructions













## On-device AI – Complex Optimization

| Design Factors       | Device          |                 | Network       |                          |            |
|----------------------|-----------------|-----------------|---------------|--------------------------|------------|
| Attributes           | # of<br>Engines | Local<br>Memory | Input<br>Size | Number of<br>Multipliers | Bit Widths |
| Power (W)            |                 |                 |               |                          |            |
| Device Size          |                 |                 |               |                          |            |
| Performance (fps)    |                 |                 |               |                          |            |
| Accuracy (%)         |                 |                 |               |                          |            |
| Small Object (% fov) |                 |                 |               |                          |            |

**Correlation Between Design Factors and Product Attributes** 
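To illustrate why bit width is a design factor, the sketch below rounds a few values to a signed fixed-point grid and reports the worst-case rounding error at two widths. It is a generic illustration, not the Lattice compiler's quantization scheme: the split between integer and fractional bits and the sample weights are assumptions.

```python
def quantize_fixed_point(value, total_bits, frac_bits):
    """Round value to a signed fixed-point grid and saturate to its range."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    q = max(q_min, min(q_max, round(value * scale)))
    return q / scale


# Hypothetical weights; real networks would supply trained values.
weights = [0.7071, -0.3333, 0.0125, -0.9]

# (total bits, fractional bits): the fractional split is an assumption.
for total_bits, frac_bits in ((16, 14), (8, 6)):
    worst = max(
        abs(w - quantize_fixed_point(w, total_bits, frac_bits)) for w in weights
    )
    print(f"{total_bits}-bit, {frac_bits} fractional bits: worst error {worst:.6f}")
```

Narrower words cut memory and multiplier cost, at the price of a coarser grid. Away from saturation the rounding error is bounded by half of one step, that is 2^-(frac_bits + 1).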

# **Examples for Illustration**

| Network                     | Architecture | Number of Multiplications | Input Size    | Quantization       |
|-----------------------------|--------------|---------------------------|---------------|--------------------|
| Face<br>Detection           | VGG style    | 290,816                   | 32 x 32 x 3   | 16-bit fixed point |
| Face<br>Detection           | VGG style    | 14,353,920                | 90 x 90 x 3   | 16-bit fixed point |
| Human Presence<br>Detection | VGG style    | 8,570,880                 | 64 x 64 x 3   | 16-bit fixed point |
| Human Presence<br>Detection | VGG style    | 338,558,976               | 128 x 128 x 3 | 16-bit fixed point |
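As a rough guide to where multiplication counts like these come from, the sketch below counts multiplications for a stack of 3x3 convolutions, each followed by 2x2 pooling. The layer widths are hypothetical and do not reproduce the networks in the table; the function names are introduced here for illustration.

```python
def conv_multiplications(in_h, in_w, in_c, out_c, kernel=3, stride=1, padding=1):
    """Multiplications for one 2-D convolution layer (one per multiply-accumulate)."""
    out_h = (in_h + 2 * padding - kernel) // stride + 1
    out_w = (in_w + 2 * padding - kernel) // stride + 1
    mults = out_h * out_w * out_c * kernel * kernel * in_c
    return mults, (out_h, out_w, out_c)


def vgg_style_multiplications(input_shape, channels_per_block):
    """Sum multiplications over blocks of one 3x3 conv followed by 2x2 pooling."""
    h, w, c = input_shape
    total = 0
    for out_c in channels_per_block:
        mults, (h, w, c) = conv_multiplications(h, w, c, out_c)
        total += mults
        h, w = h // 2, w // 2  # 2x2 pooling halves each spatial dimension
    return total


# Hypothetical channel widths, chosen only to show the scaling behaviour.
small = vgg_style_multiplications((32, 32, 3), [8, 16, 32])
large = vgg_style_multiplications((64, 64, 3), [8, 16, 32])
print(f"32 x 32 input: {small:,} multiplications")
print(f"64 x 64 input: {large:,} multiplications")
```

For a fixed architecture the count grows with input area. In the table, going from 32 x 32 to 90 x 90 raises the count by about 49x while the area grows by about 8x, which suggests the larger face detection network is also deeper or wider.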

# Image Based Neural Networks on Lattice FPGAs





# **Image Based Neural Networks Lattice Hardware**



Himax HM01B0 UPduino Shield



**Embedded Vision Development Kit** 



# **Face Detect Implementations**





### 90 x 90 Input





# **Human Presence Detect Implementations**

64 x 64 Input and 128 x 128 Input (figure panels; devices shown: UltraPlus, ECP5-25, ECP5-45, ECP5-85)

\* Running at 5 frames per second

# **Bringing It Together**

|                                          |                    | Device Size / Power / Performance            |                                        |                                        |                                        |
|------------------------------------------|--------------------|----------------------------------------------|----------------------------------------|----------------------------------------|----------------------------------------|
| Network                                  | Smallest<br>Object | UltraPlus<br>1–7 mW\*<br>5.5 mm<sup>2</sup>  | ECP5-25<br>0.5 W<br>100 mm<sup>2</sup> | ECP5-45<br>0.6 W<br>100 mm<sup>2</sup> | ECP5-85<br>0.8 W<br>100 mm<sup>2</sup> |
| Face Detection<br>32 x 32 Input          | 50%                | 465                                          | 3360                                   | 4511                                   | 5251                                   |
| Face Detection<br>90 x 90 Input          | 20%                |                                              | 28                                     | 82                                     | 101                                    |
| Human Presence Detect<br>64 x 64 Input   | 20%                | 18                                           | 115                                    | 161                                    | 338                                    |
| Human Presence Detect<br>128 x 128 Input | 10%                |                                              | 2.3                                    | 3.5                                    | 5.4                                    |

<sup>\*</sup> Running at 5 frames per second
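The two tables in this deck can be combined into a rough effective throughput and energy-per-frame estimate for the ECP5 devices. This sketch rests on two assumptions not stated on the slides: that the table cells are frames per second (the attribute table lists "Performance (fps)"), and that the listed power applies at that frame rate. UltraPlus is left out because its power is quoted only at 5 frames per second.

```python
# Multiplication counts from the "Examples for Illustration" table.
MULTS = {
    "Face Detection 32x32": 290_816,
    "Face Detection 90x90": 14_353_920,
    "Human Presence 64x64": 8_570_880,
    "Human Presence 128x128": 338_558_976,
}

# Power figures from the "Bringing It Together" column headers.
ECP5_POWER_W = {"ECP5-25": 0.5, "ECP5-45": 0.6, "ECP5-85": 0.8}

# Table cells, read here as frames per second (an assumption).
RATE = {
    "Face Detection 32x32": {"ECP5-25": 3360, "ECP5-45": 4511, "ECP5-85": 5251},
    "Face Detection 90x90": {"ECP5-25": 28, "ECP5-45": 82, "ECP5-85": 101},
    "Human Presence 64x64": {"ECP5-25": 115, "ECP5-45": 161, "ECP5-85": 338},
    "Human Presence 128x128": {"ECP5-25": 2.3, "ECP5-45": 3.5, "ECP5-85": 5.4},
}


def summarize(network, device):
    """Return (billions of multiplications per second, millijoules per frame)."""
    fps = RATE[network][device]
    gmults_per_s = MULTS[network] * fps / 1e9
    mj_per_frame = ECP5_POWER_W[device] / fps * 1e3
    return gmults_per_s, mj_per_frame


for network in RATE:
    for device in ECP5_POWER_W:
        gmults, mj = summarize(network, device)
        print(f"{network:24s} {device}: {gmults:6.3f} Gmult/s, {mj:9.3f} mJ/frame")
```

Treat the energy figures as an upper-level estimate only, since the slides do not say at what utilization the power was measured.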

# **Summary**

- AI at the edge solves real-world problems
- FPGAs can implement AI standalone or in conjunction with other components
- sensAI stack components provide edge AI building blocks
  - Silicon, soft IP, tools, development boards & reference designs
- Configurable engine size and bit widths, coupled with multiple target devices, allow system optimization
  - 1 mW to 1 W
  - 5.5 mm<sup>2</sup> to 100 mm<sup>2</sup>

### Resources

Please visit <u>latticesemi.com/sensAI</u> for more information and downloads

- 4 ECP5-based reference designs / demonstrations: free
- 4 iCE40-based reference designs / demonstrations: free
- CNN Accelerator IP: free evaluation
- CNN Compact Accelerator IP: free
- Neural Network Compiler: free
- Embedded Vision Development Kit: \$199 promotional price
- Himax HM01B0 UPduino Shield: ~\$49, available November

# **Empowering Product Creators to Harness Embedded Vision**



The Embedded Vision Alliance (<a href="https://www.Embedded-Vision.com">www.Embedded-Vision.com</a>) is a partnership of 90+ leading embedded vision technology and services suppliers and solutions providers



Mission: Inspire and empower product creators to incorporate visual intelligence into their products



The Alliance provides low-cost, high-quality technical educational resources for product developers

Register for updates at <a href="https://www.Embedded-Vision.com">www.Embedded-Vision.com</a>

The Alliance enables vision technology providers to grow their businesses through leads, ecosystem partnerships, and insights


For membership, email us: membership@Embedded-Vision.com

# Join us at the Embedded Vision Summit May 20-23, 2019—Santa Clara, California



The only industry event focused on enabling product creators to create "machines that see"

- "Awesome! I was very inspired!"
- "Fantastic. Learned a lot and met great people."
- "Wonderful speakers and informative exhibits!"

### **Embedded Vision Summit 2019 highlights:**

- Inspiring keynotes by leading innovators
- High-quality, practical technical, business and product talks
- Exciting demos of the latest apps and technologies

Visit <a href="https://www.EmbeddedVisionSummit.com">www.EmbeddedVisionSummit.com</a> to sign up for updates



## Q & A



