Up to Our Eyeballs in Deep Learning Processors

This blog post was originally published in the late June 2017 edition of BDTI's InsideDSP newsletter. It is reprinted here with the permission of BDTI.

At the recent Embedded Vision Summit, I was struck by the number of companies talking up their new processors for deep neural network applications. Whether they’re sold as chips, modules, systems, or IP cores, by my count there are roughly 50 companies offering processors for deep learning applications. That’s a staggering figure, considering that there were none just a few years ago.

Even NVIDIA, which has enjoyed wide adoption of its GPUs for deep learning applications, introduced a specialized deep learning engine last month. And, not content with just one deep learning processor, several suppliers offer multiple deep learning architectures – some emphasizing high performance, others targeting low power consumption, extra flexibility, or lower cost.

Is this a passing fad, or something more important? It’s the start of something big.

What we’re seeing today are the early – messy – stages of a fundamental shift in computing. Yes, we’ll still have CPUs running databases and GPUs rendering graphics. But, increasingly, the most important compute tasks will utilize artificial neural network algorithms.

With billions of sensors deployed each year, we’re awash in data, but short on insights. Over the past 50 years, important machine perception problems like speech recognition and image classification have attracted thousands of man-years of research by some of the best minds on the planet. Researchers made slow, steady progress on these problems by creating extremely specialized, complex algorithms, and eventually began to field practical solutions. More recently, deep learning algorithms have been applied to these applications, often yielding shockingly good results.

Another surprising thing about deep learning is that, rather than requiring a highly specialized algorithm for each task, a fairly small set of algorithms can solve a wide range of problems. With traditional, highly specialized algorithms, only a few of the most important problems (like speech recognition) could be addressed, because the process was so labor-intensive. Deep learning algorithms offer the possibility of creating solutions with much less engineering effort, enabling people to solve niche problems that previously couldn’t justify the investment required for a one-off solution. (A great example of this is Jacques Mattheij’s LEGO-sorting machine.)

So, in summary, deep neural networks are outperforming painstakingly honed classical techniques on “grand challenge” problems like image classification and speech recognition, and are enabling viable solutions for thousands of other problems that don’t merit huge investments in algorithm development. As a result, we should expect deep learning to be deployed everywhere: In home assistants like the Amazon Echo, in cars for safety and user interface enhancements, in mobile phones for improved photography, and of course in video cameras for security, retail analytics, energy management, and more.

Will all of these deep learning applications require new, specialized processors? No. Applications with modest processing demands (such as those processing occasional still images rather than video streams) and those with generous cost and power consumption budgets will get by just fine with existing types of general-purpose processors.

But for the many deep learning applications that combine high processing demands with significant cost or power constraints, specialized processors are a no-brainer. Over decades, processor designers have found that for demanding, parallelizable workloads, specialized processors easily gain an order of magnitude advantage in cost- and energy-efficiency over general-purpose processors. The initial generations of deep learning processors bear this out.

So, there really is a huge opportunity here. Actually there are two huge sets of opportunities: First, by using specialized processors, system developers will be able to deploy machine perception in places where it otherwise wouldn’t fit, bringing powerful new perception capabilities to many types of devices. Second, processor suppliers will find many homes for specialized deep learning processors.

Does this mean that the market can support 50 suppliers of specialized deep learning processors? No. We’re in an initial phase of experimentation and innovation, which will inevitably be followed by a winnowing and consolidation, just as we’ve seen before in other processor categories such as CPUs. However, the winners of this round of processor competition won’t necessarily be the same as the winners of earlier rounds. Deep learning is quite different from other types of workloads, and as a result, many of the legacy advantages of established processor vendors, such as compatibility and widespread programmer familiarity, don’t count for much. (Of course, other advantages of legacy suppliers, such as customer relationships, still apply.)

Regardless of which processor suppliers prevail, system designers and their customers will reap big rewards in the form of systems that are safer, more autonomous, easier to use, and more capable.

If you’re developing vision algorithms or applications, check out the new full-day, hands-on training class, “Deep Learning for Computer Vision with TensorFlow,” presented by the Embedded Vision Alliance in Santa Clara, California on July 13th, and in Hamburg, Germany on September 7th. For details, visit the event web page.

Jeff Bier
Co-Founder and President, BDTI
Founder, Embedded Vision Alliance