Page 12 - EE Times Europe Magazine | June 2020

           ARTIFICIAL INTELLIGENCE
           Startup Reinvents Neural Network Math, Launches 20-mW Edge AI Chip


           By Sally Ward-Foxton


           A Silicon Valley startup claims that it has reinvented the
           mathematics of neural networks and has produced a complementary
           edge AI chip, already sampling, that does not use the usual large
           array of multiply-accumulate (MAC) units. The chip can run the
           equivalent of 4 TOPS, with impressive power efficiency of
           55 TOPS/W, and achieves data-center class inference in under
           20 mW (YOLOv3 at 30 frames per second), according to the company.
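
As a rough sanity check on how these headline figures relate to one another, the arithmetic can be sketched in a few lines of Python. The constant and function names below are illustrative assumptions, not anything published by Perceive, and the calculation assumes the quoted 55 TOPS/W holds uniformly across operating points, which the company has not stated.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# All names are hypothetical; only the four numbers come from the text.

TOPS_PER_WATT = 55.0     # claimed efficiency (assumed constant here)
PEAK_TOPS = 4.0          # claimed GPU-equivalent throughput
POWER_BUDGET_W = 0.020   # 20 mW power budget
FRAME_RATE_FPS = 30.0    # YOLOv3 frame rate


def power_at_throughput_w(tops: float, tops_per_watt: float) -> float:
    """Power in watts needed to sustain `tops` at a given efficiency."""
    return tops / tops_per_watt


def throughput_at_power_tops(power_w: float, tops_per_watt: float) -> float:
    """Throughput in TOPS available within `power_w` at a given efficiency."""
    return power_w * tops_per_watt


def energy_per_frame_j(power_w: float, fps: float) -> float:
    """Energy in joules spent on each frame at a steady power draw."""
    return power_w / fps


if __name__ == "__main__":
    full_rate_mw = power_at_throughput_w(PEAK_TOPS, TOPS_PER_WATT) * 1e3
    budget_tops = throughput_at_power_tops(POWER_BUDGET_W, TOPS_PER_WATT)
    frame_mj = energy_per_frame_j(POWER_BUDGET_W, FRAME_RATE_FPS) * 1e3
    print(f"Power to sustain {PEAK_TOPS} TOPS: {full_rate_mw:.1f} mW")
    print(f"Throughput within 20 mW: {budget_tops:.2f} TOPS")
    print(f"Energy per frame at 30 fps: {frame_mj:.3f} mJ")
```

Under those assumptions, sustaining the full 4 TOPS at 55 TOPS/W would draw roughly 73 mW, while a 20-mW budget corresponds to about 1.1 TOPS and about 0.67 mJ per frame at 30 fps. This is arithmetic on the quoted numbers only, not a measurement of the chip.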
             San Jose-based Perceive has been in super-stealth mode
           until now; as a spin-out from Xperi, it has been funded
           entirely by its parent since officially forming two years ago.
           The team comprises 41 people, with a similar number within
           Xperi working on apps for the chip. Founding CEO Steve Teig
           is also CTO of Xperi; he was previously founder and CTO
           of Tabula, a 3D programmable logic startup that closed its
           doors five years ago, and prior to that was CTO of Cadence.
             Teig explained that the initial idea was to combine Xperi’s
           classical knowledge of image and audio processing with machine
           learning. Xperi owns brands such as DTS, IMAX Enhanced, and HD
           Radio. Its technology portfolio includes image-processing software
           for features like photo red-eye reduction and image stabilization,
           which are widely used in digital cameras, plus audio-processing
           software for Blu-ray disc players.
             “We started with a clean sheet of paper and used information
           theory to ask: What computations are neural networks actually
           doing, and is there a different way of approaching that
           computation that could change what is possible [at the edge]?”
           Teig said. “After a couple of years of doing this work, we
           discovered [there] was and then decided … we should make a chip
           that embodies these ideas.”

           Perceive’s Steve Teig

             The idea that Teig presented to the Xperi board was to spin out
           a company to make a chip that could do meaningful inference in
           edge devices with a power budget of 20 mW. The result, a 7 × 7-mm
           chip named Ergo, can run 4 TOPS without external RAM (in fact, it
           is running the equivalent of what a GPU rated at 4 TOPS can
           achieve, Teig said). Ergo supports many styles of neural networks,
           including convolutional networks (CNNs) and recurrent neural
           networks (RNNs), in contrast with many solutions on the market
           that are tailored for CNNs. Ergo can even run several
           heterogeneous networks simultaneously.
             “The only thing that limits how many networks we can run is the
           total memory that’s required for the combination,” Teig said,
           adding that Perceive has demonstrated simultaneously running
           YOLOv3 or M2Det (with 60 million or 70 million parameters) plus
           ResNet 28 with several million parameters, plus a long short-term
           memory (LSTM) network or RNN to do speech and audio processing. In
           an application, this might correspond to enabling imaging and
           audio inference at the same time.
             Perceive also claims that its Ergo chip is extraordinarily
           power-efficient, achieving 55 TOPS/W. This figure is an order of
           magnitude above what some competitors are claiming. Perceive’s
           figures have it running YOLOv3, a large network with 64 million
           parameters, at 30 fps while consuming just 20 mW.

           Perceive claims that its Ergo chip’s efficiency is up to
           55 TOPS/W, running YOLOv3 at 30 fps with just 20 mW.
           (Image: Perceive)

             This power efficiency comes down to some aggressive power-gating
           and clock-gating techniques, which exploit the deterministic
           nature of neural network processing; unlike other types of code,
           there are no branches, so timings are known at compile time. This
           allows Perceive to be precise about what needs to be switched on
           and when.
             “In a battery-powered setting, [the chip] can be literally off,
           zero milliwatts, and have some kind of microwatt motion sensor or
           analog microphone to detect something that might be of interest,”
           Teig said. “We can wake up from off, load a giant neural network
           of data-center class, and be running it in about 50 milliseconds,
           including decryption. So we leave only about two frames of video
           on the floor.”
             But careful hardware design is only part of the picture.

           INFORMATION THEORY
             “We’ve come up with a different way of representing the
           underlying computation itself and the arithmetic that goes with
           it,” Teig said. “We are representing the network itself in a new
           way, and that’s where our advantage comes from.”
             Perceive started with information theory, a branch of science
           that includes mathematical ways to distinguish signal from noise,
           and used its concepts to look at how much computation is required
           to pull the signal from the noise. Teig uses an object-detection
           network as an example.
             “You hand the network millions of pixels, and all you want to
           know is: Is there a dog in this picture or not?” he said.
           “Everything else in the picture is noise, except dog-ness [the
           signal]. Information theory makes it quantifiable: How much do you
           have to know [to tell whether there is a dog in the picture]? You
           can actually make it precise, mathematically.”
             As Teig describes it, mainstream neural networks are able to
           generalize based on seeing many pictures of dogs because they have
           found at least some of the signal in the noise, but this has been
           done in an empirical way rather than with a mathematically
           rigorous approach. This means noise is
