operation into a matrix multiplication, including convolutions and fully connected nets,” said LeCun. “[It] is a challenge for the hardware community to create architectures that don’t lose performance by using batch size = 1. That applies to training, of course; the optimal size of batch for training is 1. We use more because our hardware forces us to do so.”
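To make the lowering LeCun refers to concrete, here is a minimal NumPy sketch (my own illustration; function names and shapes are arbitrary) that expresses a single-image convolution, batch size = 1, as one matrix multiplication via the common im2col trick.

```python
import numpy as np

def conv2d_as_matmul(image, kernels):
    """Lower a 2D convolution (batch size = 1) to a single matrix multiplication.

    image:   (C_in, H, W) input feature map
    kernels: (C_out, C_in, K, K) convolution weights
    Returns: (C_out, H_out, W_out) output (no padding, stride 1).
    """
    c_in, h, w = image.shape
    c_out, _, k, _ = kernels.shape
    h_out, w_out = h - k + 1, w - k + 1

    # im2col: unfold every KxK receptive field into one column.
    cols = np.empty((c_in * k * k, h_out * w_out))
    idx = 0
    for i in range(h_out):
        for j in range(w_out):
            cols[:, idx] = image[:, i:i + k, j:j + k].ravel()
            idx += 1

    # One GEMM does all the work: (C_out, C_in*K*K) x (C_in*K*K, H_out*W_out).
    out = kernels.reshape(c_out, -1) @ cols
    return out.reshape(c_out, h_out, w_out)

# With batch size = 1, the GEMM has no batch dimension to amortize over,
# which is what makes it hard for wide hardware to stay efficient.
y = conv2d_as_matmul(np.random.rand(3, 32, 32), np.random.rand(8, 3, 5, 5))
print(y.shape)  # (8, 28, 28)
```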
SELF-SUPERVISED LEARNING
Another challenge for hardware is that the learning paradigms we currently use will change, and this will happen imminently, according to LeCun.

“There is a lot of work [being done] on trying to get machines to learn more like humans and animals, and humans and animals don’t learn by supervised learning or even by reinforcement learning,” he said. “They learn by something I call self-supervised learning, which is mostly by observation.”

LeCun described a common approach to self-supervised learning in which a piece of the sample is masked and the system is trained to predict the content of the masked piece based on the part of the sample that’s available. This is commonly used with images, where part of the image is removed, and with text, where one or more words are blanked out. Work so far has shown that it is particularly effective for NLP; the type of networks used, transformers, have a training phase that uses self-supervised learning.
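As a rough sketch of that masked-prediction recipe (my own toy example; MASK_ID, the masking rate, and the token IDs are arbitrary choices, not any particular model’s):

```python
import numpy as np

rng = np.random.default_rng(0)
MASK_ID = 0  # hypothetical ID reserved for the mask token

def mask_tokens(token_ids, mask_prob=0.15):
    """Blank out a random subset of tokens; the originals become the training targets."""
    token_ids = np.asarray(token_ids)
    mask = rng.random(token_ids.shape) < mask_prob
    inputs = np.where(mask, MASK_ID, token_ids)   # what the model sees
    targets = np.where(mask, token_ids, -1)       # -1 = "no loss at this position"
    return inputs, targets

# The model is trained to recover `targets` at the masked positions from `inputs`,
# so the unlabeled data itself supplies the supervision signal.
inputs, targets = mask_tokens([17, 42, 7, 99, 5, 23, 61, 8])
print(inputs, targets)
```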
The trouble from a hardware perspective is that transformer networks for NLP can be enormous: The biggest ones today have 5 billion parameters and are growing fast, said LeCun. The networks are so big that they don’t fit into GPU memories and have to be broken into pieces.
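A quick back-of-envelope calculation, using my own assumptions (fp32 weights, Adam-style optimizer state, activations ignored), suggests why such a model has to be split across devices:

```python
# Why a 5-billion-parameter transformer outgrows a single GPU during training.
params = 5e9
bytes_per_param = 4                       # fp32
weights_gb = params * bytes_per_param / 2**30
optimizer_gb = weights_gb * 3             # gradients + Adam first/second moments (assumption)
print(f"weights ~{weights_gb:.0f} GB, plus ~{optimizer_gb:.0f} GB of training state, "
      f"vs. a 16-32 GB GPU -> the model must be partitioned across devices")
```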
“Self-supervised learning is the future — there is no question [about that],” he said. “But this is a challenge for the hardware community because the memory requirements are absolutely gigantic. Because these systems are trained with unlabeled data, which is abundant, we can train very large networks in terms of data. Hardware requirements for the final system will be much, much bigger than they currently are. The hardware race will not stop any time soon.”

HARDWARE TRENDS
New hardware ideas that use techniques such as analog computing, spintronics, and optical systems are on LeCun’s radar. He cited communication difficulties — problems converting signals between novel hardware and the rest of the required computing infrastructure — as a big drawback. Analog implementations, he said, rely on making activations extremely sparse in order to gain advantages in energy consumption, and he questioned whether this will always be possible.
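As a toy illustration of what “extremely sparse activations” would have to mean, the sketch below (a made-up fully connected layer with ReLU; sizes and weight scale are arbitrary) measures the fraction of zero activations, which for an ordinary layer tends to sit near one half rather than near one:

```python
import numpy as np

# Measure activation sparsity for one random ReLU layer at batch size = 1.
rng = np.random.default_rng(1)
x = rng.standard_normal((1, 512))            # one input sample
w = rng.standard_normal((512, 1024)) * 0.05  # arbitrary weight scale
activations = np.maximum(x @ w, 0.0)         # ReLU zeroes out negative pre-activations
sparsity = np.mean(activations == 0.0)
print(f"fraction of zero activations: {sparsity:.2f}")  # ~0.5 here, not "extremely sparse"
```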
LeCun described himself as “skeptical” of futuristic new approaches such as spiking neural networks and neuromorphic computing in general. There is a need to prove that the algorithms work before building chips for them, he said.

“Driving the design of such systems through hardware, hoping that someone will come up with an algorithm that will use this hardware, is probably not a good idea,” LeCun said. ■

Sally Ward-Foxton is a staff correspondent at AspenCore.


A Neural-Network Processing Timeline

Late 1980s: Resistor arrays are used to do matrix multiplication. By the late 1980s, the arrays have gained amplifiers and converters around them but are still quite primitive by today’s standards. The limitation is how fast data can be fed into the chip.

1991: The first chip designed for convolutional neural networks (CNNs) is built. The chip is capable of […] giga operations per second (GOPS) on binary data, with digital shift registers that minimize the amount of external traffic needed to perform a convolution, thereby speeding up operation. The chip does not see use beyond academia.

1992: ANNA, an analog neural network (ANN) chip, debuts. Designed for CNNs with 6-bit weights and 3-bit activations, ANNA contains 180,000 transistors in 0.9-µm CMOS. It is used for optical character recognition of handwritten text.

1996: A digital version of ANNA is released, but with neural networks falling out of favor by the mid-1990s, it is eventually repurposed for signal processing in cellphone towers.

2009–2010: Researchers demonstrate a hardware neural network accelerator on an FPGA, the Xilinx Virtex-6. It runs a demo of semantic segmentation for automated driving and is capable of […] GOPS at about […] W. The team, from Purdue University, tries to make an ASIC based on this work, but the project proves unsuccessful.

Source: Yann LeCun, Facebook

























