Page 24 - EE Times Europe Magazine – November 2023



GREENER ELECTRONICS | PROCESSING

How to Make Generative AI Greener

By Anne-Françoise Pelé


Artificial intelligence is an unstoppable force that is starting to permeate all aspects of our society.

The advent of ChatGPT and similar generative AI tools has taken the world by storm. While many have raved about the capabilities of these generative AI tools, the environmental costs and impact of these models are too often ignored. The development and use of these systems have been extremely energy-intensive, and their physical infrastructure requires a great deal of energy.

NeuReality's Moshe Tanach

Deploying AI creates massive technical challenges for the traditional CPU-centric computing architecture. Data is moved multiple times between the network, CPU and deep-learning accelerator (DLA) with software-based management and data control. This creates multiple conflicts between parallel commands, which limits the DLA's utilization, wastes valuable hardware resources and increases costs and power consumption.

How can we harness the benefits of AI while mitigating its carbon footprint? In a discussion with EE Times Europe, Moshe Tanach, CEO and co-founder of NeuReality, said the key to reducing AI's carbon emissions lies in streamlining operations and bolstering efficiency. He argued that the transition from a resource-intensive CPU-centric model to NeuReality's AI-centric model and server-on-a-chip solution can lower cost, reduce energy consumption and increase throughput.

EE TIMES EUROPE: What exactly is inference AI, and how does it relate to generative AI with large language models [LLMs] like ChatGPT?

Moshe Tanach: I'll break it all down to explain why inference AI and NeuReality's specific technology system are relevant to the economic viability of generative AI, ChatGPT and other LLMs like it.

First, any neural network model always complies with an underlying architecture, such as a CNN [convolutional neural network], RNN [recurrent neural network], LSTM [long short-term memory] and now the transformer-based models [encoder/decoder] used in LLMs and generative AI. With it, you can generate language, images and other possibilities in the future, and you can let it run as long as you want, giving it new context or new input. That's why you see the "regenerate" function in ChatGPT. So generative AI is yet another example of a neural network model, or AI category.

Second, all neural network models, no matter which ones they are, must be trained to perform their intended tasks. The developer feeds the model a curated dataset so that it can "learn" everything it needs to about the type of data it will analyze. ChatGPT [generative pre-trained transformer] excels at analyzing and then generating human-like text. ChatGPT was trained with all the data from the internet. Once it consumed all that internet content and found all the connection points between different letters and words, all that data became structured inside ChatGPT.

Third, once the model is frozen and you give it new context or input, you are doing inference, the process of using a trained model. To understand inference, imagine teaching someone to identify musical instruments by their sound. You start by playing a guitar, a violin and a ukulele, and you explain that these instruments produce different sounds. Later, when you introduce a banjo, the person can infer that it produces a unique sound similar to the guitar, violin and ukulele, since they're all string instruments.

NeuReality is specifically focused on the inference phase, not the training of complex AI models. Instead, we create the underlying architecture and technology stack for AI-centric inference in the data center to achieve the best performance at lower cost and energy consumption, and we make it easy to use and deploy so all businesses can benefit.

EETE: How does NeuReality's inference AI solution help solve generative AI problems?

Tanach: Imagine billions of AI queries made daily on an LLM like ChatGPT and others like it.

The amount of compute power required to classify, analyze and answer those AI queries is astronomical, as are the system costs, inefficiencies and carbon emissions compared with traditional models. It's well publicized by Microsoft and OpenAI themselves that it costs millions of dollars per day to run ChatGPT alone.

In fact, generative AI requires 10× less input than general-purpose CPU-centric systems. NeuReality has designed its network addressable processing units [NAPUs] to operate on far less power. Therefore, we help companies save resources while softening the burden on the world's energy systems, as validated in test cases with IBM Research.
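The three phases Tanach describes (pick an architecture, train it on curated data, then freeze the weights and serve inference on new input) can be sketched in miniature. The single-neuron perceptron below is a toy stand-in chosen for illustration only; it is not NeuReality's stack or anything resembling an LLM:

```python
# Sketch of the train-then-freeze-then-infer pipeline.
# The "architecture" is a single linear neuron (a perceptron).

def train(samples, labels, lr=0.1, epochs=100):
    """Training phase: adjust weights to fit labeled examples."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(samples, labels):
            pred = 1.0 if w[0] * x1 + w[1] * x2 + b > 0 else 0.0
            err = y - pred                    # perceptron update rule
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b                               # the "frozen" model

def infer(model, x1, x2):
    """Inference phase: weights are fixed; only new input flows through."""
    w, b = model
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

# Learn a simple AND-like rule from four labeled examples,
# then answer a new query against the frozen weights.
model = train([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 0, 0, 1])
print(infer(model, 1, 1))  # → 1, answered without any retraining
```

The point of the split is the one Tanach makes: training is a one-time (and expensive) fitting process, while inference is the cheap-per-query loop that runs billions of times in production, which is why it dominates the deployed cost and energy picture.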
