How to Make Generative AI Greener
EETE: Why is mitigating inference’s environmental impact crucial to scaling generative AI models in business applications effectively?

Tanach: Generative AI suffers from the same CPU-centric architecture as other models, including image classification, natural-language processing, recommendation systems and anomaly-detection models. NeuReality is reinventing inference AI to meet the current and future demands of generative AI and of all the other models that rely on inference, so they can scale without bleeding money. When a company relies on a CPU to manage inference in deep-learning models, no matter how powerful the DLA, that CPU will eventually hit its performance ceiling.

In contrast, NeuReality’s AI solution stack does not buckle under the weight. The system architecture runs far more efficiently and effectively with less energy consumed.
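To see why the host CPU puts a hard ceiling on inference throughput no matter how fast the accelerator gets, consider a minimal bottleneck model. The sketch below is an editorial illustration, not NeuReality data; the per-query millisecond costs are assumed values chosen only to show the shape of the effect.

    # Illustrative only: per-query service times are assumptions, not measurements.
    # Each inference query needs CPU-side work (networking, pre/post-processing,
    # scheduling) plus DLA compute. With one CPU host feeding one DLA, throughput
    # is bounded by the slower stage, so speeding up only the DLA hits a ceiling.

    def max_queries_per_second(cpu_ms_per_query: float, dla_ms_per_query: float) -> float:
        """Pipeline throughput is limited by the slowest stage."""
        return 1000.0 / max(cpu_ms_per_query, dla_ms_per_query)

    cpu_ms = 2.0  # assumed fixed CPU-side cost per query
    for dla_ms in (8.0, 4.0, 2.0, 1.0, 0.5):  # progressively faster accelerators
        qps = max_queries_per_second(cpu_ms, dla_ms)
        dla_util = dla_ms / max(cpu_ms, dla_ms)
        print(f"DLA {dla_ms:4.1f} ms/query -> {qps:5.0f} QPS, DLA utilization {dla_util:.0%}")

Once the DLA runs faster than the fixed CPU-side cost, throughput stalls and the accelerator sits increasingly idle; the CPU-side work has to move off the host for performance and efficiency to keep scaling.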
EETE: What is the carbon footprint from training generative AI models?

Tanach: NeuReality’s AI-centric architecture, with its more energy-efficient NAPUs (a new class of custom AI chip), reduces power consumption significantly. In contrast, today’s generative AI and LLMs pose significant environmental concerns due to their high energy usage and resulting carbon emissions. Analysts suggest the carbon footprint of a single AI query could be 4× to 5× that of a typical search engine query. With daily consumption estimated at 1.17 million GPU-hours, equating to 150,000 server-node-hours, ChatGPT reportedly emits about 55 tons of CO2 equivalent daily. This is comparable to the lifetime emissions of an average car, accumulating to the equivalent of 365 cars’ lifetime emissions each year, assuming steady usage.

Below are three studies outlining the current, negative environmental impact of today’s CPU- and GPU-centric generative AI models:
• In 2019, University of Massachusetts Amherst researchers trained several LLMs and found that training a single AI model can emit over 626,000 pounds [about 284,000 kg] of CO2, equivalent to the emissions of five cars over their lifetimes, as reported in MIT Technology Review (tinyurl.com/z28erurk).
• A more recent study (tinyurl.com/bdz9rtx5) made a similar analogy. It reported that training GPT-3, with its 175 billion parameters, consumed 1,287 MWh of electricity and resulted in emissions of 502 metric tons of carbon. That’s like driving 112 gasoline-powered cars for a year.
• Microsoft outlines the cost of Azure instances for such calculations (tinyurl.com/rvp46re2).
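The car analogies in these studies reduce to straightforward arithmetic, reproduced in the short sketch below. The per-car constants (roughly 57 metric tons of CO2 over a car’s lifetime and 4.5 metric tons per year of driving) are commonly cited approximations, not figures taken from the studies themselves.

    # Rough arithmetic behind the car analogies above. The per-car constants are
    # commonly used approximations, not numbers from the cited studies.

    CAR_LIFETIME_T = 57.0  # approx. lifetime CO2 of an average car, metric tons
    CAR_PER_YEAR_T = 4.5   # approx. annual CO2 of one gasoline car, metric tons

    # ChatGPT analogy: ~55 t CO2e per day is roughly one car lifetime per day,
    # so a year of steady usage is on the order of 365 car lifetimes.
    chatgpt_daily_t = 55.0
    print(f"ChatGPT: {chatgpt_daily_t / CAR_LIFETIME_T:.2f} car lifetimes per day, "
          f"about {chatgpt_daily_t * 365 / CAR_LIFETIME_T:.0f} per year")

    # UMass Amherst analogy: 626,000 lb of CO2 versus five car lifetimes.
    umass_t = 626_000 * 0.4536 / 1000  # pounds to metric tons (~284 t)
    print(f"UMass study: {umass_t:.0f} t, about {umass_t / CAR_LIFETIME_T:.1f} car lifetimes")

    # GPT-3 analogy: 502 t of emissions versus a year of driving gasoline cars.
    gpt3_t = 502.0
    print(f"GPT-3 training: about {gpt3_t / CAR_PER_YEAR_T:.0f} car-years of driving")

With these rough constants, the results land close to the figures quoted above, so the analogies are at least mutually consistent.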
EETE: How can we make these models more performant than their predecessors without imposing a heavier toll on the environment?

Tanach: We have a strong sense of urgency around building higher-performing, less expensive inference AI solutions that also reduce our carbon footprint. It’s an “and,” not an “or.” In this way, we can sustainably meet the current and future demands of generative AI and of other AI applications for fraud detection, translation services, chatbots and more.

Today’s infrastructure falls short in two main ways:
• The system architecture uses non-AI-specific hardware; therefore, it can’t do the real job of the inference server.
• Even though the deep-learning model itself is offloaded from software to hardware, too many surrounding functions still run in software. The offloading is not complete enough to be truly energy-efficient.

These system deficiencies lower the utilization of the GPUs and DLAs used today, and the lack of efficiency takes a heavier toll on energy consumption and therefore on the environment.

NeuReality makes these models perform better and more affordably while actually decreasing the impact on the environment. We designed our system architecture for AI from the start, as opposed to modifying the old architecture. Our new NAPU offloads the leftover computing functions, waterfalling them to Arm cores, which are less expensive and less power-hungry. By removing that CPU bottleneck, we also increase DLA utilization.
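Why does utilization matter so much for energy? An inference server draws much of its power whether or not the accelerator is busy, so the energy charged to each query falls roughly in proportion to how well the DLA is fed. The power and throughput figures in the sketch below are illustrative assumptions, not NeuReality or vendor specifications.

    # Illustrative energy model: all power and throughput numbers are assumptions,
    # chosen only to show how accelerator utilization drives energy per query.

    def joules_per_query(server_power_w: float, peak_qps: float, utilization: float) -> float:
        """Energy per query, assuming the server draws roughly constant wall power."""
        effective_qps = peak_qps * utilization
        return server_power_w / effective_qps

    SERVER_POWER_W = 800.0  # assumed wall power of one inference server
    PEAK_QPS = 2000.0       # assumed throughput if the DLA were never starved

    for util in (0.3, 0.5, 0.9):  # CPU-bottlenecked versus well-fed accelerator
        e = joules_per_query(SERVER_POWER_W, PEAK_QPS, util)
        print(f"DLA utilization {util:.0%}: {e:.2f} J per query")

Under these assumptions, moving from 30% to 90% utilization cuts energy per query by about a factor of three, which is equivalent to serving the same traffic with roughly a third of the server-hours and the associated emissions.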
Taken together, all these contributors make the AI-centric solutions run better without imposing a heavier toll on the environment.