Page 14 - EE Times Europe Magazine – November 2023
CTO INTERVIEWS
AMD’s Mark Papermaster: ‘We Reengineered Our Engineering Processes’ to Enable Modular Design
“Supercomputing is about heterogeneous computing,” says Papermaster.

Mark Papermaster joined AMD in October 2011 and is now CTO and EVP. He juggles responsibilities ranging from product development to technical direction in areas including microprocessor design, I/O and memory, system-on-chip methodology and advanced research. He also oversees the IT department that delivers the company’s computing infrastructure and services.
During his more than 40 years in the industry, Papermaster has held leadership roles at Cisco, where he led the silicon engineering group, and Apple, where he was SVP of devices hardware engineering. He also worked at IBM for 26 years, where he performed several roles in technology and server development.
At AMD, Papermaster led the redesign of engineering processes and the development of the award-winning Zen high-performance x86 CPU family and high-performance GPUs.

EE TIMES EUROPE: AMD has been enjoying success in the supercomputer market. Can you tell us about that?

Mark Papermaster: Supercomputing has been a major focus at AMD. We began restoring our CPU roadmap about a decade ago. We reengineered our engineering processes, and one of the things we settled on was a more modular design approach, where we develop reusable pieces that we could then put together based on [an application’s] needs.

We invested in a new line of high-performance CPUs, and we also launched an effort that brought our GPUs up to higher performance. Both types of processing units are important because supercomputing is about heterogeneous computing. It’s about CPUs and GPUs working together in harmony to take on the heaviest lifting out there.

The first big demonstration that we had the right strategy was with the U.S. Department of Energy, where we presented the underlying concepts that would allow us to do what they needed. They really liked it, and we ended up winning the bid for what is now the world’s largest supercomputer. That’s the computer called Frontier in Oak Ridge National Lab. It’s over an exaFLOPS of computing, which is a thousand times a thousand [tera]FLOPS of computing. It’s really a monster. You need that kind of computing for the most difficult simulations, such as highly accurate weather forecasting and computational fluid dynamics.

We now offer this technology commercially, and we’ve had tremendous success in the market. In fact, we grew 29% year over year on the TOP500 supercomputer list, and we power seven of the top 10 supercomputers among the TOP500 green supercomputers. That’s because we’re also very focused on being energy-efficient as we provide that highest computer performance.

This is the story of the AMD turnaround you’ve witnessed in recent years, and we don’t plan to slow down. We have a roadmap that will lead us to bigger and better things.

EETE: You mentioned green computing, which is especially important here in Europe. Can you tell us a little bit more about how you’ve become more energy-efficient?

Papermaster: First and foremost, energy efficiency is part and parcel of our design process, and that’s a different way of thinking. As you’ll recall, Moore’s Law is the adage that transistor density (and consequently, the performance of the devices that use transistors) would double about every 24 months. The energy efficiency would increase accordingly.

That slowed down because of physics: the transistors are hitting molecular limits, which means the old way of putting transistors together doesn’t scale like it used to. It demands more innovation in how energy is used. We think the right approach is what we call holistic design: thinking about energy efficiency and high performance together.

When you design a new computer chip, you consider everything from the manufacturing process to application development and deployment. You need to work closely with your manufacturer already during the design phase, for example, as you architect in controls that cause transistors to shut off and stop consuming energy when they are not needed by the task you’re running.

You also need to consider applications during the design phase so you can develop the circuitry they need. And once you deliver the hardware, you have to help application developers make the most of it. In the case of AI, we have advanced algorithms and math formats that run approximations, which leads to more energy-efficient AI, and it goes all the way up the stack. So holistic design means that you’re thinking about both performance and energy efficiency on every aspect of the design process all the way up to application.

An example of what I just described is LUMI in Finland, the most powerful supercomputer in Europe and third in the world, according to the most recent TOP500 list. LUMI is an AMD CPU- and AMD GPU-based supercomputer.

We have great partnerships with LUMI, the University of Turku and the Allen Institute. Working with these partners, we were able to gear up LUMI to efficiently run AI workloads. They are now using LUMI to train large language models on Finnish and other languages.