Page 40 - EE Times Europe Magazine – June 2024
Nvidia GTC 2024: Why Nvidia Dominates AI
Nvidia is reaping similar advantages from its activities in chip and hardware design and the CUDA software platform with additional AI software development.

KEY ANNOUNCEMENTS AT GTC 2024
GTC 2024 was primarily a technical conference for GPU developers, as evidenced by its more than 900 technical sessions and over 20 workshops. More than 300 exhibitors showcased their hardware, software and services focused on GPU-based market opportunities.
The many announcements at GTC 2024 covered all of Nvidia's product lines. Table 2 summarizes most of the announcements.

Blackwell GPU
Topping the list of announcements was the Blackwell GPU, which Nvidia claims is the most powerful processor chip available. Nvidia is positioning Blackwell as the next generation of accelerated computing and as the processor for the generative AI era. Blackwell has a second-generation transformer engine with support for FP4–FP6 data types in the Tensor cores. Most generative AI models primarily need low-precision calculations, and Blackwell's performance is greatly improved under such conditions.
The new GPU architecture is named after David Harold Blackwell. A University of California, Berkeley mathematician specializing in game theory and statistics, he was the first Black scholar inducted into the National Academy of Sciences.
Blackwell GPUs include a dedicated engine for reliability, availability and serviceability (RAS). The RAS feature is especially important for automotive and other systems in which a failure can lead to a loss of life. Blackwell also adds chip-level capabilities for using AI-based preventive maintenance to run diagnostics and forecast reliability issues. This improves system uptime and resiliency, enabling massive-scale AI deployments to run uninterrupted for weeks or months at a time and reducing operating costs.
Nvidia said Blackwell is being adopted by every major global cloud services provider, pioneering AI companies, system and server vendors, and regional cloud service providers. Nvidia believes Blackwell will be the most successful product launch in its history. The cloud players need high-performance systems and connections based on Blackwell. Some of these are summarized below.

DGX SuperPOD
Nvidia announced its next-generation AI supercomputer: the Nvidia DGX SuperPOD. It is powered by Nvidia DGX GB200 Grace Blackwell superchips and can process trillion-parameter models for large-scale generative AI training and inference workloads. It provides 11.5 exaFLOPS using FP4 data types and 240 TB of fast memory, and it can scale to higher performance with additional racks of DGX systems.

NVLink Switch chip
The NVLink Switch chip is Nvidia's high-speed network link for connecting multiple GPU processors and systems. It is a complex chip with 50 billion transistors, manufactured by TSMC using 4-nm design rules. Each NVLink Switch can connect four NVLink interconnects at 1.8 TB/s.
NVLink Switch and GB200 are key components for creating giant GPUs. The Nvidia GB200 NVL72 is a multi-node, liquid-cooled, rack-scale system that harnesses Blackwell to offer supercharged compute for trillion-parameter models, with 720 petaFLOPS of AI training performance and 1.4 exaFLOPS of AI inference performance in a single rack.

NIM runtime software
Nvidia Inference Microservices (NIM) are secure software packages built from Nvidia's accelerated computing libraries and generative AI models.
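To make the low-precision point above concrete, the following Python sketch rounds weights onto a 4-bit floating-point grid of the FP4 (E2M1) kind that Blackwell's Tensor cores support. The grid of representable magnitudes and the round-to-nearest rule are illustrative assumptions, not Nvidia's actual quantization recipe.

```python
# Positive magnitudes representable in an E2M1 (1 sign, 2 exponent,
# 1 mantissa bit) FP4 format -- an assumption for illustration.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(x: float, scale: float = 1.0) -> float:
    """Scale x, snap its magnitude to the nearest FP4 value,
    restore the sign, then undo the scaling."""
    s = abs(x) / scale
    nearest = min(FP4_MAGNITUDES, key=lambda m: abs(s - m))
    return (nearest if x >= 0 else -nearest) * scale

weights = [0.07, -0.42, 1.9, 2.6, -5.2]
print([quantize_fp4(w) for w in weights])  # [0.0, -0.5, 2.0, 3.0, -6.0]
```

With only eight magnitudes per sign, each weight occupies 4 bits instead of 16 or 32, which is why FP4 arithmetic can deliver the large throughput and memory savings the article describes for generative AI inference.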
Table 2: Announcement Summary

Blackwell GPU
• Latest GPU microarchitecture: 16th generation since 1995
• Named after David Harold Blackwell, a UC Berkeley mathematician

Nvidia's positioning of Blackwell
• Blackwell is the processor for the generative AI era
• Blackwell is the next generation of accelerated computing

Blackwell vs. Hopper
• 2.5× faster in FP8 AI training and 5× faster in FP4 AI inferencing; up to 576 GPUs
• 25× better energy efficiency; first time Nvidia has focused on energy consumption

DGX GB200 systems
• Nvidia Quantum-X800 InfiniBand and Spectrum-X800 Ethernet: 800 GB/s

DGX SuperPOD
• Built from DGX GB200 systems into large rack systems
• 11.5-exaFLOPS performance (FP4); trillion-parameter AI models; 240-TB memory

NVLink Switch
• High-speed network link of multiple GPUs: up to 900 GB/s

Omniverse Cloud API
• Nvidia Omniverse Cloud as APIs, providing digital twin simulation capabilities

NIM runtime software
• Nvidia Inference Microservices: secure packages of AI software services
• Built from Nvidia's accelerated computing libraries and generative AI models

6G Research Cloud
• Generative AI and Omniverse-powered platform for 6G technology simulation

Weather prediction
• Nvidia Earth Climate Digital Twin cloud platform; available now, 2-km scale
• Earth-2 APIs for Nvidia CorrDiff generative AI weather model

Blackwell-based Thor
• Drive Thor: the next-generation AV platform and successor to Drive Orin
• Jetson Thor: the next-generation robotics platform and successor to Jetson

FP4 = floating point with 4-bit accuracy
(Source: VSI Labs, April 2024)
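The SuperPOD and NVL72 figures above can be sanity-checked with quick arithmetic. In this sketch, the 72-GPU rack count comes from the NVL72 product configuration rather than from the figures quoted here, and the implied rack count is an inference from the stated totals, not an official specification.

```python
# Back-of-envelope arithmetic from the announced figures.
superpod_fp4_exaflops = 11.5   # DGX SuperPOD total, FP4
nvl72_fp4_exaflops = 1.4       # GB200 NVL72 AI inference, per rack
gpus_per_nvl72 = 72            # Blackwell GPUs in one NVL72 rack

# Racks needed to reach the SuperPOD total, and throughput per GPU.
racks = superpod_fp4_exaflops / nvl72_fp4_exaflops
per_gpu_petaflops = nvl72_fp4_exaflops * 1_000 / gpus_per_nvl72

print(round(racks, 1), round(per_gpu_petaflops, 1))  # 8.2 19.4
```

In other words, the 11.5-exaFLOPS SuperPOD corresponds to roughly eight GB200 NVL72 racks, each GPU contributing on the order of 19 petaFLOPS of FP4 inference.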