Page 48 - EE Times Europe Magazine – June 2024
Cineca’s Bassini on the Secrets of Leonardo’s Supercomputing Power
The motivation for this upgrade is driven
by an increasing demand for resources to
train LLMs [large language models]. We are
currently working on two big projects. One is
for a text-based LLM for Italian; the other is
for a multimodal LLM for Italian, which would
include text, sounds and images. Leonardo
has also been used to train LLMs for other European
languages, for example Mistral, the
French-language LLM.
We also have requests to train domain-
specific LLMs. One is for weather forecasting,
based on radar images. Another is for an LLM
related to the “Beyond 1 Million Genomes”
[B1MG] project, which is a European
initiative.
All these projects—the general LLMs and
the more domain-specific ones—require
very, very significant resources, including both compute and data storage and transfer.
We think that one way of improving how we accommodate AI applications is by making it easier to access huge amounts of data. To that end, we plan to make available a large data lake repository based on multiprotocol S3 technology both for training and inference phases.
The third partition, the one designed for AI, will be fully integrated with the bigger system. We will also integrate quantum computing technology.

EE TIMES EUROPE: What are some of the applications Leonardo runs now?
Bassini: It’s easy to get a broad overview because we show the distribution of the workload in our annual report. The big application areas we reported for the year 2023 are condensed-matter physics, computational chemistry, computational fluid dynamics, nuclear fusion, computational engineering, astro and plasma physics, earth and climate, life sciences and computational biology, life sciences and bioinformatics, particle physics, and AI and machine learning.
Some of the life sciences applications support the idea of personalized and precision medicine and require a lot of data, some of which is highly personal. To protect data privacy, very high standards of security are needed. Cineca created a working environment that complies with standards for information security management in full compliance with GDPR [General Data Protection Regulation].
Some of the recent use cases involved analyzing the suitability of land for vegetation. This is particularly important in the Piedmont region of Italy, where agriculture makes up a large portion of the economy. With climate change, policymakers and scientists need a new set of tools to make predictions. This kind of analysis requires the power of a supercomputer.
Then we’ve had users who wanted to assess agricultural risk associated with climate change for insurance purposes. This analysis required big data and artificial intelligence to find patterns that could be used to make predictions. A lot of data and a lot of computing power are needed.
Many of the scientific users want to run sophisticated simulations. For example, some have used Leonardo to simulate the behavior of new materials that might be used to improve computers. Another example is to run simulations of the behavior of black hole binaries in astrophysics; there is also a big need to run simulations to predict plasma behavior for nuclear fusion.
Of course, AI is a big use case for several different domains, including automotive. Both training and inference require huge amounts of data and computing power and will be a big part of what drives the evolution of Leonardo.

EE TIMES EUROPE: Can you tell us more about the planned upgrades and the evolution of Leonardo?
Bassini: One of the first things we plan to do is connect Leonardo with the two S3 data lake repositories we plan to set up, one in Bologna and the other in Naples. Since we’re embracing AI for science and innovation, we need a huge amount of data, and we do need to provide easy access to the data. Leonardo has 100 petabytes [PB] of scratch storage capacity. We want to provide a permanent home for data that can be used for different applications. We are planning at least 100 PB of data lake repository, multiprotocol S3, for managing data of any kind.
Another step will be to implement what we call an AI factory, a system that will support AI workloads. The idea is that a cluster of SMPs [shared-memory multiprocessors] would be better than a cluster of the server nodes we use under the current architecture. By the way, Leonardo was ahead of its time when it was built. It was the first system designed with four GPUs per server node. Before [Leonardo], the maximum was two GPUs per server node. We designed it with Atos, the system integrator, and Nvidia.
AMD technology has two core GPUs per socket, which means that two sockets would have four GPUs. But it’s different from the point of view of data movement and access to shared memory. So it’s very effective for LINPACK but not for AI, because data movement becomes a very big issue with AI.
We are beginning to think about the design of a post-Leonardo system. And we are of the mind that this cluster of SMP GPU nodes would be a good foundation. We would want to go beyond the eight GPUs currently available from Nvidia, for example, toward maybe a cluster of 16 GPUs per server node, something like that. Of course, that would require us to overcome the challenges with respect to memory storage layers. We would probably need HBM [high-bandwidth memory] capacity and eventually even some CXL [Compute Express Link] kind of capacity.
The new system would add to Leonardo’s current design: the general-purpose partition and the booster partition. We might complement those partitions with an AI-LLM partition and an inference service partition, for example.
As I mentioned before, we also plan to connect quantum computers to Leonardo. In fact, we plan to integrate two quantum computers. One is an educational system, which will be installed soon. This will be around 10 qubits, and the procurement is in progress. The idea is that the technology for the educational system will be based mostly on superconducting qubits.
The other will be a production quantum computer, which will be co-funded by EuroHPC. The total investment will be around €200 million. The procurement will be open soon. The idea is to use neutral atom technology. ■