Page 48 - EE Times Europe Magazine – June 2024

        Cineca’s Bassini on the Secrets of Leonardo’s Supercomputing Power


          This upgrade is driven by an increasing
        demand for resources to train LLMs [large
        language models]. We are currently working
        on two big projects. One is for a text-based
        LLM for Italian; the other is for a multimodal
        LLM for Italian, which would include text,
        sounds and images. Leonardo has also been
        used to train LLMs for other European
        languages, for example, Mistral, the
        French-language LLM.
          We also have requests to train domain-
        specific LLMs. One is for weather forecasting,
        based on radar images. Another is for an LLM
        related to the “Beyond 1 Million Genomes”
        [B1MG] project, which is a European
        initiative.
          All these projects—the general LLMs and
        the more domain-specific ones—require
        very, very significant resources, including both compute and data
        storage and transfer. We think that one way of improving how we
        accommodate AI applications is by making it easier to access huge
        amounts of data. To that end, we plan to make available a large data
        lake repository based on multiprotocol S3 technology for both the
        training and inference phases.
          The third partition, the one designed for AI, will be fully
        integrated with the bigger system. We will also integrate quantum
        computing technology.

        EE TIMES EUROPE: What are some of the applications Leonardo runs now?
        Bassini: It’s easy to get a broad overview because we show the
        distribution of the workload in our annual report. The big application
        areas we reported for the year 2023 are condensed-matter physics,
        computational chemistry, computational fluid dynamics, nuclear fusion,
        computational engineering, astro and plasma physics, earth and climate,
        life sciences and computational biology, life sciences and
        bioinformatics, particle physics, and AI and machine learning.
          Some of the life sciences applications support the idea of
        personalized and precision medicine and require a lot of data, some of
        which is highly personal. To protect data privacy, very high standards
        of security are needed. Cineca created a working environment that meets
        standards for information security management and fully complies with
        GDPR [General Data Protection Regulation].
          Some of the recent use cases involved analyzing the suitability of
        land for vegetation. This is particularly important in the Piedmont
        region of Italy, where agriculture makes up a large portion of the
        economy. With climate change, policymakers and scientists need a new
        set of tools to make predictions. This kind of analysis requires the
        power of a supercomputer.
          Then we’ve had users who wanted to assess agricultural risk
        associated with climate change for insurance purposes. This analysis
        required big data and artificial intelligence to find patterns that
        could be used to make predictions. A lot of data and a lot of computing
        power are needed.
          Many of the scientific users want to run sophisticated simulations.
        For example, some have used Leonardo to simulate the behavior of new
        materials that might be used to improve computers. Another example is
        to run simulations of the behavior of black hole binaries in
        astrophysics; there is also a big need to run simulations to predict
        plasma behavior for nuclear fusion.
          Of course, AI is a big use case for several different domains,
        including automotive. Both training and inference require huge amounts
        of data and computing power and will be a big part of what drives the
        evolution of Leonardo.

        EE TIMES EUROPE: Can you tell us more about the planned upgrades and
        the evolution of Leonardo?
        Bassini: One of the first things we plan to do is connect Leonardo with
        the two S3 data lake repositories we plan to set up: one in Bologna and
        the other in Naples. Since we’re embracing AI for science and
        innovation, we need a huge amount of data, and we do need to provide
        easy access to the data. Leonardo has 100 petabytes [PB] of scratch
        storage capacity. We want to provide a permanent home for data that can
        be used for different applications. We are planning at least 100 PB of
        data lake repository, multiprotocol S3, for managing data of any kind.
          Another step will be to implement what we call an AI factory, a
        system that will support AI workloads. The idea is that a cluster of
        SMPs [shared-memory multiprocessors] would be better than a cluster of
        the server nodes we use under the current architecture. By the way,
        Leonardo was ahead of its time when it was built. It was the first
        system designed with four GPUs per server node. Before [Leonardo], the
        maximum was two GPUs per server node. We designed it with Atos, the
        system integrator, and Nvidia.
          AMD technology has two core GPUs per socket, which means that two
        sockets would have four GPUs. But it’s different from the point of view
        of data movement and access to shared memory. So it’s very effective
        for LINPACK but not for AI, because data movement becomes a very big
        issue with AI.
          We are beginning to think about the design of a post-Leonardo system.
        And we are of the mind that this cluster of SMP GPU nodes would be a
        good foundation. We would want to go beyond the eight GPUs currently
        available from Nvidia, for example, toward maybe a cluster of 16 GPUs
        per server node, something like that. Of course, that would require us
        to overcome the challenges with respect to memory storage layers. We
        would probably need HBM [high-bandwidth memory] capacity and eventually
        even some CXL [Compute Express Link] kind of capacity.
          The new system would add to Leonardo’s current design: the
        general-purpose partition and the booster partition. We might
        complement those partitions with an AI-LLM partition and an inference
        service partition, for example.
          As I mentioned before, we also plan to connect quantum computers to
        Leonardo. In fact, we plan to integrate two quantum computers. One is
        an educational system, which will be installed soon. This will be
        around 10 qubits, and the procurement is in progress. The idea is that
        the technology for the educational system will be based mostly on
        superconducting qubits.
          The other will be a production quantum computer, which will be
        co-funded by EuroHPC. The total investment will be around €200 million.
        The procurement will be open soon. The idea is to use neutral atom
        technology. ■
