Page 50 - EE Times Europe Magazine – June 2024

        BSC’s Girona on MareNostrum’s Evolution and Delivering World-Class Compute from Barcelona

problems. More services are needed for the operation of MareNostrum 5, including refrigeration and electrical transformers. These new services occupy almost 3× as much space as they did in MareNostrum 4: about 2,000 square meters.

By the way, BSC does much more than just MareNostrum. It’s a research and service center, with more than 1,000 people working in different domains, including computer science, life sciences, Earth sciences and engineering. We created the supercomputer within BSC not only to provide access to the users at the Spanish and European level but also to support the development of special applications that have a worldwide impact. BSC is a joint public consortium, which includes the Polytechnical University of Catalonia, the Catalan government and the Spanish Ministry of Science, Innovation and Universities.

EE TIMES EUROPE: What are the main architectural features of MareNostrum 5?

Girona: For MareNostrum 5, we decided to continue with what we started in MareNostrum 4, which is to base the system on several connected clusters. No single architecture solves all user problems, and this concept gives us the flexibility we need to address a range of application domains.

But while MareNostrum 4 was designed mostly for general-purpose processing, MareNostrum 5 is a bigger system with more capacity to support the most intensive processing. At the same time, it continues to serve commodity requirements. It has a general-purpose partition, which consists of 90 racks with 72 nodes. Each node has two sockets of Sapphire Rapids 56 cores. And each of those 90 racks uses about 60 kW. We made the decision to continue using a significant part of our investment in money, power and space for this commodity cluster, because there is still a big demand to run legacy code that requires these processors.

But we have a smaller partition in terms of physical size, which consists of 35 racks, with four Nvidia Hoppers [H100s] in each node. Each rack uses about 80 kW. So the general-purpose partition uses a total of about 4.5 MW, and the accelerated partition uses about 2.5 MW. The compute capacity is significantly different between the two partitions.

MareNostrum 5’s general-purpose partition is the world’s largest based on the popular x86 computing architecture, with a peak performance of 45.4 petaFLOPS. The accelerated partition, the third most powerful in Europe and eighth in the world, has a peak performance of 260 petaFLOPS. It has 4,480 state-of-the-art Nvidia Hopper100 chips, each about 8 cm² in size. To give you a sense of how far we’ve come, each of those chips is more powerful than the entire MareNostrum 1 installed in 2004, which occupied the entire 180-square-meter Torre Girona chapel and was the fourth-most-powerful in the world.

EE TIMES EUROPE: How do you decide which applications to run, and what are some of those applications?

Girona: The total cost of ownership for MareNostrum 5 is about €205 million over a five-year period. This includes both the initial investment [capex] and the cost of operations [opex]. EuroHPC contributes 50% of the total costs, and Spain, Portugal and Turkey together contribute the remaining 50%. Access time is allocated based on financial contribution, so EuroHPC decides on half the access time, and Spain, Portugal and Turkey decide on the other half.

In each case, an access committee is used. People submit applications, which are ranked and evaluated for approval. The system is mostly devoted to open science, but there is remaining capacity for industrial applications. So EuroHPC has an access committee; as for Spain, Turkey and Portugal, they each have their own.

In the case of Spain, decisions are made based on external advice. It’s not the center that makes access decisions, but external users who do peer reviews to decide on access time. We let the scientists guide us on how MareNostrum is used. But we do make sure it is not oversubscribed, so people who are granted time don’t have to wait.

The system is huge, so we don’t have to limit the use to one domain at the expense of others. For example, it’s used for large language models, as well as biomedical, energy, industrial and automotive applications. But

        MareNostrum 4 supercomputer (Source: BSC-CNS)
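
The partition figures Girona quotes can be cross-checked with some simple arithmetic. The sketch below is only a back-of-envelope calculation using the rounded numbers from the interview. It assumes that "90 racks with 72 nodes" means 72 nodes per rack and that every rack draws the quoted per-rack power. The variable and function names are illustrative, not BSC's.

```python
# Back-of-envelope check of the partition figures quoted in the interview.
# All inputs are the rounded numbers Girona gives; nothing here is measured.

def partition_power_mw(racks: int, kw_per_rack: float) -> float:
    """Rack count times per-rack draw, converted from kW to MW."""
    return racks * kw_per_rack / 1000.0

# General-purpose partition: 90 racks, assumed 72 nodes per rack,
# two sockets per node, 56 cores per socket.
gpp_nodes = 90 * 72
gpp_cores = gpp_nodes * 2 * 56
gpp_power_mw = partition_power_mw(90, 60)

# Accelerated partition: 35 racks, 4,480 GPUs, four GPUs per node.
acc_nodes = 4480 // 4
acc_nodes_per_rack = acc_nodes / 35
acc_power_mw = partition_power_mw(35, 80)

# Peak performance per GPU implied by 260 petaFLOPS across 4,480 chips
# (260 petaFLOPS = 260,000 teraFLOPS).
tflops_per_gpu = 260_000 / 4480

print(f"General-purpose: {gpp_nodes} nodes, {gpp_cores} cores, ~{gpp_power_mw:.1f} MW")
print(f"Accelerated: {acc_nodes} nodes, {acc_nodes_per_rack:.0f} per rack, ~{acc_power_mw:.1f} MW")
print(f"Implied peak per GPU: ~{tflops_per_gpu:.0f} teraFLOPS")
```

Under these assumptions the rack-level products come to roughly 5.4 MW and 2.8 MW, somewhat above the "about 4.5 MW" and "about 2.5 MW" totals quoted. The interview does not explain the gap; the per-rack figures are approximate and may describe a maximum rather than a typical draw.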
