Page 50 - EE Times Europe Magazine – June 2024
BSC’s Girona on MareNostrum’s Evolution and Delivering World-Class Compute from Barcelona
problems. More services are needed for the operation of MareNostrum 5, including refrigeration and electrical transformers. These new services occupy almost 3× as much space as they did in MareNostrum 4: about 2,000 square meters.

By the way, BSC does much more than just MareNostrum. It’s a research and service center, with more than 1,000 people working in different domains, including computer science, life sciences, Earth sciences and engineering. We created the supercomputer within BSC not only to provide access to the users at the Spanish and European level but also to support the development of special applications that have a worldwide impact.

BSC is a joint public consortium, which includes the Polytechnical University of Catalonia, the Catalan government and the Spanish Ministry of Science, Innovation and Universities.

EE TIMES EUROPE: What are the main architectural features of MareNostrum 5?

Girona: For MareNostrum 5, we decided to continue with what we started in MareNostrum 4, which is to base the system on several connected clusters. No single architecture solves all user problems, and this concept gives us the flexibility we need to address a range of application domains.

But while MareNostrum 4 was designed mostly for general-purpose processing, MareNostrum 5 is a bigger system with more capacity to support the most intensive processing. At the same time, it continues to serve commodity requirements. It has a general-purpose partition, which consists of 90 racks with 72 nodes each. Each node has two 56-core Sapphire Rapids sockets. And each of those 90 racks uses about 60 kW.

We made the decision to continue using a significant part of our investment in money, power and space for this commodity cluster, because there is still a big demand to run legacy code that requires these processors.

But we have a smaller partition in terms of physical size, which consists of 35 racks, with four Nvidia Hoppers [H100s] in each node. Each rack uses about 80 kW. So the general-purpose partition uses a total of about 4.5 MW, and the accelerated partition uses about 2.5 MW. The compute capacity is significantly different between the two partitions.

MareNostrum 5’s general-purpose partition is the world’s largest based on the popular x86 computing architecture, with a peak performance of 45.4 petaFLOPS. The accelerated partition, the third most powerful in Europe and eighth in the world, has a peak performance of 260 petaFLOPS. It has 4,480 state-of-the-art Nvidia Hopper H100 chips, each about 8 cm² in size. To give you a sense of how far we’ve come, each of those chips is more powerful than the entire MareNostrum 1 installed in 2004, which occupied the entire 180-square-meter Torre Girona chapel and was the fourth most powerful in the world.

EE TIMES EUROPE: How do you decide which applications to run, and what are some of those applications?

Girona: The total cost of ownership for MareNostrum 5 is about €205 million over a five-year period. This includes both the initial investment [capex] and the cost of operations [opex]. EuroHPC contributes 50% of the total costs, and Spain, Portugal and Turkey together contribute the remaining 50%. Access time is allocated based on financial contribution, so EuroHPC decides on half the access time, and Spain, Portugal and Turkey decide on the other half.

In each case, an access committee is used. People submit applications, which are ranked and evaluated for approval. The system is mostly devoted to open science, but there is remaining capacity for industrial applications. So EuroHPC has an access committee; as for Spain, Turkey and Portugal, they each have their own.

In the case of Spain, decisions are made based on external advice. It’s not the center that makes access decisions, but external users who do peer reviews to decide on access time. We let the scientists guide us on how MareNostrum is used. But we do make sure it is not oversubscribed, so people who are granted time don’t have to wait.

The system is huge, so we don’t have to limit the use to one domain at the expense of others. For example, it’s used for large language models, as well as biomedical, energy, industrial and automotive applications. But
MareNostrum 4 supercomputer (Source: BSC-CNS)
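
As a rough cross-check of the figures quoted in the interview, the sketch below derives node counts, core counts, per-partition power and the EuroHPC share of the cost. It is a back-of-the-envelope calculation, not an official specification. It reads "90 racks with 72 nodes" as 72 nodes per rack, and the constant and function names are made up for illustration. The per-rack power values are "about" figures, so their products need not match the partition totals quoted in the text.

```python
# Figures as quoted in the interview; all are approximate.
GP_RACKS = 90              # general-purpose partition
GP_NODES_PER_RACK = 72     # assumption: "72 nodes" means per rack
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 56      # Sapphire Rapids
GP_KW_PER_RACK = 60
GP_QUOTED_MW = 4.5
GP_PEAK_PFLOPS = 45.4

ACC_RACKS = 35             # accelerated partition
ACC_GPUS = 4480            # Nvidia H100
GPUS_PER_NODE = 4
ACC_KW_PER_RACK = 80
ACC_QUOTED_MW = 2.5
ACC_PEAK_PFLOPS = 260

TCO_MEUR = 205             # five-year total cost of ownership, million EUR
EUROHPC_SHARE = 0.5


def derive():
    """Return quantities implied by the quoted figures (hypothetical helper)."""
    gp_nodes = GP_RACKS * GP_NODES_PER_RACK
    acc_nodes = ACC_GPUS // GPUS_PER_NODE
    return {
        "gp_nodes": gp_nodes,
        "gp_cores": gp_nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET,
        # Sum of per-rack power, to compare with the quoted partition total.
        "gp_rack_sum_mw": GP_RACKS * GP_KW_PER_RACK / 1000,
        "acc_nodes": acc_nodes,
        "acc_nodes_per_rack": acc_nodes / ACC_RACKS,
        "acc_rack_sum_mw": ACC_RACKS * ACC_KW_PER_RACK / 1000,
        # Peak performance per megawatt, using the quoted partition totals.
        "gp_pflops_per_mw": GP_PEAK_PFLOPS / GP_QUOTED_MW,
        "acc_pflops_per_mw": ACC_PEAK_PFLOPS / ACC_QUOTED_MW,
        "eurohpc_meur": TCO_MEUR * EUROHPC_SHARE,
    }


if __name__ == "__main__":
    for name, value in derive().items():
        print(f"{name}: {value}")
```

On these numbers, the accelerated partition delivers roughly ten times the peak performance per megawatt of the general-purpose partition, which is consistent with the remark that compute capacity differs significantly between the two.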
JUNE 2024 | www.eetimes.eu