Page 45 - EE Times Europe Magazine – June 2024
SUPERCOMPUTERS
Lumi: CSC’s Manninen on Managing Europe’s
Biggest Supercomputer—and AI’s Expectations
By Pat Brans
Housed in the Advanced Computing Facility at CSC’s IT Center for Science in Finland, Lumi has been in operation since 2021 and reached its full capacity in 2023. According to the most recent edition of the Top500 list, it’s now the fifth-largest supercomputer in the world and the biggest in Europe.
To find out more about the creation of Europe’s fastest supercomputer, its management and the applications it runs, we spoke to Pekka Manninen, who, as director of science and technology at the Advanced Computing Facility, is responsible for the underlying technology.
EE TIMES EUROPE: What is the story behind Lumi, and how did it get its name?
Pekka Manninen: Obviously, many people were involved in the decision-making and preparation in many countries and at many levels. But if I had to name a single person, it would be CSC’s managing director, Kimmo Koski, who has relentlessly advocated for pan-European investments in HPC [high-performance computing], starting well before the European Commission introduced plans for a European HPC infrastructure in the form of the Europe HPC Joint Undertaking [EuroHPC JU].
Once EuroHPC JU was launched, Kimmo harnessed the interest of our organization and of our partners. We all got behind the idea of proposing CSC’s data center in the city of Kajaani, which was already the home of Finnish supercomputers.
Back then, there was a consortium of nine countries that supported the idea and agreed to contribute funding. [These were Finland, Denmark, Estonia, Sweden, Norway, Belgium, Czech Republic, Switzerland and Poland.] In Finland, the key contributors were the Ministry of Education and Culture and the Ministry of Economic Affairs and Employment, which both helped with funding. Thanks to the efforts of all these groups, the EuroHPC JU granted the Hosting Agreement to the Lumi consortium. Two other countries [Iceland and the Netherlands] have since joined the consortium, which pays half of the total cost of over €202 million. The EU contributes the other half.
As for the name Lumi, I was actually the one who came up with that. It means “snow” in Finnish. I thought a white supercomputer would look cool, and then figured that snow would be a strong reference to northern Europe and white at the same time. After we named it Lumi, it was retrofitted with a slightly clumsy “backronym”: Large Unified Modern Infrastructure.
Lumi is housed in what used to be a paper mill, which was owned by UPM, one of the biggest pulp and paper producers in the world. UPM decided to close the plant in 2008 because of a worldwide overproduction of paper. Because the paper mill required over 230 MW of energy, the power capacity we needed was already there when we moved in.
By the way, the local district heating plant is in the same area, so we push excess heat to that system. The result is that Lumi heats 15% to 20% of Kajaani’s homes, and at certain times, such as during the heating plant’s maintenance breaks, we can heat the whole city.
CSC Finland operates the facility and takes care of system administration. Keeping the data center open requires only around 10 people. Then there are a lot of other people doing other work to ensure the usefulness and productivity of the supercomputer. Our consortium countries run user support, and special interest and collaboration groups address things like public relations, AI and cybersecurity.
EE TIMES EUROPE: What are the main architectural features of Lumi, and how does it compare with other world-class supercomputers?
Manninen: Lumi consists of around 3,000 GPU nodes, each with four GPUs, and
around 2,000 CPU nodes, with two CPUs each. Its sustained compute capability toward a single binary is around 380 petaFLOPS at fp64 precision. Due to ambitious rack engineering, it’s possible to condense this and all the necessary storage into a rather compact space, which in terms of square meters is around the size of two tennis courts.
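As a rough back-of-the-envelope check on these figures, the short Python sketch below multiplies out the approximate node counts quoted in the interview. All inputs are the rounded "around" values from the answer above. Attributing the whole sustained fp64 figure to the GPU partition is an assumption made here for illustration only, and the function name is hypothetical.

```python
# Approximate figures quoted in the interview (all are "around" values).
GPU_NODES = 3_000
GPUS_PER_NODE = 4
CPU_NODES = 2_000
CPUS_PER_NODE = 2
SUSTAINED_PFLOPS = 380  # sustained fp64 throughput on a single binary


def lumi_totals():
    """Return (GPUs, CPUs in the CPU partition, sustained TFLOPS per GPU)."""
    total_gpus = GPU_NODES * GPUS_PER_NODE
    total_cpus = CPU_NODES * CPUS_PER_NODE
    # Assumption: all sustained throughput is credited to the GPU partition.
    tflops_per_gpu = SUSTAINED_PFLOPS * 1_000 / total_gpus
    return total_gpus, total_cpus, tflops_per_gpu


if __name__ == "__main__":
    gpus, cpus, per_gpu = lumi_totals()
    print(f"GPUs: ~{gpus:,}")
    print(f"CPUs in CPU partition: ~{cpus:,}")
    print(f"Sustained fp64 per GPU: ~{per_gpu:.1f} TFLOPS")
```

Under these rounded inputs, the system has on the order of 12,000 GPUs and 4,000 CPUs in the CPU partition. The per-GPU number is only as good as the assumption behind it.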
Lumi supercomputer (Source: Fade Creative)
Lumi uses all-AMD node technology, with AMD MI250X GPUs and AMD Milan 64-core CPUs. Both computing partitions and all the auxiliary resources (such as data analytics partitions plus all the storage and data management solutions) are tightly interconnected in the same HPE Slingshot network. Slingshot is based on high-radix switches, which enable exascale and hyperscale data center networks with at most three switch-to-switch hops. It uses an optimized Ethernet protocol, which allows it to interoperate with standard Ethernet devices while providing high performance to HPC applications.
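To illustrate how a network can keep every path to at most three switch-to-switch hops, here is a minimal Python sketch of a toy dragonfly-style topology. The interview does not name the topology, so using a dragonfly layout here is an assumption, and the sizes (4 switches per group, 2 global links per switch) are made up and far smaller than any real system. Switches within a group are fully connected, and every pair of groups shares one global link, so a worst-case route is local, global, local.

```python
from collections import deque
from itertools import combinations


def build_dragonfly(a, h):
    """Toy dragonfly: a switches per group, h global links per switch,
    and g = a*h + 1 groups. Returns {switch: set(neighbor switches)}."""
    g = a * h + 1
    adj = {(grp, sw): set() for grp in range(g) for sw in range(a)}
    # Local links: all-to-all inside each group.
    for grp in range(g):
        for s, t in combinations(range(a), 2):
            adj[(grp, s)].add((grp, t))
            adj[(grp, t)].add((grp, s))
    # Global links: exactly one between every pair of groups (i < j).
    for i, j in combinations(range(g), 2):
        u = (i, (j - 1) // h)  # switch in group i that owns the link to j
        v = (j, i // h)        # switch in group j that owns the link to i
        adj[u].add(v)
        adj[v].add(u)
    return adj


def hops_from(adj, src):
    """Breadth-first search: hop count from src to every other switch."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter(adj):
    """Largest shortest-path hop count between any two switches."""
    return max(max(hops_from(adj, s).values()) for s in adj)


if __name__ == "__main__":
    net = build_dragonfly(a=4, h=2)
    print(len(net), "switches, worst case", diameter(net), "hops")
```

The higher the switch radix, the more groups and switches fit under the same three-hop bound, which is why high-radix switches matter for networks of this scale.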