Lenovo expands HPC system at LRZ, integrating AI for Phase Two of its supercomputer

22/06/2021

Lenovo is expanding its high-performance computing (HPC) work with the Leibniz Supercomputing Centre (LRZ) by integrating artificial intelligence (AI) into Phase Two of the centre’s next-generation supercomputer. The system will deliver high-performance integrated solutions to the LRZ user community, using artificial intelligence to support advanced simulations, modelling, and data analysis that will accelerate research to help solve humanity’s greatest challenges.

With funding from the Free State of Bavaria and the German Federal Ministry of Education and Research, Phase Two expands SuperMUC-NG, part of the Gauss Centre for Supercomputing (GCS), so that it remains one of the fastest, most energy-efficient supercomputers in the world.

The Leibniz Supercomputing Centre (LRZ) is a world-class IT service and computing user facility serving Munich’s top universities as well as research institutions in Bavaria, Germany and Europe. As an institute of the Bavarian Academy of Sciences and Humanities, LRZ has provided a robust, holistic IT infrastructure for users throughout the scientific community for nearly sixty years.

AI enabling integrated solutions

Since SuperMUC-NG Phase One was launched, practitioners have used the supercomputer not only for traditional simulation and modelling, but also to automate image and pattern recognition in planet observations, climate data from satellites, medical visuals and health records, and demographic data. Following the successful use of SuperMUC-NG in these projects, demand for high-performance data analytics, machine learning and fast memory performance has increased further.

To meet these demands and ensure researchers are supported, SuperMUC-NG will now be enhanced with next-generation Intel Xeon Scalable processors (codenamed “Sapphire Rapids”) and Intel’s upcoming HPC GPUs based on the Xe HPC architecture (codenamed “Ponte Vecchio”).

Phase Two will also use distributed asynchronous object storage (DAOS), leveraging 3rd Gen Intel Xeon Scalable processors (codenamed “Ice Lake”) integrated into Lenovo’s ThinkSystem SR630 V2 platform. DAOS provides 1 petabyte of data storage and will enable fast throughput of large data volumes, while the system architecture can support highly scalable compute- and data-intensive workloads and artificial intelligence applications. Overall, the SuperMUC-NG Phase Two compute nodes will deliver four times higher performance per watt, as measured by High Performance Linpack, than Phase One.

“The Leibniz Supercomputing Centre has long been an important innovation partner for both Lenovo and Intel. Phase Two is an exciting opportunity to share our expertise in what Lenovo calls ‘Exascale for Everyscale’, solutions using advanced exascale technologies in any size cluster, and provide researchers with the specialist resources needed to accelerate projects,” explains Scott Tease, Vice President, HPC and AI, Lenovo Infrastructure Solutions Group.

Tease added: “Through the implementation of our Neptune™ warm-water cooling and a smarter integrated system for artificial intelligence and deep learning, LRZ can continue to be a thought leader in advanced technologies for many years to come, and set new standards for research and development.”

Ensuring a sustainable approach

The enhancements made in Phase Two will ensure SuperMUC-NG can perform additional tasks as energy-efficiently as possible. The key to this is the integration of 240 Intel compute nodes into Lenovo’s ThinkSystem SD650, leveraging Neptune™ warm-water cooling and connected to the DAOS storage system via a high-speed network. Lenovo’s Neptune™ direct water-cooling technology removes approximately 90% of the heat from the computer system, reducing overall energy consumption, significantly increasing overall efficiency and ultimately allowing the processors to perform at their peak.

In addition, the components for SuperMUC-NG Phase Two will be manufactured within Europe, in Lenovo’s new dedicated manufacturing facility in Hungary, to help further improve the eco-footprint of the project’s supply chain.

“Delivering resources and services that empower researchers to accelerate their projects is at the heart of everything we do at LRZ,” says Prof. Dr. Dieter Kranzlmüller, Director of the LRZ.

He added: “Our work with Lenovo and other partners to integrate advanced AI capabilities into this next phase will help the centre better achieve this, and ensure researchers are given what they need to excel in their scientific fields. Not only that, but with Lenovo’s warm-water cooling technology we’re able to deliver these enhancements in a way that’s as sustainable and energy-efficient as possible.”

Phase Two kick-off

LRZ will receive the DAOS storage system in the last quarter of 2021, and the computer system will follow in the second quarter of 2022. The LRZ team are preparing their user community for Phase Two’s enhancements by offering support and consultations for adapting and optimising codes and AI algorithms, and by giving researchers access to GPU systems specialised in AI applications. The LRZ training programme also offers a wide variety of machine learning and deep learning courses, teaching users how to adapt existing algorithms or develop and train their own.