Japanese system powered by ARM CPUs is the world’s fastest supercomputer
The 55th edition of the TOP500 has been published with the top spot now taken by an Arm powered Japanese system Fugaku. This represents a huge change in the computational power available at RIKEN for its research community and its partners.
The latest rankings also reflect steady growth in aggregate performance and power efficiency across the TOP500 list.
Fugaku turned in a High Performance Linpack (HPL) result of 415.5 petaflops, besting the now second-place Summit system by a factor of 2.8. Fugaku is powered by Fujitsu’s 48-core A64FX SoC, becoming the first number one system on the list to be powered by ARM processors.
In single or further reduced precision, which is often used in machine learning and AI applications, Fugaku’s peak performance is more than 1 exaflops. The new system is installed at RIKEN Center for Computational Science (R-CCS) in Kobe, Japan.
Number two on the list is Summit, an IBM-built supercomputer that delivers 148.8 petaflops on HPL. The system has 4,356 nodes, each equipped with two 22-core Power9 CPUs, and six NVIDIA Tesla V100 GPUs. The nodes are connected with a Mellanox dual-rail EDR InfiniBand network. Summit is running at Oak Ridge National Laboratory (ORNL) in Tennessee and remains the fastest supercomputer in the US.
At number three is Sierra, a system at the Lawrence Livermore National Laboratory (LLNL) in California achieving 94.6 petaflops on HPL. Its architecture is very similar to Summit, equipped with two Power9 CPUs and four NVIDIA Tesla V100 GPUs in each of its 4,320 nodes. Sierra employs the same Mellanox EDR InfiniBand as the system interconnect.
Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC) drops to number four on the list. The system is powered entirely by Sunway 260-core SW26010 processors. Its HPL mark of 93 petaflops has remained unchanged since it was installed at the National Supercomputing Center in Wuxi, China in June 2016.
At number five is Tianhe-2A (Milky Way-2A), a system developed by China’s National University of Defense Technology (NUDT). Its HPL performance of 61.4 petaflops is the result of a hybrid architecture employing Intel Xeon CPUs and custom-built Matrix-2000 coprocessors. It is deployed at the National Supercomputer Center in Guangzhou, China.
A new system on the list, HPC5, captured the number six spot, turning in an HPL performance of 35.5 petaflops. HPC5 is a PowerEdge system built by Dell and installed by the Italian energy firm Eni S.p.A, making it the fastest supercomputer in Europe. It is powered by Intel Xeon Gold processors and NVIDIA Tesla V100 GPUs and uses Mellanox HDR InfiniBand as the system network.
Another new system, Selene, is in the number seven spot with an HPL mark of 27.58 petaflops. It is a DGX SuperPOD, powered by NVIDIA’s new ‘Ampere’ A100 GPUs and AMD’s EPYC ‘Rome’ CPUs. Selene is installed at NVIDIA in the US.
Frontera, a Dell C6420 system installed at the Texas Advanced Computing Center (TACC) in the US is ranked eighth on the list. Its 23.5 HPL petaflops is achieved with 448,448 Intel Xeon cores.
The second Italian system in the top 10 is Marconi-100, which is installed at the CINECA research centre. It is powered by IBM Power9 processors and NVIDIA V100 GPUs, employing dual-rail Mellanox EDR InfiniBand as the system network. Marconi-100’s 21.6 petaflops earned it the number nine spot on the list.
Rounding out the top 10 is Piz Daint at 19.6 petaflops, a Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland. It is equipped with Intel Xeon processors and NVIDIA P100 GPUs.
Aggregate list performance is now 2.23 exaflops, up from 1.65 exaflops six months ago. The majority of that increase is the result of the new number one Fugaku supercomputer. The new entry point on the list (system number 500) is 1.24 petaflops, only a slight increase from the previous list. Overall the number of new systems in the list is only 51, a record low since the beginning of the TOP500 in 1993.
China continues to dominate the TOP500 with regard to system count, claiming 226 supercomputers on the list. The US is number two with 114 systems; Japan is third with 30; France has 18; and Germany claims 16. Despite coming in second on system count, the US continues to edge out China in aggregate list performance with 644 petaflops to China’s 565 petaflops. Japan, with its significantly smaller system count, delivers 530 petaflops.
A total of 144 systems on the list are using accelerators or coprocessors, which is nearly the same as the 145 reported six months ago. As has been the case in the past, the majority of the systems equipped with accelerator/coprocessors (135) are using NVIDIA GPUs.
The x86 continues to be the dominant processor architecture, being present in 481 of the 500 systems. Intel claims 469 of these, with AMD installed in 11 and Hygon in the remaining one. Arm processors are present in just four TOP500 systems, three of which employ the new Fujitsu A64FX processor, with the remaining one powered by Marvell’s ThunderX2 processor.
The breakdown of system interconnect share is largely unchanged from six months ago. Ethernet is used in 263 systems, InfiniBand is used in 150, and the remainder employ custom or proprietary networks. Despite Ethernet’s dominance in sheer numbers, those systems account for 471 petaflops, while InfiniBand-based systems provide 803 petaflops. Due to their use in some of the list’s most powerful supercomputers, systems with custom and proprietary interconnects together represent 790 petaflops.
The most energy-efficient system on the Green500 is the MN-3, based on a new server from Preferred Networks. It achieved a record 21.1 gigaflops/watt during its 1.62 petaflops performance run. The system derives its superior power efficiency from the MN-Core chip, an accelerator optimized for matrix arithmetic. It is ranked number 395 in the TOP500 list.
In second position is the new NVIDIA Selene supercomputer, a DGX A100 SuperPOD powered by the new A100 GPUs. It occupies position seven on the TOP500.
In third position is the NA-1 system, a PEZY Computing/Exascaler system installed at NA Simulation in Japan. It achieved 18.4 gigaflops/watt and is at position 470 on the TOP500.
The number nine system on the Green500 is the top-performing Fugaku supercomputer, which delivered 14.67 gigaflops per watt. It is just behind Summit in power efficiency, which achieved 14.72 gigaflops/watt.
The TOP500 list has incorporated the High-Performance Conjugate Gradient (HPCG) Benchmark results, which provided an alternative metric for assessing supercomputer performance and is meant to complement the HPL measurement.
The number one TOP500 supercomputer, Fugaku, is also now the leader on the HPCG benchmark with a record 13.4 HPCG-petaflops. The two US Department of Energy systems, Summit at ORNL and Sierra at LLNL, are now second and third, respectively, on the HPCG benchmark. Summit achieved 2.93 HPCG-petaflops and Sierra 1.80 HPCG-petaflops. All the remaining systems achieved less than one HPCG-petaflops.