Key challenges of the post-petascale era
Pushing the frontiers of science and technology will require extreme-scale computing with machines that are 500 to 1,000 times more powerful than today’s supercomputers. As researchers refine their models and push for higher resolutions, the demand for more parallel computation and advanced networking capabilities grows. As a result of the ubiquitous data explosion and the ascendance of big data, especially unstructured data, even today’s systems need to move enormous amounts of data as well as perform more sophisticated analyses. The interconnect is already the critical element in making use of that data today; for tomorrow’s systems, it becomes paramount.
Reducing power consumption
As larger Exascale-class systems are designed to tackle the ever-larger problems addressed by high-performance computing (HPC), perhaps the most important issue facing us all is reducing power consumption. This is not just a challenge for Exascale deployments that are several years away; it affects the entire HPC community today. As advances are made in lowering processor power consumption, similar efforts are under way to optimise other sub-system components, including the network elements. Mellanox is working to address the power challenges, including continued work within the Econet consortium, a project dedicated to advancing dynamic power scaling and focused on network-specific energy-saving capabilities. The goal is to reduce the energy requirements of the network by 50 to 80 per cent while ensuring end-to-end quality of service. In this way, Mellanox is helping to reduce the carbon footprint of next-generation HPC deployments.
Accelerating GPU computing
Most would agree that one of the most transformative technologies in HPC over the past decade has been GPGPU computing, and GPUs have lately become commonplace in the HPC space. Earlier this year, a new technology called GPUDirect RDMA became available. This technology allows direct peer-to-peer communication between remote GPUs over the InfiniBand fabric, completely bypassing the need for CPU and host memory intervention to move data. This capability reduces latency for internode GPU communication by upwards of 70 per cent. GPUDirect RDMA is another step towards taking full advantage of the high-speed network in order to increase the efficiency of the overall system.
Drawn by the advanced customisation abilities of cloud computing, the HPC community is pursuing the use of clouds that incorporate GPUs for its computing needs. Offering GPUs through Infrastructure as a Service will enable researchers to rapidly adopt and deploy their own private HPC clouds and to use GPUs more effectively. This again translates into a need for advanced network capabilities: using less hardware more efficiently and further reducing power consumption.
Advanced network capabilities
One key metric for the value of an HPC installation is its efficiency rating. InfiniBand is dominant in the Top500 list, primarily because it is the only industry-standard interconnect that delivers RDMA together with advanced network capabilities that improve the efficiency of the system. InfiniBand has been shown to deliver up to 99.8 per cent efficiency for HPC systems in the list. Additionally, by offloading collective communication onto the network fabric, we can free up the processors to do meaningful computation instead of spending time on network communication. This again translates into additional power savings.
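For readers unfamiliar with the metric, efficiency for Top500 systems is conventionally reported as the ratio of measured LINPACK performance (Rmax) to theoretical peak performance (Rpeak). The short Python sketch below illustrates that arithmetic only; the function name and the performance figures are hypothetical, chosen simply to reproduce a 99.8 per cent result, and do not describe any particular system.

```python
def hpl_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Return efficiency as the fraction Rmax / Rpeak.

    rmax_tflops  -- measured LINPACK performance (hypothetical input)
    rpeak_tflops -- theoretical peak performance (hypothetical input)
    """
    if rpeak_tflops <= 0:
        raise ValueError("Rpeak must be positive")
    return rmax_tflops / rpeak_tflops


# Hypothetical figures, not taken from any real Top500 entry.
efficiency = hpl_efficiency(rmax_tflops=998.0, rpeak_tflops=1000.0)
print(f"Efficiency: {efficiency:.1%}")
```

The closer this ratio is to 1, the less of the machine's theoretical capability is lost to overheads such as communication between nodes.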
Additional capabilities critical to reaching Exascale-class environments include robust adaptive routing, advanced support for new topologies, and InfiniBand routing to reach larger system node counts. Offloading network-related tasks from the processor is a key element in increasing the overall efficiency of an Exascale-class system; but what of the future of microprocessor architectures?
In all probability, there will not be a single dominant microprocessor architecture for next-generation Exascale installations, especially as workflows for computational science are driving the requirements. Alternative architectures to x86, such as Power and 64-bit ARM, are already gaining adoption. These advanced, lower-power architectures capable of handling demanding HPC workflows will rely upon a best-in-class interconnect that is scalable, sustainable, and able to deliver application performance at extreme scale.