Spend $3bn on exascale, report tells US Government
By Robert Roe
The United States Government should invest $3 billion over the next decade to create an exascale system, capable of delivering between 1 and 10 exaflops, according to a report published last week by the US Department of Energy (DOE).
The ‘Report of the Task Force on High Performance Computing of the Secretary of Energy Advisory Board’ recommends that a programme be established and managed by the National Nuclear Security Administration (NNSA) and the Office of Science, both of which are part of the DOE.
Following the Comprehensive Test Ban Treaty, which ended underground nuclear explosive tests, the NNSA is a heavy user of computer simulation in order to fulfil its role of ensuring that the US stockpile of nuclear weapons remains operational. The DOE's Office of Science funds civil scientific research. Both need access to the next class of leading-edge computing.
Beyond this, the Task Force's report goes on to state that data-centric computing will play an increasingly large role in HPC. Therefore the task force recommends that exascale machines ‘should be developed through a co-design process that balances classical computational speed and data-centric memory and communications architectures to deliver performance at the 1-10 exaflop level, with addressable memory in the exabyte range.’
The report outlines the rationale for a continued US presence at the forefront of HPC development. It states: ‘The historical NNSA mission (simulation for stewardship), multiple industrial applications and basic science all have applications that demonstrate real need and real deliverables from a significant performance increase in classical high performance computing at several orders of magnitude beyond the tens of petaflop performance delivered by today’s leadership machines.’
However, classical HPC has generally used different architectures for different applications. Many organisations, such as the US national laboratories, have therefore chosen to deploy multiple system architectures with varied resources, each suited to the most common workflows undertaken at the facility where the particular system is housed.
The National Energy Research Scientific Computing Center (NERSC), funded by the DOE, for example, has two Cray-based systems and one IBM Blue Gene-based system, and is planning its next system, named Cori after the American biochemist Gerty Cori, around Intel’s Many Integrated Core (MIC) architecture.
Both of the multi-petaflop systems, Oak Ridge’s Cray XK7 and Argonne’s Mira, have around 700TB of memory (710TB and 768TB respectively). If the next generation of these machines is to address data-centric computing, it will need memory in the petabyte range, and an order of magnitude more again for exascale-class computing, adding considerably to system costs.
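The scale of the jump can be put in rough numbers. A minimal back-of-envelope sketch, using the memory figures quoted above and treating the report's 'exabyte range' as a single exabyte (an assumption made here purely for illustration):

```python
# Rough sketch of the memory scaling implied by the report's targets.
# Today's figures come from the article; the exabyte target is the
# report's stated goal, assumed here to be exactly 1 EB for simplicity.

TERA = 10**12
EXA = 10**18

# Memory of today's leadership-class systems (from the article).
titan_memory_bytes = 710 * TERA   # Oak Ridge Cray XK7, ~710 TB
mira_memory_bytes = 768 * TERA    # Argonne Mira, ~768 TB

# Report's target: addressable memory in the exabyte range.
exascale_memory_bytes = 1 * EXA

# How many times more memory an exabyte-class system would hold.
scale_factor = exascale_memory_bytes / titan_memory_bytes
print(f"Exabyte-range memory vs today's ~710TB: ~{scale_factor:.0f}x")
```

Even on these generous assumptions the gap is more than three orders of magnitude, which is why the report treats memory and data movement, not just floating-point speed, as the co-design challenge.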
The report states: ‘It is in the coupling of ever increasing capability for traditional modelling and simulation with the emerging capability for Big Data analytics, that the potential for the significant impact on US industry will be the greatest. This convergence, coupled with increasing HPC capabilities will result in “systems of insight”, where modelling and simulation, analytics, big data and cognitive computing come together to provide new capabilities and understanding.’
The report also gives insight into the needs of key industries that regularly take advantage of HPC: oil and gas, biology and finance. The report states: ‘Many oil companies are predicting the need for exascale computing by the end of the decade. Some individual market players are already running data centres with over 60 petaflops of compute capacity and growing. Other players are contemplating data centres with hundreds of petaflops by the end of the decade.’
However, the DOE acknowledges a change in oil and gas workflows as more complex data-driven simulation becomes available, allowing geophysicists to adapt the parameters of the simulation to ‘dynamically consider numerous, perhaps thousands or millions, of “what if” scenarios to leverage their knowledge to explore more effectively.’
The report concludes: ‘Enabling this kind of coupled operation to unlock the value it offers requires the deep integration of currently silo-ed stages; it requires enabling dynamic visualisation throughout all stages for analysis and computational steering of long complex processes. And it requires the incorporation of data analytics in various stages for estimating and managing both risk and value.’
This combination of HPC, big data and data analytics, referred to as data-centric computing, is a recurring theme throughout the report. While the classical trend in HPC towards ever-increasing computational power, or flops, ‘is not likely to slacken in the foreseeable future,’ the report states: ‘the nature of the workloads to which these systems are applied is rapidly evolving. Even today, the performance of many complex simulations is less dominated by the performance of floating point operations, than by memory and integer operations. Moreover the nature of the problems of greatest security, industrial and scientific interest is becoming increasingly data-driven.’
Later, the report gives recommendations that highlight a convergence between traditional HPC and data-centric computing. The report states: ‘[The] DOE should lead, within the framework of the National Strategic Computing Initiative (NSCI), a co-design process that jointly matures the technology base for complex modelling and simulation and data-centric computing.’
It goes a step further, advising that ‘the highest performing computational systems must evolve to accommodate new data-centric system architectures and designs, and an ever more sophisticated and capable software ecosystem.’
The report acknowledges that as HPC data volumes increase, sophisticated analytics and architectures designed around streaming and processing huge amounts of data will become not just commonplace but necessary to achieving scientific goals. The report states: ‘As the data sets used for classic high performance simulation computation become increasingly large, increasingly non-localized and increasingly multi-dimensional, there is significant overlap in memory and data flow science and technology development needed for classic high performance computing and for data-centric computing.’
It continues: ‘The computing environment has begun to change as the complexity of computing problems grows, and with the explosion of data from sensor networks, financial systems, scientific instruments, and simulations themselves. The need to extract useful information from this explosion of data becomes as important as sheer computational power.’