As the supercomputing industry approaches exascale, Europe and the US have developed projects to deliver this huge jump in computing performance. Using novel technologies and building partnerships between hardware vendors, academic institutions and research centres, the two groups are hoping to deliver exascale with very different computing architectures.
The US aims to continue using Intel CPUs with accelerators from Nvidia or Intel, while Europe aims to create an ARM-FPGA hybrid computing architecture. Although these are very different technologies, both groups face similar hurdles in delivering energy efficiency and parallelism on an unprecedented scale.
The exascale challenge
The pursuit of exascale is important for more than just the headlines and international bragging rights and accolades that come with hitting this milestone. Exascale promises to unlock the new level of computational performance that will provide the horsepower for future scientific discoveries not possible today.
Exascale-class supercomputers will allow scientists and engineers to investigate problems at new levels of granularity and accuracy, and will also enable scientific breakthroughs that are impossible without new breakthroughs in computational performance.
However, reaching exascale is no easy task. While FLOPS alone is not a realistic measure of exascale computing performance, the most basic milestone for an exascale-class supercomputer would be a billion billion calculations per second. Reaching this level of performance requires systems at a scale that has never been seen before, which in turn requires complex innovation in both hardware and software. All of this must be achieved within a power limit of approximately 20 to 30 megawatts, which makes the task all the more difficult, as such a system would need to far exceed the efficiency of the most powerful supercomputers available today.
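To put the power limit in perspective, the required energy efficiency can be worked out directly from the two figures quoted above. The short Python sketch below is a back-of-envelope calculation only: the function name is illustrative, and it treats the milestone as exactly 10^18 operations per second within a flat 20 or 30 megawatt budget.

```python
# Back-of-envelope: efficiency needed to hit one exaFLOPS within a power budget.
# Assumes the milestone is exactly 1e18 operations per second (a billion billion).
EXAFLOPS = 1e18


def required_gflops_per_watt(target_flops: float, power_megawatts: float) -> float:
    """Return the sustained GFLOPS per watt needed to reach target_flops."""
    watts = power_megawatts * 1e6
    return target_flops / watts / 1e9


if __name__ == "__main__":
    for megawatts in (20, 30):
        efficiency = required_gflops_per_watt(EXAFLOPS, megawatts)
        print(f"{megawatts} MW budget -> {efficiency:.1f} GFLOPS per watt")
```

Under these assumptions, a 20 megawatt machine would need to sustain 50 GFLOPS per watt, and a 30 megawatt machine roughly 33 GFLOPS per watt.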
There are several ways of approaching these challenges but, with the scale of the problem increasing for each generation of computing development, most groups are choosing to tackle the challenge by creating their own computing ecosystem. This allows them to pool resources, investment and expertise with partners that can help to develop the launch pad for exascale computing.
Examples of this can be seen across several regions. Chinese HPC experts have opted for a somewhat home-grown approach but this still includes several component manufacturers, HPC centres, and organisations working together to create the necessary hardware and software to compete with other regions.
The American model of subsidising HPC development through investment in large supercomputing contracts from the DOE and other federal organisations is well understood but even this is taxed by the challenges of achieving exascale computing by the proposed date from the DOE of 2021.
The US has developed what it calls the Exascale Computing Project, a collaborative effort of two US Department of Energy organisations – the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA).
'The Exascale Computing Project offers a rare opportunity to advance all elements of the HPC ecosystem in unison. Co-design and integration of hardware, software, and applications, a strategic imperative of the ECP, is essential to deploying exascale class systems that will meet the future requirements of the scientific communities these systems will serve,' said Paul Messina, former ECP director and senior strategic advisor at the Argonne Leadership Computing Facility (ALCF).
Messina commented that alongside the two DOE organisations there are also six vendor partners working on various aspects of the computing architecture as part of the PathForward exascale initiative: IBM, Nvidia, Intel, Cray, HP and AMD. He also noted that there are 22 institutions and 39 universities currently involved with some aspect of the research and development.
There are many moving parts for such a collaboration that uses co-design but Messina stressed that there is more to this project than just reaching the exascale milestone. He hopes that this work will leave a lasting legacy that provides benefits beyond just the use of the DOE exascale systems.
'I envision that some of the results might not pay off in 2020-2021, but it might be in 2022 or 2024. Sometimes it takes a little bit longer but we hope to contribute to things beyond the delivery of just a couple of exascale systems in the United States,' said Messina.
Messina explained that the idea of meaningful collaboration extends across the entire project. One example of this is an industry council that has been set up with 18 commercial companies, of which 15 are end-user companies such as GM, GE, United Technologies and FedEx.
'We established that industry council to make sure that we understand their needs. They all feel that they will eventually need exascale computing. We are working with them to understand their needs, what is realistic for them to do, and we will create a software stack that will meet their requirements,' said Messina.
This is a constant theme throughout the ECP. The aim is not just to reach the milestone of exascale computing but to develop the building blocks for other users and deliver meaningful application performance.
The Europeans have opted for a similar co-design approach, banding several commercial entities and organisations together. European plans for exascale are funded through the European Union, which has set out significant investment in IT infrastructure through the European Commission's Horizon 2020 initiative, the successor to the FP7 programme. Horizon 2020 is the biggest EU Research and Innovation programme ever, with nearly €80 billion of funding available over seven years (2014 to 2020).
The EuroEXA project is funded through the Horizon 2020 programme and builds on previous European high-performance computing projects and partnerships, bringing together the focus of European industrial SMEs. Originally the informal name for a group of H2020 research projects (ExaNeSt, EcoScale and ExaNoDe), EuroEXA hopes to coalesce these research projects into a single coherent exascale project.
The project is opting for co-design using a number of European-developed technologies and partners, including HPC centres, research organisations and hardware manufacturers, that can help to create a European exascale-class supercomputer to rival competition in the US and Asia.
The €20m investment over a 42-month period is part of a total €50m investment made by the EC across the EuroEXA group of projects supporting research, innovation and action across applications, system software, hardware, networking, storage, liquid cooling and data centre technologies.
The project is funded under H2020-EU.1.2.2. FET Proactive (FETHPC-2016-01), and the consortium partners provide a range of key applications from across climate/weather, physics/energy and life science/bioinformatics. The project aims to develop an ARM Cortex technology processing system with Xilinx Ultrascale+ FPGA acceleration at petaflop level by approximately 2020. This could then lead to exascale procurement in 2022/23, with commercialised versions of the technology available around the same time.
John Goodacre, professor of computer architectures at the University of Manchester, said: 'To deliver the demands of next generation computing and exascale HPC, it is not possible to simply optimise the components of the existing platform. In EuroEXA, we have taken a holistic approach to break-down the inefficiencies of the historic abstractions and bring significant innovation and co-design across the entire computing stack.'
Peter Hopton, founder of Iceotope and dissemination lead for EuroEXA, said: 'This is a world-class program that aims to increase EU computing capabilities by 100 times, the EuroEXA project is truly an exceptional collection of EU engineering excellence in this field.'
The EuroEXA project is certainly ambitious as it hopes to bring technologies from ARM and Xilinx together with Maxeler and memory technology from ZeroPoint Technologies to produce a new computing architecture for an exascale system.
Even if this process is successful, it will require a lot of application development to create the tools necessary to deliver sustained exascale performance. Alongside the scalable vector extensions (SVE), ARM has helped to provide Allinea debugging tools, and the project has partnered with several research centres that bring their own large-scale application codes for development.
Arm is providing Allinea tools as a bridge between the hardware architecture and applications, evaluating application performance and pinpointing steps to maximise efficiency.
The tool selection for the EuroEXA Program enables Arm to collaborate with project partners and understand their challenges in application development and preparation on the novel EuroEXA platform. 'New capabilities are often a direct result of collaborating with leading research efforts such as EuroEXA. Arm is quickly applying learnings from EuroEXA, and similar efforts, into future state-of-the art Allinea tools designed to help reach the most efficient levels of exascale compute,' said David Lecomber, Arm's senior director of HPC tools.
Another project partner, Maxeler, hopes to port its Dataflow programming model to support the relevant components of the EuroEXA platform. Ultimately this should allow the applications targeted in the project to be brought on to the heterogeneous EuroEXA system platform.
'Joining EuroEXA is exciting for us because it allows us to bring our long-established Dataflow technology into Europe’s latest effort towards achieving Exascale performance,' commented Georgi Gaydadjiev, director of Maxeler Research Labs.
The Dataflow computing model will enable application developers to utilise the reconfigurable accelerators in a high-level environment. It also addresses the practical challenges of data movement when combined with other technologies such as memory.
Memory specialist ZeroPoint Technologies brings background IP in main memory subsystems. Its technology uses novel compression approaches to store and transfer memory data more efficiently. The technology is based on more than 15 years of research from Chalmers University of Technology, Goteborg, Sweden.
ZeroPoint Technologies has developed memory systems that use an IP block that compresses and decompresses data in memory, so that typically three times more data can be stored in memory and transferred in each memory request. The company aims to deliver added value and competitiveness in the cost, power consumption and performance of exascale systems.
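The effect of a typical threefold compression ratio can be illustrated with a simple calculation. The Python sketch below is not ZeroPoint's method or interface; the function name and the node figures (64 GB of memory, 100 GB/s of bandwidth) are hypothetical, and the real ratio will vary with the data being stored.

```python
# Illustrative only: how a memory compression ratio scales effective capacity
# and effective bandwidth. The node figures below are hypothetical.


def effective_memory(physical_gb: float, bandwidth_gbs: float,
                     compression_ratio: float = 3.0) -> tuple[float, float]:
    """Return (effective capacity in GB, effective bandwidth in GB/s)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return physical_gb * compression_ratio, bandwidth_gbs * compression_ratio


if __name__ == "__main__":
    capacity, bandwidth = effective_memory(physical_gb=64, bandwidth_gbs=100)
    print(f"Effective capacity:  {capacity:.0f} GB")
    print(f"Effective bandwidth: {bandwidth:.0f} GB/s")
```

With the assumed figures, the node would behave as if it had 192 GB of memory and 300 GB/s of bandwidth, without any change to the physical memory fitted.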
ZeroPoint hopes to deliver added value in power consumption and memory performance by adapting its intellectual property blocks and integrating them in the computing chips. Additionally, the company will be responsible for the memory interface for the EuroEXA project.
Per Stenstrom, co-founder and chief scientist from ZeroPoint Technologies, said: 'We at ZeroPoint Technologies are very excited of having the opportunity to join the EuroEXA project and demonstrate the added values our unique technology can offer to Exa-Scale systems.'
Iceotope will also aim to improve power consumption and efficiency with its liquid cooling technology, which should allow denser computing racks and more efficient cooling. However, the company was selected not just for its capabilities in liquid cooling, but also for its IP in power delivery, I/O connections, infrastructure management and data centre infrastructure.
Peter Hopton, founder of Iceotope, said: 'It’s a privilege to be selected as part of this program, with this investment in development, our technology will now enable the biggest computers of the future, as well as the cloud computing environments and edge computing of today and the near future.'
Leaving a legacy
Creating an exascale supercomputer is a huge achievement, but if the accompanying software stack, programming models and even core design do not see widespread use by the wider HPC community, then all the effort that has been exerted will be lost in the transition to following generations.
While one measure of success is a system capable of a huge number of calculations, it is clear that there is an opportunity to develop better standards and approaches to computing that can provide benefits over the next five to 10 years.
One example of the negative side effects that have come from several generations of computational evolution along the von Neumann architecture is the memory bandwidth bottleneck that we see in today's most computationally intensive HPC systems.
The problem is not confined to the performance lost through a lack of data transfer: the energy ratio between control and arithmetic I/O and the scalability through I/O communication are both concerns for future system designers.
This issue was described by John Goodacre, professor of computer architectures in the Advanced Processor Technologies Group at the University of Manchester, in a presentation given as part of a workshop at the Infrastructure for the European Network for Earth System Modelling (IS-ENES), an FP7-funded project. In the presentation Goodacre noted that while the von Neumann model was fundamental to many of today's systems, it did have certain limitations as we approach exascale computing.
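One common way to reason about a memory bandwidth bottleneck is a roofline-style bound, in which attainable performance is the lesser of the processor's peak rate and the rate at which memory can feed it. The Python sketch below illustrates that general idea; it is not taken from Goodacre's presentation, and the peak and bandwidth figures are hypothetical.

```python
# Roofline-style bound: performance is capped either by peak compute or by
# memory bandwidth multiplied by arithmetic intensity (FLOPs per byte moved).
# All hardware figures here are hypothetical.


def attainable_gflops(peak_gflops: float, mem_bandwidth_gbs: float,
                      flops_per_byte: float) -> float:
    """Return the upper bound on sustained GFLOPS for a given kernel."""
    return min(peak_gflops, mem_bandwidth_gbs * flops_per_byte)


if __name__ == "__main__":
    PEAK_GFLOPS = 1000.0     # assumed peak compute rate
    BANDWIDTH_GBS = 100.0    # assumed memory bandwidth
    for intensity in (0.25, 1.0, 10.0, 50.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
        limit = "memory-bound" if bound < PEAK_GFLOPS else "compute-bound"
        print(f"{intensity:>5} FLOPs/byte -> {bound:7.1f} GFLOPS ({limit})")
```

In this sketch, a kernel that performs fewer than 10 floating-point operations per byte moved cannot reach the assumed peak, however fast the arithmetic units are.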
While the memory bandwidth problem is significant, Goodacre lists several approaches to overcoming this challenge by increasing processor efficiency. Possible solutions include SIMD or vector machines, DSPs, GPGPUs, hardware accelerators and FPGAs.
The EuroEXA project chooses FPGA acceleration and lays down several steps towards a new computing architecture, leading to the creation of commercial systems in approximately 2023. The project, which builds on several earlier projects (Euroserver, ExaNoDe, ExaNeSt and EcoScale), aims to lay the building blocks for a European exascale system by creating a new computing architecture based on an ARM/FPGA hybrid processor dubbed the 'ICT-42 EuroProcessor'.
This project will evolve through several phases, from designing the processor and node architecture through to the interconnect technology and eventually the OS, runtimes, programming models and application development.
However, designers must be careful not to create something that cannot be easily adopted by other users. This must be a clear consideration from the outset, as the project is already considering potential commercialisation of these technologies once the initial exascale systems have been deployed.
Across the pond, the DOE is dealing with similar challenges. 'We want to build a software stack that will support a broad set of applications and that will have a life beyond the end of the exascale project,' commented Messina.
'In other words, it would serve as the foundation for some time after for many applications to be able to take advantage of exascale. One of the broad goals is to come up with a software stack that can be used on medium class HPC systems, as well as exascale. People can adopt this software stack in order to make it easier to transition to higher and higher levels of performance,' Messina added.
While the EU and US exascale plans revolve around particular architectures or software stacks that are being created specifically to overcome the hurdles to exascale application performance, their efforts will be felt by HPC users across the globe. The technology, architecture and software design provide blueprints for other HPC users who wish to pursue their own exascale journey.
Investment from US government funding for US companies will surely benefit a worldwide community of HPC users, and the same is true of European efforts funded by European countries as part of the European Commission’s efforts to accelerate computing on the other side of the Atlantic.
'In recent years if one is running HPC on a medium-sized cluster the software environment tends to be somewhat different from what you have for the leading edge systems. That has been an obstacle that people have surmounted but we would like to lower the effort required to do that,' said Messina.
The hope from the US investment programme is to provide an ecosystem which can then support further development in the future. This is done through widespread adoption of tools and architectures that help to drive expertise and knowledge around a given technology, language or programming model. This development should provide a trickle-down effect to users who are not targeting exascale but can still take advantage of the same tools and methods for HPC.
Messina was also keen to highlight a secondary goal: streamlining some of the approaches to programming that are dealing with similar issues.
'Applications have, for very good reasons, adopted different approaches and algorithms and it is difficult for a community to converge to a single solution,' commented Messina. He gave one example of parallel I/O explaining that different groups had their own approaches but it would be beneficial to converge if such a process could support all the required applications.
'I/O is just one of the examples; another would be a common runtime API that could support tools like performance measurement tools, visualisation tools, or even compilers. In the long run there would be substantial benefits for the high performance computing community if there were fewer choices so long as they properly support the applications,' commented Messina.
Performance portability is another key aspect to the project as it will help to ensure value in the work that has been done by the ECP partners. As Messina notes, 'portability is an important target or goal for applications, especially some performance portability because there will continue to be more than one architecture in the lifetime of an application. There will probably be several different architectures that a given application will run on.'
Beyond the goals of exascale lies a new set of challenges and milestones that must be surmounted by future development. Laying the foundation for future innovation requires today’s computer scientists to think carefully about the legacy they leave for the next generation. The industry needs real cooperation and teamwork to meet the requirements of exascale, but also to lay the groundwork for future technologies that can successfully use the tools that are created today.