As the HPC industry reaches the end of technology scaling based on Moore’s Law, system designers and hardware manufacturers must look towards more complex technologies that can replace the gains in performance provided by transistor scaling.
There are many different potential solutions to the problems faced in trying to extend computing performance. In the short term, HPC users have been able to look at parallelism and the increasing use of accelerators to drive more performance, but this contributes to a need for memory bandwidth and more complicated coding. This is not a long-term solution, as accelerators and parallel processing still rely on the same technologies and are also affected by the hard material limits placed on continued transistor scaling.
In the long term there is some potential for multi-chip-module technology, and there are rumours that Intel is working on a new dataflow computing system known as Configurable Spatial Accelerator (CSA), which could take the company away from the x86 von Neumann architecture.
3D memory could also provide some much-needed respite from the memory bottlenecks facing many HPC systems. Beyond the more conventional technologies there is also the promise of quantum computing on the horizon, which could provide huge performance increases for certain types of applications.
Eric van Hensbergen, Arm fellow and director of HPC, doesn’t think of Arm as a disruptive technology anymore as it has become well established in HPC. However, while the company gains momentum it must also look to a future that will require more effort to continue the performance gains seen in the past.
‘Historically we were a disruptive technology in that we were in this market and coming at it from a different standpoint from others. But taking a step back, especially with the recent announcements from Intel on some of the approaches that they are taking with their new dataflow type of approach to things, and considering everybody else is GPU-based, I think we may be the only conventional architecture left in HPC.’
‘As a researcher I find that profoundly disappointing, that we are no longer a disruptive technology, perhaps we are the boring one. Of course application developers and scientists like boring because it means things can work without a lot of trouble,’ added Hensbergen.
Hensbergen noted the announcement of the Astra system at Sandia National Laboratories as an example of the maturing Arm HPC ecosystem. This will be the first public petaflop system using the latest Arm processor from Cavium, the ThunderX2.
‘From an architectural perspective the tools and software environment have been maturing steadily, and we are quite happy with that, but this is going to be larger scale and that is where you find the other big challenges in HPC,’ added Hensbergen.
Arm has been working with HPC partners for some time, but the last 12 months have seen a number of system announcements. While most of these are test-bed systems, the Isambard system at Bristol University, part of GW4, and the Post-K computer developed by Fujitsu for the Riken research institute were the largest systems that had been announced. While the Post-K computer will be larger than the Sandia system, it is not scheduled to go into production until 2021.
Hensbergen explained that the Arm teams have been working towards large-scale systems for some time, but the Sandia Astra system will represent the first time they can see a system of that scale in action. ‘Astra is one of the first at scale and I think that is going to help us mature the multi-node scaling aspect of the ecosystem a great deal,’ said Hensbergen.
‘Then we have Fujitsu coming further along, everything looks good on that end, and all of our other partners as well, so generally we feel that Arm is coming into its own now.’
Hensbergen also stated that Arm is now staffing for business development in addition to its research activities, as the company sees its role in the industry changing as the Arm ecosystem matures.
‘We will be transitioning stewardship of the Arm HPC development roadmap from more of a research-centric activity to more of a business-centric activity. Research will of course look for what is further down the road in HPC; we are not abandoning HPC or high-performance data analytics, but we are staffing up more from a business perspective,’ stated Hensbergen.
‘A lot of our focus has been on the leadership class with a series of developments – especially the Post-K system. I think that in the leadership class we are feeling pretty good about the maturity of the ecosystem in being able to address that,’ concluded Hensbergen.
Looking to the future
Quantum computing could be one answer to the problem of technology scaling, but quantum computers remain largely untested for all but a small selection of applications. It is unlikely that quantum computers would ever replace general-purpose computing systems but, for specific applications, they can deliver incredible performance. In May this year one of the biggest names in quantum computing, D-Wave, announced that it had opened a business unit that will focus on machine learning applications. The unit, known as Quadrant, aims to provide machine learning services that make deep learning accessible to companies across a wide range of industries and application areas. However, this new business currently uses GPU technology to give customers access to the Quadrant algorithms until the technology can be integrated into D-Wave’s quantum computing systems.
Quadrant’s algorithms enable accurate discriminative learning (predicting outputs from inputs) using less data by constructing generative models which jointly model both inputs and outputs.
‘D-Wave is committed to tackling real-world problems, today. Quadrant is a natural extension of the scientific and technological advances from D-Wave as we continue to explore new applications for our quantum systems,’ said Vern Brownell, CEO at D-Wave.
D-Wave also announced a partnership with Siemens Healthineers, a medical technology company.
Siemens Healthineers and D-Wave took first place in the CATARACTS medical imaging grand challenge, using Quadrant’s generative machine learning algorithms to identify surgical instruments in videos. These algorithms are being researched as a way to improve patient outcomes through better augmented surgery and ultimately computer-assisted interventions (CAI).
‘Machine learning has the potential to accelerate efficiency and innovation across virtually every industry. Quadrant’s models are able to perform deep learning using smaller amounts of labelled data, and our experts can help to choose and implement the best models, enabling more companies to tap into this powerful technology,’ said Handol Kim, senior director, Quadrant Machine Learning at D-Wave.
‘Quadrant has the potential to unlock insights hidden within data and accelerate innovation for everything from banking and quantitative finance, to medical imaging, genomics, and drug discovery,’ said Bill Macready, senior vice president of Machine Learning at D-Wave.
In addition to machine learning applications, D-Wave has also been working to publish results in other application areas that can help to demonstrate the effectiveness of quantum computing technology.
A study published in the journal Nature explored the simulation of a topological phase transition using D-Wave’s 2048-qubit annealing quantum computer.
The study helps to demonstrate that the fully programmable D-Wave quantum computer can be used as an accurate simulator of quantum systems at a large scale. The methods used in this work could have broad implications in the development of novel materials. This new research comes on the heels of D-Wave’s recent Science magazine paper demonstrating a different type of phase transition in a quantum spin-glass simulation.
The two papers together signify the flexibility and versatility of the D-Wave quantum computer in quantum simulation of materials, in addition to other tasks such as optimisation and machine learning.
‘This paper represents a breakthrough in the simulation of physical systems which are otherwise essentially impossible,’ said 2016 Nobel laureate Dr Kosterlitz. ‘The test reproduces most of the expected results, which is a remarkable achievement. This gives hope that future quantum simulators will be able to explore more complex and poorly understood systems so that one can trust the simulation results in quantitative detail as a model of a physical system. I look forward to seeing future applications of this simulation method.’
‘The work described in the Nature paper represents a landmark in the field of quantum computation: for the first time, a theoretically predicted state of matter was realised in quantum simulation before being demonstrated in a real magnetic material,’ said Dr Mohammad Amin, chief scientist at D-Wave.
‘This is a significant step toward reaching the goal of quantum simulation, enabling the study of material properties before making them in the lab, a process that today can be very costly and time-consuming.’
Finding a path beyond Moore’s Law
While not always the case, one of the primary drivers for innovation is necessity – to overcome grand challenges that require people to think outside of the box beyond what may seem possible today.
While we will still see continued improvements in transistor size and density, the free ride of increasing performance is coming to an end. Intel has announced delays in its products based on its next-generation 10nm fabrication process and, while AMD appears to be preparing to announce processors based on 7nm transistor fabrication, the industry as a whole is running out of room for more improvements.
Whatever number the companies finally plateau at, the main question then becomes what can be done to further increase performance in the future. It may no longer be a free lunch, but that certainly does not mean that improvements will not be made.
‘If it was just us that were suffering the end of technology scaling then that would be problematic but it’s everybody. It forces all of us to think a bit harder about how we approach things. In some ways it levels the playing field instead of having one of the incumbents having a two technology node advantage over everybody else,’ said Hensbergen.
‘We have a bunch of different work going on through the research organisation on 3D integration, multi-chip modules (MCMs) and a whole slew of things, which is going to increase computational density, decrease latency and improve the memory situation irrespective of technology scaling,’ added Hensbergen.
As the options begin to stack up, it’s clear that there may be no single answer to solving the computing performance challenges in the future. However, as Hensbergen notes, it is important to think carefully about which technologies the company should invest time and resources in. Too much time and investment in a technology that ends up being abandoned could set a company back in the next performance race.
‘It’s not like we are coming to a halt. It’s more of a question of “wherever technology scaling ends up plateauing what do we build on top of that?” We have to be clever about how we do that especially in terms of the focus on exascale and the leadership class but they are already starting the discussions of post-exascale,’ stated Hensbergen.
‘I believe there are technologies across the spectrum that enable the continued march of computing advances. It’s not a free lunch as it has been with technology scaling; we do have to be clever about these things but we have to be careful as well,’ explained Hensbergen. ‘Things like accelerator technologies are going to be important as we move forward but it’s equally important that Arm helps to work to standardise how software interfaces are composed because that’s been such a key aspect of our value proposition.’
While Arm is exploring more technologies in the long term, in the short term the company aims to take its experience in defining and curating standards and see how this can be applied to accelerator technologies.
‘The way the accelerator world is unfolding at the moment it is a little bit like the Wild Wild West. Yes, you can abstract a lot of it behind library interfaces, but then people start pushing at the edges and if we are not careful we get into a situation where we are in a much more embedded environment where you are not going to have the same level of code portability or performance portability that you have in today’s general purpose systems,’ said Hensbergen.
‘How do we take the expertise that we have in standardising an instruction set and creating a stable software ecosystem and extend that into accelerator topics so we can keep the same level of portability within the ecosystem?
‘One of the things that we are looking at as we go forward is how we can incorporate accelerator technologies. But we are really trying to look at it from a standards-based perspective. We really want to push it as far as we could with general purpose technology. Then as we build out acceleration technology we can do it in a way that is not going to throw the baby out with the bathwater,’ concluded Hensbergen.