Will HPC technology transform enterprise computing?
In the second of his reports on SC14, held in New Orleans last week, Tom Wilkie ponders the implications for enterprise computing of technology developments in HPC.
One of the public, and perhaps rather glib, justifications for investing in exascale has been that it would make petaflop computing cheaper, more accessible and more widespread – bringing powerful computational techniques within the reach of even quite modest engineering companies.
Last week’s announcement by the US Department of Energy (DoE) may, for the first time, make that promise a reality. The DoE chose a consortium including IBM, Nvidia, and Mellanox to build two supercomputers for the Oak Ridge and the Lawrence Livermore National Laboratories.
The joint Collaboration of Oak Ridge, Argonne, and Lawrence Livermore (CORAL) was established in late 2013 to streamline procurement and reduce the costs of investing in the development of supercomputing – procurement contracts being a long-standing method by which the US Government provides ‘hidden’ subsidies for its high-tech industry, as discussed in Scientific Computing World in July.
But the event’s full significance may lie elsewhere than in the niche application of supercomputing. Instead, it may well open up the world of enterprise computing to new technologies that will help master the swelling volume of data that commercial companies have to cope with – both in engineering and in business intelligence.
Commenting on the DoE announcement, both Sumit Gupta, general manager of accelerated computing at Nvidia, and David Turek, vice president of technical computing OpenPower at IBM, stressed the importance of the design chosen for Oak Ridge and Livermore not just for scaling up to ever faster and more powerful machines, but also for ‘scaling down’, so to speak.
Turek maintained that he had always been slightly sceptical of the line of argument that exascale would inevitably deliver cheap petascale computing: ‘It’s easy to say but hard to do,’ he commented. In particular, IBM had found that its Blue Gene programme had offered very limited economies for smaller systems.
The fundamental lesson was, he said: ‘You have to pay attention to it from the beginning. We’re making it explicit and real.’ The Coral project was designed to be a one-node construct and economies of scale ‘in both directions’ were built in from the outset. ‘We didn’t want to have to say to customers: “You have to buy a rack of this stuff”.’
Sumit Gupta from Nvidia focused on the wider implications of the technology for applications outside the specialist area of high-performance computing. ‘Accelerators in high-performance computing are clearly well established today – GPUs are mainstream,’ he said.
But he sees the partnership with IBM as a way for Nvidia GPUs to make the transition to enterprise markets. IBM, he continued, ‘knows about data centres and is the preferred provider for many in enterprise computing. We have opened our GPU out, using NVLink, to other processors,’ he pointed out, and the partnership with IBM ‘takes GPUs into the mainstream DB2 market.’
David Turek made the same point – that this was not a technology being developed for a niche application in supercomputing but had wider ramifications across the whole of business and enterprise computing: ‘Coral is within the mainstream of our strategy. We have an eye to Coral as a way to serve our business needs.’
Ken King, general manager of OpenPower Alliances at IBM, elaborated on the theme, stressing that data-centric computing rather than number-crunching was at the heart of the new vision. With the explosion of data in the market, he said: ‘How are companies going to be able to process that data? You need innovation up and down the stack and you’re not going to get that with a closed structure.’
The answer, he continued, was to build solutions that minimised the movement of data, for example by building compute into the storage. He also cited the need to get GPUs and CPUs working together and managing the workflow so as to achieve increased performance with minimal data movement. Nvidia’s NVLink interconnect technology will enable CPUs and GPUs to exchange data five to 12 times faster than they can today.
The combination of innovative solutions and minimal movement of data was, he claimed, a compelling strategy, and that was the way in which IBM, in partnership with Mellanox and Nvidia, had approached the Coral bid.
But he stressed that the solutions were not just for the likes of the Livermore national laboratory: ‘Small companies are going to have data analysis problems. It’s a market-changing statement we’re making with this.’
It was, he concluded, an HPC strategy that goes beyond HPC.
This is the second in a series of articles by Tom Wilkie, prompted by last week's SC14 supercomputing conference and exhibition in New Orleans. The first article in the series can be found here.
The third article, which looks at non-US supercomputer vendors and finds that their mood too is upbeat, can be found here.