Does HPC’s future lie in remote visualisation?
Imagine, for a moment, that you’ve taken a break from work and are in a café. As you sip your skinny latte, an idea occurs about the modelling and simulation project that you have been working on. You need to look at the data to see if your idea will work.
But the dataset is enormous. It could take two days to download over the internet, even with a high-bandwidth internet connection (not usually available in coffee shops!), and you have only a laptop on which to do the processing. In any case, you do not want to access your company’s data over what may well be an insecure public internet connection.
So you get out your mobile phone and use the phone network to link your laptop to the remote servers where your data is stored, where you can run the application, and where you can visualise the results without having to move data.
Typically, your phone might give you a bandwidth of 5 Mb/s. And that will be enough. Welcome to the new paradigm for scientific computing.
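The trade-off in this scenario can be put in rough numbers. The sketch below is illustrative only: the 1 TB dataset and the 50 Mb/s 'fast' connection are assumed values, not figures from the article; only the 5 Mb/s phone rate comes from the text above.

```python
def transfer_time_days(size_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Time to move a dataset over a link, ignoring protocol overhead."""
    seconds = size_bytes * 8 / bandwidth_bits_per_s
    return seconds / 86_400  # 86,400 seconds in a day


# Assumed values for illustration (not taken from the article).
DATASET_BYTES = 1e12   # a hypothetical 1 TB simulation output
FAST_LINK_BPS = 50e6   # a hypothetical 50 Mb/s connection
# The phone data rate quoted in the text.
PHONE_LINK_BPS = 5e6

days_fast = transfer_time_days(DATASET_BYTES, FAST_LINK_BPS)
days_phone = transfer_time_days(DATASET_BYTES, PHONE_LINK_BPS)

print(f"Download at 50 Mb/s: {days_fast:.1f} days")
print(f"Download at 5 Mb/s:  {days_phone:.1f} days")
```

Under these assumptions, moving the data takes just under two days on the fast link and more than two weeks at phone rates. Streaming only the rendered images avoids that transfer altogether.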
The new way of doing HPC
Remote visualisation is at the intersection of cloud, big data, and high performance computing. And the ability to look at complex data sets using only a mobile phone’s data rate is not some fantasy of the future. It is reality here and now. Tom Coull, president and CEO of Penguin Computing, pointed out that his company had demonstrated just such a process at the most recent supercomputing conference, SC14 in New Orleans in November last year.
In the opinion of Steve Parker, VP of professional graphics at Nvidia, the demand for remote visualisation can only increase: ‘I’m confident that data sets will continue to grow faster than bandwidth on the internet. The problem is just going to get worse and worse.’
Intel too believes that this is the way the wind is blowing. According to Jim Jeffers, principal engineer and manager, parallel visualisation engineering at Intel: ‘As the datasets get larger, visualisation is taking on more and more of a role in helping the analysis and providing the insight with such large datasets. That large data trend is there, which raises the question “how do you deal with that data when it is harder and harder to move it?” The compute is increasing faster than the ability to move it around.’ But Jeffers sees a second development, a demand coming from the user-community of scientists and engineers: ‘The other trend is that the nuances in the data, and the way to extract visual information from the data, is requiring higher fidelity.’
Andrea Rodolico, CTO at Italian software company Nice, believes the ramifications of all this will be far-reaching. ‘Today, remote visualisation is like a visionary need, a new concept that is now very popular where you have huge data sets,’ he said. ‘But if you look five years into the future, the real question will be “Why don’t I use this kind of model for pretty much all my technical workloads?” Data will keep growing; the ability to move it will be more and more complex; collaboration will be important; and security will be a very important aspect, so having everything in a well-defined and well controlled environment – that could be a data centre or it could be the public cloud – will make this very exciting. The vision in five years is really “Why would I go anywhere else than remote visualisation for my technical activities?”’.
Don’t move the data
The areas where Penguin has seen fastest uptake, Coull said, are in engineering and manufacturing – aerospace, automobile, and heavy equipment. If they have an application where the solver needs a cluster, he said: ‘They use their solver on our cloud, or run the modeller up on the cloud using the remote visualisation capability and so the data is right there; the application is right there. They’re not having to move anything around, pay for any data transport, or things like that.’
The same point was emphasised by Andrea Rodolico: ‘It was almost 10 years ago that we figured out that accessing data doesn’t mean moving data. Our offering is that you can have all your HPC activity, and at some point you can jump off the batch into the interactive system from the portal with just a couple of clicks.’ Nice demonstrated its Desktop Cloud Visualisation (DCV) 2014 technology at the AWS re:Invent conference late last year. DCV 2014 needs between three and 10 times less bandwidth than its predecessor, and offers improved latency mitigation. Penguin’s solution, Scyld Cloud Workstation, is easily accessible from the Penguin public cloud, with both a Linux and a Windows version available within a couple of clicks. However, according to Tom Coull: ‘It can be delivered independently of our public cloud. A couple of commercial customers are doing pilots with it right now where they have on-premise solutions.’
Overcoming the constraints
Although Penguin’s mobile phone visualisation is a striking instance, it neatly exemplifies the constraints that have to be overcome in providing the infrastructure on top of which scientists and engineers can run their applications to get insight via remote visualisation. Bandwidth is the first issue – particularly so for Penguin’s mobile phone example. Coull said that previously 20 to 30 Mb/s were required to deliver a high fidelity, high frame-rate display. The second issue is security: ‘Many organisations in the commercial world don’t want you to poke additional holes through their firewall by opening non-standard ports. It’s difficult to get IT organisations to modify their firewall rules just for remote visualisation. So we wanted to have something that ran over the standard ports that would be open to https.’
The Penguin system is built on Nvidia Grid technology and, according to Coull, produces a reasonable resolution at 1400 by 900 with broadcast quality using H.264 as the protocol. It runs at 25 to 30 frames per second, with very high image quality. ‘We brought it out first with Nvidia’s K2 Grid card,’ he said. ‘There are certain things that you want to do on the graphics processor in hardware that reduce the latencies in delivering a nice stream out to the client. Those have to do with delivering an H.264 coder on the GPU and being able to read out the GPU’s memory directly to the client so you minimise transferring things between the GPU and CPU. The Grid card is a great product for that and Nvidia has a library associated with it.’
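To see why hardware video encoding matters, it helps to compare the uncompressed size of such a stream with the link budgets mentioned above. The following is a rough sketch: the resolution, the frame rate, and the 30 Mb/s and 5 Mb/s figures come from the article, while the colour depth of 24 bits per pixel is an assumption.

```python
def raw_bitrate_mbps(width: int, height: int, fps: float,
                     bits_per_pixel: int = 24) -> float:
    """Uncompressed video bitrate in Mb/s (24 bits per pixel is an assumption)."""
    return width * height * bits_per_pixel * fps / 1e6


def compression_ratio(raw_mbps: float, link_mbps: float) -> float:
    """How much the encoder must shrink the stream to fit the link."""
    return raw_mbps / link_mbps


# Resolution and frame rate as quoted by Coull.
raw = raw_bitrate_mbps(1400, 900, 30)
print(f"Uncompressed stream: {raw:.1f} Mb/s")

# Link budgets from the article: the earlier 30 Mb/s requirement and a 5 Mb/s phone link.
for link in (30.0, 5.0):
    print(f"To fit {link:.0f} Mb/s: about {compression_ratio(raw, link):.0f}:1 compression")
```

On these numbers the raw stream is roughly 900 Mb/s, so fitting it into a phone connection calls for compression of the order of 180:1. That is the job handed to the H.264 encoder on the GPU.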
Steve Parker also highlighted the centrality of Nvidia’s Grid technology, saying that at its core lay ‘some key technologies for video streaming and we tie that with virtualisation. The Grid platform allows us to run any application and stream it across the internet. We have hardware coders on the GPU that can compress and scan the frame at low latency – in many cases lower than a connected display. It’s transmitted over some secure link to a remote desktop. That allows us to place a grid server in a machine somewhere with a connection to an HPC data store and transmit the video stream over the internet – which is a much smaller amount of data’.
Virtualisation as part of visualisation
As a software company, Nice is ‘brand agnostic in terms of the overall box. We work with SGI or Dell or HP or Lenovo, or whoever,’ according to Rodolico, but he highlighted the importance of GPUs for 3D visualisation performance: ‘What matters to us is to have the proper GPU on it. We only support Nvidia GPUs because they are the dominant player in the data centre. In particular, on Windows we can work with the Quadro cards or the Grid cards. We are very excited about their Grid cards because they also provide virtualisation – vGPU capabilities – which is something we see customers are really fond of.’
He pointed to the announcement by VMware, about a year ago, that it would offer Nvidia vGPU support, opening up the possibility of having one GPU shared across multiple virtual machines. ‘They are the leader in virtualisation in corporate industry, so that has been great news, a great boost. A lot of our customers are asking us to leverage VMware infrastructure for visualisation technology and so far we have been limited by the lack of support for vGPUs in their offering.’ Demand will start to rise this year, he thinks, ‘but the market acceleration is something we will see next year.’
Both Nice and Penguin offer support for Linux and for Windows. According to Coull: ‘You need to have both. There is a higher cost for the Windows because of the licences involved, whereas for Linux you don’t have that cost’. But companies using workstations tend to be running Windows – most desktop users have a Windows environment, he said. There is comparatively little overhead in providing the two ‘flavours’ – it takes about a month to port code from one platform to the other.
For Andrea Rodolico, open source access to remote visualisation tends mostly to be Linux-based and is seen more in academic research. As with Penguin, the commercial users – in automotive and manufacturing – tend to opt for a mix of Linux and Windows. ‘So we see open source as a great way for users to start engaging with the remote 3D concept, and expanding their work flows into EnginFrame. Then, when they need more support or add Windows-based applications, they can have drop-in replacements of the open source components and get commercial support’. As an example of the support provided by a company like Nice, he pointed to the close links that Nice has with the application software vendors so that: ‘In our lab, we have dozens of ISV applications that we test so we can ensure that when a new version hits the road it is performing optimally’.
Compute and visualise in the same machine
For Linux applications, Nice can also provide 3D graphics using Nvidia’s Tesla card rather than the other graphics cards. Rodolico pointed out that this ‘is particularly intriguing for the HPC community: for on the same card you can run both graphics and HPC.’ Remote visualisation brings with it the ability to compute and visualise in the same place, he continued: ‘With the ability to put everything in the data centre, you have the ability to give every user the right-size machine, dynamically for their purposes. From project to project you can relocate the user on to a different virtual machine or on to a different physical machine depending on the actual purpose and you use a standard scheduler to do that. The user does not see this’.
To Nvidia’s Steve Parker, this is the most exciting aspect of recent developments: ‘One of the things that I think is amazing is that it’s the first time we have a single device that is capable of both [computation] and visualisation. It used to be there was a computational system and then a visualisation system. Separate resources: buy a big SGI and a cluster’. But GPU computing has ended that, he said. He cited a demonstration that Nvidia ran during SC14 with an astrophysics simulation code called Bonsai. Although the results were being displayed in New Orleans, the code was being run at the Swiss National Supercomputing Centre, CSCS: ‘on hundreds and hundreds to about a thousand nodes. It would use the GPUs for computation, and then use the graphics horsepower in those same devices to rasterise the star field of the galaxy simulation’. It was, he said, computing and visualising at the same time: ‘The data was never stored to a disk; although, if you were doing science with it, you would probably want to store some of it. The data was streamed from the computation, to the visualisation system, to the video codecs, to a display halfway across the planet.’
This sort of capability is unprecedented, he believes, and the advantage of a unified resource is that it can be put together in many interesting ways. It allows a systems administrator or even the end-user scientist or engineer a lot more flexibility, for they can allocate a set of nodes as the visualisation resource rather than it having to be distinct. ‘As far as the future goes, having unified computation and visualisation is a key trend. Video streaming, rather than shipping large station wagons of tapes, is also going to be key. It will be the default way to work.’
Visualisation and graphics without GPUs?
But for Intel, an important consideration is that there are a significant number of large HPC centres that are not using GPUs. As Jeffers explained: ‘They see the need for visualisation, so they carve off a smaller subset of a system to have GPUs in it.’ It is, he said, almost the default method that people are using, ‘but an issue with that is that you have to move the data over to that arena, and then display it with an OpenGL rasterisation environment on the GPU – which continues to mean moving the data in each step of the way.’ In his view, that entails delays sometimes of many days if not weeks before the scientist can get to see the visualisation and get that extra insight.
For some time now, Intel has been adding parallel compute capabilities to its products and, according to Jim Jeffers: ‘That parallelism adds capabilities across many workloads and graphics workloads, being parallel, can strongly benefit from it.’ So Intel is developing open source tools to allow those large-scale installations to do visualisation without having to have a peripheral, GPU-based system for the visualisation. ‘A peripheral GPU has a limited amount of memory but if you apply these techniques to the standard HPC cluster which has more memory for the compute applications you can take advantage of that and not be limited in how you craft your data set. Sometimes you have to decimate your data set to sit on the GPU. You don’t have to do that if you can spread it across a cluster,’ Jeffers said.
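Jeffers’ contrast between decimating a dataset and spreading it across a cluster can be sketched with simple arithmetic. All the sizes below are hypothetical, chosen only to illustrate the reasoning, and the sketch ignores the extra working memory that rendering itself needs.

```python
import math


def decimation_factor(dataset_gb: float, gpu_memory_gb: float) -> int:
    """Smallest whole-number factor by which data must be thinned to fit one GPU."""
    return max(1, math.ceil(dataset_gb / gpu_memory_gb))


def nodes_needed(dataset_gb: float, node_memory_gb: float) -> int:
    """Smallest number of cluster nodes whose combined memory holds the full dataset."""
    return max(1, math.ceil(dataset_gb / node_memory_gb))


# Hypothetical sizes for illustration (none of these come from the article).
DATASET_GB = 500.0
GPU_MEMORY_GB = 12.0
NODE_MEMORY_GB = 64.0

factor = decimation_factor(DATASET_GB, GPU_MEMORY_GB)
nodes = nodes_needed(DATASET_GB, NODE_MEMORY_GB)

print(f"Single GPU: keep roughly 1 in {factor} data points")
print(f"Cluster:    full resolution across {nodes} nodes")
```

With these assumed figures, a single GPU would see only about one data point in 42, while eight cluster nodes could hold the dataset at full resolution.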
‘We’re building with industry partners, such as TACC [the Texas Advanced Computing Center] and Kitware, a stack that will scale across large HPC datacentres that have Xeon and Xeon Phi products as compute resources, without the need for a GPU, and render those in software with a focus on enabling the visualisation to match the scale of the data, and drive towards real time remote visualisation of that data at reasonable frame rates of 10 to 20 frames per second.
‘That’s the direction we’re taking. Ultimately, it is providing a faster time to science because of the memory size and efficiency. When the data begins to exceed the size that a GPU peripheral can cope with, then our performance is higher as well,’ he claimed.
In Jeffers’ opinion, ‘TACC has one of the world’s finest visualisation teams that both supports end users and enables development work. They are a perfect partner for us to work with.’ He described how Intel worked with TACC and Florida International University to visualise an aquifer near Biscayne Bay in Florida as part of a study to monitor salt water intrusion into the fresh water aquifer. If intrusion does occur, the questions were: how does it happen; how does it flow through; and how quickly can the aquifer recover? Equally important is to look at the model and see whether any remedial action can help repair the damage. The issue is important not just scientifically but also in social terms: many of the freshwater aquifers in Florida are within intrusion range – a mile or so – of salt water.
Today’s standard open source tools for visualisation include ParaView and VisIt, which are built on the Visualization Toolkit (VTK), supported by Kitware. To visualise the water flow in the Florida aquifer, the team used a plug-in developed by TACC that allows VTK-based applications such as ParaView and VisIt to render with the OSPRay framework, which in turn was developed by Intel for building distributed applications that use ray tracing, such as scientific visualisation tools.
The point is, Jeffers said, that he and his team are building the underlying infrastructure to allow people to run the sorts of applications with which they are already very familiar. ‘We don’t want to change the use-model, but we do want to provide a new enabling technology underneath,’ he said.
They can continue to use the same menu systems, the same interface that they’re used to with ParaView, but down inside the software they are moving to these tools that produce ray-traced images.
Most visualisation at present takes advantage of the graphics capabilities of GPUs, and of Nvidia products in particular (the initials do, after all, stand for Graphics Processing Unit). Their highly parallel structure makes them more effective than general-purpose CPUs for algorithms where processing of large blocks of data is done in parallel. On the software side, most visualisation systems are built on OpenGL, which has been an industry standard for more than two decades.
Rasterise or trace the rays?
According to Nvidia’s Steve Parker, OpenGL has evolved over that time: ‘It’s grown up to be a very sophisticated graphics API, and Nvidia’s main business is OpenGL. Almost all applications are OpenGL – from Maya [Autodesk’s 3D animation software] or CAD systems all the way to scientific visualisation. So if you’re building a display wall most of the time it will end up being a Quadro system because we’ve made sure that that product is capable of the best graphics. There’s also an increasing number of games that make use of OpenGL and our GeForce product line is also based on the same underlying technology. All of those things are built around rasterisation. We can rasterise about three billion triangles per second on benchmarks and for large data sets being able to process that quickly is key to the insight you can get out of it.’
For Jeffers, however, Intel’s line of development offers a further bonus: it opens up ray tracing as an alternative to rasterisation and OpenGL. Ray tracing is, he believes, a method that will deliver the higher fidelity in scientific visualisation that he sees as one of the demands driving developments. As always, the very large data sets in the oil and gas industry loom large. According to Jeffers: ‘They are already doing ray tracing because of the improved fidelity and understanding they can get. My team has developed pretty much the fastest ray tracing architecture and implementation. Certainly on CPUs, it’s the fastest.’
The memory available to CPUs is typically larger than that available to GPUs. For small datasets (i.e. data that fits into GPU memory), rasterisation on a modern GPU will frequently outperform ray tracing on a CPU or GPU. However, for very large datasets, the performance of ray tracing becomes competitive and, according to Intel, CPUs can outperform GPUs on ray tracing.
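One commonly cited reason for this crossover, which the article does not spell out, is scaling: a rasteriser processes every triangle, whereas a ray tracer with a tree-based acceleration structure does work roughly proportional to the number of pixels times the logarithm of the triangle count. The toy model below assumes that scaling and uses arbitrary, equal unit costs, so it says nothing about real hardware; it also ignores the cost of building the acceleration structure.

```python
import math


def raster_cost(triangles: int) -> float:
    """Toy model: rasterisation touches every triangle once (linear growth)."""
    return float(triangles)


def ray_trace_cost(triangles: int, pixels: int) -> float:
    """Toy model: one ray per pixel, each walking a tree of depth ~log2(triangles)."""
    return pixels * math.log2(triangles)


def cheaper_method(triangles: int, pixels: int) -> str:
    """Name the method with the lower cost in this toy model."""
    if raster_cost(triangles) < ray_trace_cost(triangles, pixels):
        return "rasterisation"
    return "ray tracing"


# The display resolution quoted earlier in the article.
PIXELS = 1400 * 900

for n in (10**6, 10**7, 10**8, 10**9):
    print(f"{n:>13,} triangles: {cheaper_method(n, PIXELS)} is cheaper in this model")
```

In this model the crossover falls somewhere between ten million and a hundred million triangles, but its position depends entirely on the assumed unit costs. Only the shape of the two curves is meant to carry over.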
Nonetheless, Jeffers stressed that ‘We don’t think the OpenGL rasterisation pipeline is going away. We’re complementing it for those cases where the ray tracing can add a capability. We’re building the tools underneath, to enable that across an HPC cluster and to tune the user’s problem to the scale of the cluster and the scale of the allocation the user has received.’
However, Steve Parker cited Nvidia’s collaboration with John Stone at the University of Illinois at Urbana-Champaign on an open source ray tracing visualisation system called VMD. ‘It’s the main application for molecular dynamics visualisation. He adopted a library that Nvidia has called OptiX, which does GPU based ray-tracing. We’re working with him on the ability to remotely stream an OptiX application. So in this case, VMD ran on a workstation and would transmit data to an Nvidia Visual Computing Appliance (VCA) and, using the same underlying technology of video streaming and high-bandwidth low-latency data movement, it could interactively ray-trace those images.’ The process was demonstrated at SC14, with the cluster in California and the application in New Orleans.
Parker pointed out that he, then at the University of Utah, and his group had won the best paper award at the Visualization 1998 conference for the first applications using ray tracing in scientific visualisation. He explained that applications such as VMD use ray tracing ‘because they can represent some shapes, in this case molecules, with higher fidelity and sometimes higher performance.’ Nvidia has a bunch of ray-tracing products, he said, including Iray, a physically correct, photo-realistic rendering solution.
Historically, he went on, ray tracing has always been too slow to use interactively. ‘I don’t think the whole world is going to go for ray tracing but it will be an important use case because you can add shadows and this helps you understand complex shapes and occlusion. These kinds of things are easier to add with ray tracing, just a few lines of code.’
He believes it will come down to a trade-off between performance and ease of use.
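Parker’s remark that shadows take ‘just a few lines of code’ in a ray tracer can be illustrated with a toy example. The sketch below is not taken from OptiX, OSPRay, or VMD; the scene, the function names, and the sphere-only geometry are all hypothetical. The idea is simply that a point is in shadow if a ray cast from it towards the light hits something first.

```python
import math
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def hit_sphere(origin: Vec, direction: Vec, centre: Vec, radius: float) -> Optional[float]:
    """Distance along a unit-length ray to the nearest hit, or None.

    Assumes the ray origin lies outside the sphere.
    """
    oc = sub(origin, centre)
    b = dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # the ray misses the sphere
    t = -b - math.sqrt(disc)
    return t if t > 1e-6 else None  # ignore hits behind the origin


def in_shadow(point: Vec, light: Vec, spheres: List[Tuple[Vec, float]]) -> bool:
    """The 'few lines' that add shadows: cast one ray from the point to the light."""
    to_light = sub(light, point)
    dist = math.sqrt(dot(to_light, to_light))
    direction = (to_light[0] / dist, to_light[1] / dist, to_light[2] / dist)
    for centre, radius in spheres:
        t = hit_sphere(point, direction, centre, radius)
        if t is not None and t < dist:
            return True  # something sits between the point and the light
    return False


# Hypothetical scene: a light overhead and one sphere above the ground plane y = 0.
LIGHT = (0.0, 10.0, 0.0)
SPHERES = [((0.0, 5.0, 0.0), 1.0)]

below = in_shadow((0.0, 0.0, 0.0), LIGHT, SPHERES)   # directly under the sphere
aside = in_shadow((5.0, 0.0, 0.0), LIGHT, SPHERES)   # well off to one side
print(f"Point under the sphere shadowed: {below}")
print(f"Point off to the side shadowed:  {aside}")
```

The shadow test reuses the same ray-sphere intersection routine a ray tracer already needs for primary rays, which is why the addition is so small. A rasteriser typically needs a separate technique, such as shadow mapping, to achieve the same effect.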
Dr Tom Wilkie is the editor for Scientific Computing World.