A silver lining for HPC in the cloud
As cloud-based HPC technology matures, the flexibility it provides is attracting industries not typically seen as HPC users – from engineering and AI to medicine, media and broadcasting.
Cloud-computing technology was adopted successfully in enterprise and commerce years ago, but uptake among HPC users was slower. HPC and cloud were not immediately compatible, as the performance and underlying hardware of early cloud platforms were not suited to HPC applications.
There has since been significant investment in developing HPC-specific technology for cloud computing, which is now available from both large-scale cloud providers and smaller companies focused on the HPC cloud market.
Cloud-based HPC can replace traditional on-premises supercomputers or clusters, but it can also complement an existing HPC infrastructure. Cloud can provide flexible access to HPC systems and allow users to benchmark new technology – without the significant investment required for new infrastructure.
While public cloud implementations may not suit HPC users that require the highest levels of performance, those users can adopt strategies built around private on-premises clouds or systems hosted by another company. Even for users with an established computing infrastructure, cloud can be used to burst additional workloads when demand exceeds the capacity of an on-premises cluster.
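The bursting pattern can be sketched in a few lines: jobs fill the local cluster first, and anything beyond its capacity spills over to cloud resources. This is a minimal illustration only – the names and threshold are hypothetical, not any real scheduler's API (production systems such as Slurm handle this with dedicated cloud-provisioning plugins):

```python
# Minimal sketch of cloud bursting: jobs that fit within on-premises
# capacity run locally; the remainder spill over to the cloud.
# All names and numbers here are illustrative, not a vendor API.

ON_PREM_SLOTS = 100  # hypothetical cores available on the local cluster


def schedule(jobs):
    """Assign each (name, cores) job to 'on-prem' until local capacity
    is exhausted, then mark the rest 'cloud-burst'."""
    used = 0
    placement = {}
    for name, cores in jobs:
        if used + cores <= ON_PREM_SLOTS:
            placement[name] = "on-prem"
            used += cores
        else:
            placement[name] = "cloud-burst"
    return placement


if __name__ == "__main__":
    jobs = [("cfd-run", 64), ("render", 32), ("training", 48)]
    print(schedule(jobs))
```

Real deployments add policy on top of this – cost caps, data-movement constraints, instance start-up latency – but the core decision is the same capacity check.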
David Power, CTO of cloud computing provider vScaler, explained that the company has been growing organically to support customers that were not initial targets of the technology.
He explained that what started as a focus on delivering traditional HPC technology – high-performance hardware, low-latency fabrics and high-bandwidth storage – has proved applicable to a growing number of application areas outside traditional academic HPC.
‘We initially covered quite a broad spectrum on the academic and research computing domain, but as we have been going out and speaking to people we are now starting to work with the manufacturing and automotive industries. They are using HPC technologies to do a lot of internal prototyping and engineering.
‘We are also working with customers in media and entertainment that are using HPC technologies to do rendering and visual effects, sound effects or graphic design. They were not initial targets of our technology as we started building it, but, as we have grown, we are starting to see the foundations of HPC are applicable to a lot more domains,’ said Power.
Power explained that IP-based delivery at media companies ‘uses a lot of transcoding so there is a fair amount of CPU grunt needed’.
‘That has put a big requirement on the compute power and storage, to allow companies to provide these services to their users. Strip back the application layer for delivery and transcoding and all of that stuff, and we are using HPC technology behind the scene to accelerate their workloads,’ Power stated.
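One reason transcoding maps so naturally onto HPC resources is that each file can be encoded independently, so a batch parallelises cleanly across cores. The sketch below fans hypothetical transcode tasks over a worker pool; the `transcode` function is a self-contained stand-in, not a real codec call (in production it would wrap an encoder such as an ffmpeg invocation per file):

```python
# Sketch of fanning independent transcode jobs across CPU workers.
# 'transcode' is a hypothetical stand-in so the example is runnable
# anywhere; a real pipeline would invoke an actual encoder here.
from concurrent.futures import ProcessPoolExecutor


def transcode(source):
    """Stand-in for a CPU-heavy encode; returns the output file name."""
    return source.rsplit(".", 1)[0] + ".h264.mp4"


def transcode_all(sources, workers=4):
    # Each file is independent, so throughput scales with core count.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcode, sources))


if __name__ == "__main__":
    print(transcode_all(["clip1.mov", "clip2.mov"]))
```

This embarrassingly parallel shape is exactly what makes the ‘CPU grunt’ of an HPC cluster pay off for media workloads.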
Power also commented that many of the vScaler users are using the company’s public cloud facility – but not necessarily for the reasons that the firm had originally envisioned.
‘While we have invested in our public infrastructure and put a load of HPC technology in there, we are really using that more as a burst facility for customers, or as a prototyping facility,’ said Power.
He explained that many of the customers vScaler works with want to test cloud computing capabilities using the public cloud facility. Many of those customers then choose to deploy their own on-premises solutions from vScaler, after testing hardware configurations to see what works best for them.
Power also noted that this multi-tiered approach to cloud computing will continue, as it allows users to benchmark new technology before investing heavily in their own infrastructure.
Another application area generating interest in cloud computing is AI and deep learning. The team at vScaler has been working with a company using AI and deep learning to develop autonomous vehicles. The company is building a prototype that streams data from the vehicles back to its cloud system, where the collected data is processed and analysed.
‘That was not a field that we started building vScaler for, but as a result of all the capabilities that we are able to deploy and configure, and the performance of the system, means that it is actually quite a good fit for these AI applications,’ stated Power.
‘HPC nowadays means a lot more than just the traditional scientific workloads. The technologies that we use for HPC are applicable to a lot more of the market today,’ Power concluded.
Medicine and healthcare have also adopted cloud technology, especially for managing informatics data and sharing information between multiple sites or laboratories. The use of cloud-based HPC is now gaining traction for heavier workloads, such as the development of precision medicine initiatives.
Precision medicine, and the concept of predictive medicine that came before it, focuses on using targeted personal data about individual patients to make more accurate, rapid diagnoses. Generally this involves genetic screening of individual patients to build up a comprehensive genetic picture, but it can also include data from other areas, such as pollution data for the area where the patient lives.
Earlier this year, two companies that are partnered with Amazon Web Services (AWS) announced a partnership to combine their technologies to create a platform that can deliver the performance and security needed to drive precision medicine research.
The partnership combines the DNAnexus platform-as-a-service (PaaS), used to create custom workflows, with the technology of Edico Genome, which specialises in accelerating genome sequencing analysis for precision medicine using FPGAs.
DNAnexus has architected its platform to align with key security and compliance frameworks, such as HIPAA, 21 CFR Part 11, CLIA, and FedRAMP to provide security and ease of use through the service-based platform.
Edico Genome uses FPGAs in its Dynamic Read Analysis for GENomics (DRAGEN) software. By optimising the FPGA logic for this specific application, Edico can deliver results much faster than CPU implementations. The company has released a white paper detailing how it sequenced the entire genome of a newborn baby in 26 hours, to demonstrate the speed of its technology.
Under the partnership, Edico Genome’s DRAGEN solution, deployed on Amazon EC2 F1 instances, is integrated into the DNAnexus platform. The integration gives customers the speed of DRAGEN for analysing genomes from high-throughput sequencers, combined with the security and compliance controls that DNAnexus has implemented through AWS.
This new platform has been adopted by the Rady Children’s Institute for Genomic Medicine in San Diego, USA. The institute aims to advance genetic screening and precision medicine for infants and children with the aim of developing rapid diagnosis and targeted treatment for critically ill patients.
The institute adopted the DNAnexus platform to gain a secure, flexible, and scalable environment for local and distributed sequencing and analysis.
Stephen Kingsmore, president and chief executive officer at Rady Children’s Institute for Genomic Medicine, commented in a blog post by AWS, that the institute’s ‘goal is to ensure that genome-powered precision medicine is available to every child who needs it. To do this, we needed a rapid research-to-bedside pipeline and be able to scale it and make it accessible to hospitals around the world’.
Kingsmore added: ‘DNAnexus has the technology and expertise to facilitate this ambitious project; Edico Genome’s rapid testing capability allows for rapid diagnosis of critically ill newborns.’
As the number of applications for cloud computing grows, so too does the number of implementations. The traditional choice between an in-house deployment, hosted services, or a public cloud infrastructure persists, but it is now further complicated by the number of cloud providers.
Selecting certain technologies can lock users into a specific vendor, as the effort and investment required to rewrite code for a new architecture or GPU can be prohibitively expensive.
Some cloud providers, such as vScaler, are trying to alleviate this problem by basing their cloud technology on OpenStack. ‘There is a little bit of resistance in putting all your eggs into one basket and getting locked in to a single vendor. It is good for us, because we are an OpenStack based product. All of the APIs that we expose are completely open standard, so if you can push and burst into our cloud, you have the flexibility to use other OpenStack clouds,’ said Power.
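The portability Power describes rests on OpenStack’s standard client configuration: the same tooling can target any OpenStack cloud simply by switching credentials. A hypothetical `clouds.yaml` – the standard configuration file read by the `openstack` CLI and openstacksdk – might list an on-premises cloud and a public burst target side by side. Every endpoint, project and user name below is made up for illustration:

```yaml
# Hypothetical clouds.yaml: the same OpenStack tooling can target
# either entry, selected via --os-cloud or the OS_CLOUD variable.
clouds:
  on-prem:
    auth:
      auth_url: https://keystone.example.internal:5000/v3
      project_name: hpc
      username: hpc-user
      password: "<secret>"
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne
  burst:
    auth:
      auth_url: https://api.cloud.example.com:5000/v3
      project_name: hpc-burst
      username: hpc-user
      password: "<secret>"
      user_domain_name: Default
      project_domain_name: Default
    region_name: region-1
```

With a file like this in place, bursting to the second provider is a matter of `openstack --os-cloud burst server create …` rather than rewriting anything against a proprietary API.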
Power added that OpenStack’s Linux foundations made developing the vScaler product easier, as the company’s HPC specialists had considerable experience with Linux-based systems. ‘We have spent years tuning and building HPC systems, so we were immediately able to go in and optimise the OpenStack platform,’ stated Power.
As the use cases for cloud computing increase, many new users are beginning to take up the technology.
In the opinion of vScaler’s David Power, we will continue to see growth in demand for cloud computing. ‘It started off as just a prototype with a single customer, but over the years it has developed into more of a comprehensive product as we have been building and adding additional features and capabilities.
‘Cloud addresses a need, certainly, and I do not see that need going away. Nearly everyone that I go and see, now has some form of ambition or strategy towards enabling a cloud technology within their HPC infrastructure,’ Power concluded.