Cloud converges on HPC
Almost everyone was looking to the cloud, Robert Roe discovered when he attended the UK’s HPC conference in December.
The convergence of cloud computing and HPC emerged as a major theme of this year’s Machine Evaluation Workshop, held at the Ricoh Arena in Coventry in December. Now in its 25th year, the event, organised by the UK Government-funded Science and Technology Facilities Council, offers a short but intense insight into developments in the HPC industry both in the UK and internationally.
Many of the speakers saw the convergence of cloud computing and HPC not as the next potential trend in the HPC industry, but as an inevitability on the march towards ever cheaper, ever faster HPC systems. Solutions are already entering the market, designed around cloud bursting or what was described as ‘HPC as a service’.
Enterprise computing has embraced the cloud faster and more enthusiastically than HPC, but now the same ideas are being applied to HPC and this is fundamentally changing the way many services are delivered. Bright’s cluster manager, for example, has been extended to manage cloud infrastructure.
The promise of increased capacity on demand, in conjunction with reduced investment in infrastructure, has led to a wealth of cloud-based services, from huge Amazon server farms to smaller niche offerings that can address the needs of HPC users.
Mark Allsopp, ClusterVision’s country manager for the UK and Ireland, said during his presentation at MEW25: ‘HPC, the cloud, and big data are coming together.’ He went on to explain that ‘the cloud gives you the flexibility to deploy the solution you want.’
ClusterVision focused its attention on OpenStack, the open-source cloud operating system that enables the control of compute, storage, and networking resources throughout a data centre.
ClusterVision realises that to make cloud-based HPC an attractive option to customers, it needs to provide a flexible solution that can be tailored to the highly specialised hardware configurations and to the carefully optimised software that is used by many of today’s HPC systems, Allsopp said.
By adopting OpenStack as its method of choice for cloud deployment of HPC, ClusterVision believes it can work with an already very active community to offer a solution that makes best use of the cloud infrastructure. However, there have been some concerns over the performance of cloud infrastructure. Allsopp said: ‘We have a lot of technology that we are working on to ease HPC adoption.’
In a virtualised environment, OpenStack uses one of several supported hypervisors, such as KVM and XenServer. However, ClusterVision reported that it has had success using Docker, an open platform for developers to build, ship, and run distributed applications. Using Docker, ClusterVision found that it could minimise virtualisation overheads and achieve greater efficiency and performance.
Docker removes the need for the guest OS; instead it runs only the application and its dependencies through the Docker engine on the host operating system, sharing the kernel with other containers. This provides many of the advantages of existing VMs such as resource isolation, but in a more efficient manner.
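The shared-kernel point can be checked directly. The following Python sketch is illustrative only and was not shown at the workshop: it assumes the Docker command-line tool is installed and that the public `alpine` image can be pulled, and the helper names `kernel_probe_command` and `shares_kernel` are hypothetical. It asks a container for its kernel release and compares the answer with the host's.

```python
import platform
import shutil
import subprocess


def kernel_probe_command(image: str = "alpine") -> list[str]:
    """Build a `docker run` command that prints the kernel release seen inside a container.

    The image name is an assumption; any image that ships `uname` would do.
    """
    return ["docker", "run", "--rm", image, "uname", "-r"]


def shares_kernel(host_release: str, container_release: str) -> bool:
    """A container that shares the host kernel reports the same release string."""
    return host_release.strip() == container_release.strip()


def main() -> None:
    # Skip the live check gracefully when Docker is not available.
    if shutil.which("docker") is None:
        print("docker CLI not found; skipping the live check")
        return
    try:
        result = subprocess.run(
            kernel_probe_command(),
            capture_output=True,
            text=True,
            check=True,
            timeout=120,
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as exc:
        print(f"could not run the container: {exc}")
        return

    host = platform.release()
    container = result.stdout
    print(f"host kernel:      {host}")
    print(f"container kernel: {container.strip()}")
    print(f"shared kernel:    {shares_kernel(host, container)}")


if __name__ == "__main__":
    main()
```

On a Linux host the two release strings should match, because no guest OS is booted. On desktop installations where Docker itself runs inside a Linux virtual machine, the container reports that virtual machine's kernel instead.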
Boston showcased a range of servers for cloud-based deployments. Its ‘business in a box’ solutions provide a range of turnkey cloud computing platforms for composing, running, and scaling distributed applications, with the added bonus that they are designed to fit alongside existing IT infrastructure.
David Power, head of HPC at Boston, said: ‘We have a number of different deployments to suit a number of configurations.’ This year, however, Boston was highlighting the CloudX platform, based on Mellanox’s OpenCloud architecture. Not only is this platform designed around off-the-shelf building blocks, making it easy to configure and maintain, but it can also be configured as an open-source environment, giving greater control of the cloud to engineers in the data centre.
The CloudX platform is designed to reduce infrastructure costs, particularly compute and storage, while the use of Mellanox interconnect technology, principally 40 and 56Gb/s InfiniBand and Ethernet, allows users to transfer data across the cloud into their data centre much faster than is typically possible. This in turn opens up the possibility of real-time analytics in the cloud.
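To give a rough sense of what those link speeds mean, the short Python sketch below estimates how long a dataset would take to move at the two quoted rates. The 1TB dataset size is an assumption chosen for illustration, the function name `transfer_seconds` is hypothetical, and the figures use nominal line rates with no allowance for encoding or protocol overhead, so real transfers would be slower.

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Idealised transfer time: payload bits divided by the nominal line rate."""
    if link_gbps <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * 8 / (link_gbps * 1e9)


ONE_TERABYTE = 1e12  # assumed dataset size in bytes (decimal terabyte)

for rate in (40, 56):  # the two link speeds quoted in the article, in Gb/s
    seconds = transfer_seconds(ONE_TERABYTE, rate)
    print(f"{rate} Gb/s: {seconds:.0f} s")
# prints:
# 40 Gb/s: 200 s
# 56 Gb/s: 143 s
```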
Bright has extended its cluster management software for use in cloud deployments, allowing users, especially those with experience using Bright software, to easily manage and provision cloud resources to make best use of the infrastructure. Not only does the software provide all the normal tools for managing an HPC cluster, it also enables users to create a new cluster on the fly in the cloud with just a few mouse clicks.
Bright also paid special attention to OpenStack, explaining that its software could be used to manage an OpenStack cluster in much the same way as a traditional cluster. Teun Doctor, a software developer at Bright Computing, warned that: ‘Managing an OpenStack cluster can be even more difficult than managing other types of clusters.’
Doctor continued: ‘Bright aims to provide a single pane of glass to manage and monitor all aspects of an OpenStack cluster.’ By providing one integrated environment, Bright aims to reduce the time spent managing cloud resources, allowing users to spend more time getting results from their HPC systems.
On the final day of MEW25, a special event was held to look at the benefits of using HPC in a variety of industries; again, cloud computing took a prominent role.
Andy Searle, CAE & HPC IT manager at Jaguar Land Rover (JLR), discussed the role of virtual prototyping at JLR and explained the importance of simulation in the development of the company’s vehicles. He said: ‘We have seen a complete shift from physical to virtual prototyping.’
JLR has an in-house cluster of 22,000 cores. ‘But demand is coming into our environment for 50,000 cores,’ said Searle. This leads to times when the compute infrastructure at JLR cannot support the sheer volume of simulation that the company is undertaking; Searle explained that cloud bursting can deliver the extra performance when the workload is too much for the existing cluster. Searle said: ‘I have to give my teams the ability to compete in a global marketplace with the likes of Toyota and BMW.’
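The arithmetic behind cloud bursting is simple: fill the in-house cluster first, then send only the overflow to the cloud. The Python sketch below illustrates that idea using the two core counts Searle quoted. It is not JLR's scheduling policy, and the names `BurstPlan` and `plan_burst` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstPlan:
    local_cores: int  # cores served by the in-house cluster
    cloud_cores: int  # overflow cores requested from the cloud


def plan_burst(demand_cores: int, local_capacity: int) -> BurstPlan:
    """Fill the local cluster first and burst only the shortfall to the cloud."""
    if demand_cores < 0 or local_capacity < 0:
        raise ValueError("core counts must be non-negative")
    local = min(demand_cores, local_capacity)
    return BurstPlan(local_cores=local, cloud_cores=demand_cores - local)


# Figures from the article: 22,000 in-house cores against demand for 50,000.
print(plan_burst(demand_cores=50_000, local_capacity=22_000))
# prints: BurstPlan(local_cores=22000, cloud_cores=28000)
```

With those figures the shortfall is 28,000 cores, more than the entire in-house cluster, which is why buying capacity only at peak times is attractive.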
Industry and academia face the reality that they must constantly do more with fewer resources. Shrinking R&D budgets and reduced government spending both contribute to this environment, and cloud computing offers a chance to get ahead. Delivering the performance needed without having to buy, implement, and manage an in-house cluster is an advantage to many HPC users.