Efficient clouds reduce the time to science

Clouds can be flexible but still have to be used efficiently, according to Dr Bruno Silva, who believes that new cloud technologies will make the cloud even more important to scientific computing.

Scientific Computing World recently reported on two new cloud initiatives being developed to further scientific research, asking whether the cloud will change scientific computing. In the USA, researchers are attempting, over the next five years, to federate the private academic clouds of three geographically dispersed universities into the Aristotle Cloud Federation.

In the UK, eMedLab is a biomedical data analytics cloud. It is primarily a high-performance computing system with high data throughput characteristics. Its design was proposed by OCF, a UK-based systems integrator, and has been improved and refined through close collaboration with the eMedLab operations and support team. That team, led by me at the Francis Crick Institute, spans the Francis Crick Institute, University College London, Queen Mary University of London, the London School of Hygiene & Tropical Medicine, EMBL-EBI, the Sanger Institute and, more recently, King’s College London.

Use private clouds efficiently

Although eMedLab is a cloud, the fact that it is private makes it fundamentally different from the public, commercial variety, well beyond the simple fact that the latter uses a pay-as-you-go model. Resources in a private cloud are finite. Unless they are given a fixed, time-bound amount of resource, users of a private cloud cannot simply fire jobs at a system like this and expect them to run with the most efficient resource allocation possible. In fact, the onus falls squarely on the researcher to use the resource efficiently, which introduces problems in terms of utilisation and return on investment.

Cloud is sometimes thought to be useful for specific types of workloads where efficiency and utilisation do not matter so much, as this is offset by higher customisation – low efficiency and utilisation being the ultimate price for flexibility. Interestingly, scientific computing infrastructure managers are divided: some believe that resource utilisation is of the utmost importance, while others disagree. Noting that most research computing hardware is generic, the latter hold that what matters is time-to-science and not time spent in optimisation.

There is a perception, often repeated, that these two views conflict – that time-to-science and efficient resource utilisation pull in opposite directions. The truth is that they are in fact coupled in the traditional scientific computing model, which is constrained by finite resources: if computing resources (cloud or otherwise) are over-subscribed, science will always be delayed.

It is therefore important that computations can be packed and performed very efficiently to maximise resource utilisation, producing the most science within the same resource footprint and in less time, with added environmental and economic benefits. Private cloud infrastructures, owing to their limited size, have to be utilised efficiently. This can be achieved through carefully crafted use policies to which users must subscribe and respond, or through automation, using scheduling technologies that allocate VMs optimally based on a traditional resource allocation model. Software should also be parallelised and optimised, so long as the effort required does not negatively impact time-to-science and is cost-effective.
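To make the idea of packing concrete, here is a minimal sketch of one common heuristic, first-fit decreasing, applied to VM placement. It is an illustration only, not a description of how eMedLab or any OpenStack scheduler works: the `Host` class, the function name and the single "cores" dimension are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A hypothetical hypervisor with a fixed number of cores."""
    cores: int
    vms: list = field(default_factory=list)

    @property
    def free(self) -> int:
        # Cores not yet claimed by a placed VM.
        return self.cores - sum(self.vms)


def pack_first_fit_decreasing(vm_cores, host_cores):
    """Place VM requests (in cores) on equal-sized hosts, largest first.

    Each request goes to the first host with enough free cores; a new
    host is opened only when none fits. This is a heuristic, so the
    result is not guaranteed to use the minimum number of hosts.
    """
    hosts = []
    for request in sorted(vm_cores, reverse=True):
        if request > host_cores:
            raise ValueError(f"VM of {request} cores exceeds host size")
        target = next((h for h in hosts if h.free >= request), None)
        if target is None:
            target = Host(host_cores)
            hosts.append(target)
        target.vms.append(request)
    return hosts


if __name__ == "__main__":
    # Six VM requests packed onto 16-core hosts.
    for i, host in enumerate(pack_first_fit_decreasing([8, 4, 4, 2, 6, 8], 16)):
        print(f"host {i}: VMs {host.vms}, free cores {host.free}")
```

Real schedulers weigh memory, storage, network locality and fairness as well as cores, but the principle is the same: the tighter the packing, the more work fits within a fixed footprint.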

The game-changer: cloud bursting

The emergence of public cloud and the ability to cloud-burst is actually the real game-changer. Because of its ‘infinite’ amount of resources (effectively always under-utilised), it allows for a clear decoupling of time-to-science from efficiency. One can be somewhat less efficient in a controlled fashion (higher cost, slightly more waste) to minimise time-to-science when required (in burst, so to speak) by effectively growing the computing estate available beyond the fixed footprint of local infrastructure – this is often referred to as the hybrid cloud model. You get both the benefit of efficient infrastructure use, and the ability to go beyond that when strictly required.
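The trade-off described above can be sketched as a simple calculation. This is a deliberately crude illustration under stated assumptions: work is treated as perfectly divisible core-hours, the local cluster is assumed to be fully available until the deadline, and the function name and parameters are hypothetical rather than part of any real bursting tool.

```python
def plan_burst(job_core_hours, local_cores, deadline_hours):
    """Split queued work between local and public cloud resources.

    Anything the local cluster cannot finish before the deadline is
    sent to the public cloud; everything else stays local, where it
    is assumed to be cheaper to run.
    """
    total = sum(job_core_hours)
    # Core-hours the fixed local footprint can deliver before the deadline.
    local_capacity = local_cores * deadline_hours
    burst = max(0, total - local_capacity)
    return {"local_core_hours": total - burst, "burst_core_hours": burst}


if __name__ == "__main__":
    # 600 core-hours of queued work, 10 local cores, results needed in 48 hours.
    print(plan_burst([100, 200, 300], local_cores=10, deadline_hours=48))
```

With no deadline pressure the burst share is zero and the local infrastructure is simply used as efficiently as possible; tightening the deadline moves work to the public cloud at higher cost.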

Another important aspect of the cloud model, and in particular of the use of virtualisation, is the ability to encapsulate workloads in a repeatable and portable manner, and to utilise these in multiple infrastructures. This increases the number of resources available to researchers to do their work, and also helps to improve the reproducibility of scientific results. For example, the results published in a paper can be reproduced using a workflow installed in a standard Virtual Machine along with the original data set, both made publicly available. This is strongly aligned with the EU’s Open Data policies for EU funding and, increasingly, with those of national and private funding agencies, such as the Wellcome Trust, whose data sharing policy aims to maximise the public benefit of research.

OpenStack is the cloud operating system of choice in most biomedical research sites, as well as for CERN sites and GridPP (Particle Physics), making it a natural choice for eMedLab in the context of UK and EU academia. It is also supported by the fastest-growing open source community, with contributors from across all industries, and great hardware vendor support.

The OpenStack roadmap also includes the development of new features to support interoperation with commercial public clouds and federation between multiple sites. These are strategically important for science, due both to the way funding agencies are seeking collaboration between institutions, in particular in the EU, and to the need to explore new research avenues through multi-disciplinary cross-breeding of ideas and access to data, which otherwise would be locked behind technical or legal barriers.

eMedLab is an example of a successful research grant from the UK’s Medical Research Council, awarded precisely to promote the crossing of research domains and data sets using varied methodologies. The aims are to generate new insights that will lead to new clinical outcomes, and to join a federation of similar clouds.

Even the cloud model is in flux

To answer Scientific Computing World’s question about how profoundly the cloud will change scientific computing, it’s worth noting that even the cloud model is now in flux. New technologies are coming along that could up-end it or, at the very least, revolutionise how it is consumed and how it provides computing resources to its consumers.

Technologies like containers, made popular by Docker, do away with the need to have an entire operating system hosted in a virtual machine, with its additional (and resource-costly) hardware abstraction layers. Intel’s ‘Clear Containers’ go a step further by utilising virtualisation technology in hardware to make containers even more efficient. Unikernels go yet another step further by reducing the footprint of traditional virtual machines while, at the same time, preserving all their security features and keeping the cloud model intact.

The research infrastructure and scientific computing community is very keen to explore these avenues and will no doubt contribute to their development, as has historically been the case.

Dr Bruno Silva is High Performance Computing Lead at The Francis Crick Institute in London
