Supporting science with HPC cloud services
HPC integrators can help scientists and HPC research centres through the provisioning and management of HPC clusters. As the number of applications and potential user groups for HPC continues to expand, supporting domain expert scientists’ use of, and access to, HPC resources is increasingly important.
While just ten years ago a cluster would have been used by just a few departments at a university, there is now a huge pool of potential users from non-traditional HPC applications. This includes artificial intelligence (AI) and machine learning (ML), as well as big data and advanced analytics applied to data sets from research areas that previously had little interest in HPC systems.
This creates a growing need to support and facilitate the use of HPC resources in academia and in research and development. These organisations can either employ staff to support this infrastructure or outsource some or all of these processes to companies experienced in the management and support of HPC systems. HPC integrators can help an organisation choose the right technology to support its application portfolio. Integrators can also manage and support the HPC system, which reduces overheads and the need for a large in-house team to support the use of HPC.
Technical support, maintenance, consulting services, project management and even fully managed HPC services can be delivered by HPC integrators to reduce the burden on IT departments and to support scientists’ use of HPC. Maintenance, for example, can help reduce the downtime of HPC systems and help to predict the failure of systems or components. Increasing the time that a computing service is available also increases the amount of scientific output that an HPC system can generate. Integrators can also support research centres that need specialised clusters for domain specialists. This is seen in areas such as biomedical research, engineering, chemistry, cosmology, weather and climate simulation, and other areas of research that are heavily focused on simulation or large-scale data analysis.
For example, biomedical research centres may need access to very large storage resources to support the large data sets that they are dealing with. Engineers may need access to specialised hardware that can support large-scale simulations in order to generate insight into ongoing projects. Alternatively, HPC can also speed up the time to get new components and systems to market.
OCF supports engineering research
In a recent blog post, Andrew Dean, sales director at OCF states: ‘The need for agility is never more apparent than in manufacturing, where we are creating products today more complex than ever.’
‘Especially when taking advantage of the latest manufacturing techniques such as additive and subtractive manufacturing,’ added Dean. ‘Traditional design and manufacture methodologies, as well as the once leisurely timeframes, are simply not the way to stay ahead of the competition today.’ ‘Computer-based simulation has been well-established for decades, but we are continuing to see more smaller organisations and engineering teams realising its potential for the first time,’ states Dean. ‘Being able to take advantage of simulating tests that would be impractical or uneconomical to carry out on physical prototypes.’
When designing and validating a product using digital twins or simulation models, engineers can quickly modify designs and create new iterations of components. This can accelerate time to market or help an organisation innovate on designs before a new product is brought to market. The benefits mean that engineering simulation makes sense from a financial point of view for a wider range of end-users across various industrial sectors and organisation sizes.
‘Whilst the benefits of engineering simulation are clear, in many organisations when the use of engineering simulation becomes an established part of the product development process, innovation can quickly become reduced by IT equipment, often relying on single-user workstations, that are being used to process these simulations.’
Users can be limited in the number of simulations they can run, or simulations can simply take too long to complete. This is an area where HPC can provide a platform to support engineering research and development.
‘In simple terms, an HPC cluster combines a number of identical servers, a fast network, and some management tools to give a single pool of compute resources that can be shared across a number of users,’ states Dean. ‘Simulations can be submitted to a scheduler and run across multiple servers simultaneously, returning results quicker than could ever be possible on a single workstation.’
‘By centralising resources amongst a number of users, and using the scheduler to queue up jobs, the HPC cluster can be kept busy, so in addition to being able to deliver results quicker, can also offer much higher utilisation, and therefore simulation throughput, than an equivalent amount of compute capability spread across multiple users,’ Dean continued. An additional benefit is that users’ workstations are freed up, so they can concentrate on other engineering work rather than waiting around for jobs to finish.
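The utilisation argument can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the workload figures and engineer names are made up, each job occupies a single node (it does not model jobs that span several servers), and the scheduler is a plain first-in, first-out queue rather than any particular product. It compares the same jobs run on four isolated workstations with the same jobs queued onto a shared pool of four identical nodes.

```python
import heapq


def isolated_makespan(jobs_by_user):
    """Each user runs their own jobs back to back on their own workstation."""
    return max(sum(durations) for durations in jobs_by_user.values())


def pooled_makespan(jobs_by_user, nodes):
    """All jobs join one FIFO queue; each job takes the next node to free up."""
    queue = [d for durations in jobs_by_user.values() for d in durations]
    free_at = [0.0] * nodes  # time at which each node next becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for duration in queue:
        start = heapq.heappop(free_at)  # earliest available node
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


def utilisation(jobs_by_user, machines, makespan):
    """Fraction of available machine-hours spent doing useful work."""
    busy = sum(sum(durations) for durations in jobs_by_user.values())
    return busy / (machines * makespan)


# Hypothetical workload: hours per simulation, unevenly spread across engineers.
jobs = {
    "alice": [8, 8, 8],
    "bob": [2],
    "carol": [4, 2],
    "dave": [],
}

if __name__ == "__main__":
    alone = isolated_makespan(jobs)
    shared = pooled_makespan(jobs, nodes=4)
    print("isolated workstations:", alone, "h, utilisation",
          round(utilisation(jobs, 4, alone), 2))
    print("shared 4-node pool:   ", shared, "h, utilisation",
          round(utilisation(jobs, 4, shared), 2))
```

With this made-up workload the shared pool finishes all jobs in 8 hours at full utilisation, while the isolated workstations take 24 hours because one engineer's queue dominates and the other machines sit idle. Real schedulers add priorities, fair-share and multi-node jobs, which this sketch leaves out.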
There is a perception that the added complexity of HPC can make these systems hard to manage, or even put them beyond the reach of organisations that currently use high-spec workstations for engineering simulation. However, integrators can assist in the management, maintenance and service quality of HPC systems, which vastly reduces the burden on the organisation. This gives a research centre or academic institution the freedom to support its scientists’ use of computational resources without a large investment in staff to support the more technical aspects of running an HPC service.
‘Whilst I won’t deny there is some inherent complexity this can be mitigated against - the key here is to work with the right partners and with the right technologies – there are software products out there that make adopting these technologies much easier and specialist integrators, like OCF, that take the pain out of designing, installing and managing these systems,’ states Dean.
‘If manufacturers make HPC tightly integrated with their engineering simulation applications, it can be possible to make HPC just another tool for engineers,’ Dean continued. ‘Selecting HPC systems can be as straightforward as choosing a printer making it easier to benefit from the capabilities HPC systems gives you: high utilisation and scale. By moving away from engineering simulation being performed on discrete workstations by engineers working in isolation, a HPC system can bring these single sources together, into one big, centralised pool. Which results in each individual engineer having access to a much bigger resource,’ Dean concluded.
Dell bolsters the University of Pisa’s remote learning
HPC integrators can also help to modernise and update existing architectures to support new types of research. In the case of The University of Pisa, Dell was able to support the use of a new storage architecture that helped the staff transition to remote learning while also supporting advanced research.
Maurizio Davini, CTO, University of Pisa comments: ‘As we transitioned to remote learning, we needed reliable, scalable technology to provide our 53,000 students and faculty with quick, easy access to critical data and applications at all times, from any location. Dell EMC PowerStore is at the centre of our IT modernisation efforts because it delivers the high performance and availability needed to support leading-edge teaching and research without any downtime or data loss.’
The university uses Dell hardware to store scientific computing applications for genomics and biology, chemistry, physics and engineering. The Dell PowerStore’s NVMe-based, adaptable design delivered a 6x performance improvement on previous storage infrastructure, making applications faster and easier to access.
PowerStore’s deduplication and compression capabilities have also allowed the university to realise three times the capacity savings when compared to its previous storage infrastructure. PowerStore’s AppsON capability, which allows storage administrators to run applications directly on the storage array, has greatly simplified application and data mobility throughout the university’s four data centres spread across the city.
Jeff Boudreau, president and general manager, Infrastructure Solutions Group at Dell Technologies states: ‘Performance and availability are key for institutions like The University of Pisa that pursue critical medical and biological research. With the latest Dell Technologies storage systems, students and faculty have immediate access to the data and applications that drive meaningful research forward, even as Pisa has instituted remote learning.’
The university’s storage infrastructure also includes a Dell EMC PowerMax storage array to support VDI and remote workstations and database workloads. With PowerMax, the university has experienced five times faster data processing and 80 per cent better performance for its essential applications, allowing them to access key research and applications faster than before.
In order to support demanding AI and bare-metal HPC workloads, The University of Pisa also uses Dell EMC PowerScale all-flash storage. PowerScale’s scalability and flexibility are important as the university expects its total amount of unstructured data to double within a year.
HPC integrators are no more impervious to the march of AI than other HPC companies, so they too must adapt to support this fast-growing technology. European HPC specialist Atos recently announced it was partnering with Graphcore so that it could offer advanced AI solutions to scientists and researchers.
In a press release discussing the announcement, Fabrice Moizan, general manager and senior vice president Sales EMEAI and Asia Pacific at Graphcore, commented: ‘ThinkAI represents a massive commitment to the future of artificial intelligence by one of the world’s most trusted technology companies. For Atos to have put Graphcore as a key part of its strategy says a great deal about the maturity of our hardware and software, and the ability of our systems to deliver on customer needs.’
ThinkAI brings together Atos’ AI business consultancy expertise, provided by its experts at the Atos Center of Excellence in Advanced Computing, with its digital security capabilities and its software, such as Atos HPC Software Suites, to help organisations adopt AI technology more quickly.
Graphcore, the UK-headquartered maker of the Intelligence Processing Unit (IPU), plays a significant role in Atos’ ThinkAI offering, which is focused on the twin objectives of accelerating pure artificial intelligence applications and augmenting traditional HPC simulation with AI. Graphcore’s IPU-POD systems for scale-up data centre computing will be an integral part of ThinkAI.
Agnès Boudot, senior vice president, head of HPC and Quantum at Atos said: ‘With ThinkAI, we’re making it possible for organisations from any industry to achieve breakthroughs with AI. Graphcore’s IPU hardware and Poplar software are opening up new opportunities for innovators to explore the potential of AI for their organisations, complemented with our industry-tailored AI business consultancy, digital security capabilities and software, we’re excited to be orchestrating these cutting-edge technologies in our ThinkAI solution.’
University of Aberdeen’s HPC usage rockets by 50 per cent and continues to support pioneering research with remote compute power
Throughout the pandemic, the University of Aberdeen has quickly adapted to change with its continued commitment to, and investment in, innovative technologies. The institution’s High-Performance Computing (HPC) cluster, named Maxwell, which was designed, integrated and managed by HPC, storage, cloud and AI specialist OCF, has been instrumental to this. It has provided large amounts of remote computational processing power to ensure the development of research remains of utmost importance during this turbulent time and beyond.
Maxwell supports research at the university’s Centre for Genome-Enabled Biology and Medicine (CGEBM) and provides a centralised HPC system for the whole university, with applications in medicine, biological sciences, engineering, chemistry, maths and computing science. Researchers are using Maxwell in various schools across a wide range of disciplines and research topics – including genome sequencing and analysis, chemical pathway simulation, climate change impact assessment and financial systems modelling – and as a catalyst for interdisciplinary research in areas such as systems biology. With 20 times more storage than the university’s previous HPC system, Maxwell comprises four Lenovo servers for management, 40 further Lenovo compute nodes and a significant expansion of Nvidia GPUs. OCF is also providing a software stack and its HPC Virtual System Administrator service management to support the in-house HPC team.
Dean Phillips, assistant director, digital and information services at the University of Aberdeen, explains: ‘OCF’s HPC Virtual System Administrator service is an extension of our team and really helps to ensure the smooth day-to-day running of our HPC cluster and with dealing with support issues, user requests and keeping on top of software and security updates.
‘Throughout the pandemic, OCF helped with our infrastructure and helped us to use it in different ways to suit our needs as the world changed. OCF delivered the knowledge and expertise needed and were quick to react, in and out of hours. As a result, we have created a stronger, long-term relationship and we are a true partnership.’
The HPC service is suited to solving problems that require considerable computational power or involve huge amounts of data that would normally take weeks or even months to analyse on a desktop PC. Maxwell can provide over a thousand desktop computers’ worth of resources for days on end, completing the work a single desktop computer would take a year to do in just one day. Therefore, the HPC cluster is paramount to the continued success of research work, especially when the university was forced to take work off-campus in 2020.
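The year-to-a-day comparison can be checked with back-of-the-envelope arithmetic. The sketch below is a rough illustration, not a measurement of Maxwell: it assumes the work divides evenly across a thousand desktop-equivalents, and the `serial_fraction` parameter is a hypothetical value added here to show how a small non-parallel portion (in the sense of Amdahl's law) changes the result.

```python
def ideal_runtime_days(desktop_days, desktop_equivalents):
    """Runtime if the work divides perfectly across all resources."""
    return desktop_days / desktop_equivalents


def amdahl_runtime_days(desktop_days, desktop_equivalents, serial_fraction):
    """Runtime when a fraction of the work cannot be parallelised."""
    serial = desktop_days * serial_fraction
    parallel = desktop_days * (1 - serial_fraction) / desktop_equivalents
    return serial + parallel


if __name__ == "__main__":
    year = 365       # one desktop-year of work, in days
    resources = 1000  # "a thousand desktop computers' worth", taken from the text

    ideal = ideal_runtime_days(year, resources)
    print(f"ideal scaling: {ideal:.3f} days ({ideal * 24:.1f} hours)")

    # Hypothetical: 0.1 per cent of the work is strictly serial.
    realistic = amdahl_runtime_days(year, resources, serial_fraction=0.001)
    print(f"with 0.1% serial work: {realistic:.3f} days")
```

Under perfect scaling, a desktop-year of work takes roughly nine hours on a thousand desktop-equivalents, so "just one day" leaves some headroom for work that does not parallelise perfectly.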
With a completely remote way of working in place, researchers, staff and students alike needed constant access to Maxwell to draw on the scale of compute power needed to carry on with life-changing research projects. The university understood that support was key and went further by developing digital skills workshops for the research community.
With support from OCF, the university developed its teaching and training HPC environment, called Macleod, which supports more than 30 courses, increasing visibility and understanding of Maxwell. The sessions were well received and increased uptake of the system, as a new group of individuals wanted to understand and use it. Through working remotely, there is now a bigger active audience with an eagerness to adopt nascent technologies and adapt to new ways of working. As a result, when lockdown started the usage of Maxwell doubled, and use of the HPC cluster is still well above the pre-pandemic baseline, meaning results are delivered faster, new discoveries and game-changing products are developed, and improved times to science and market are realised.
The university had the foresight to recognise the potential of HPC and how it could positively affect the wider community and people’s everyday lives. So, as well as supporting the university, use of the HPC service has been broadened to support small business initiatives to drive much-needed economic growth and innovation in the area.
Phillips explains: ‘At Aberdeen, we are passionate about making a difference and delivering outcomes that affect real life.
‘We are very connected to the business community in Aberdeen and we work closely with the Small Business Research Initiative to engage with local start-ups that could use Maxwell to support their research. Maxwell has the ability to do algorithm work, driving AI innovation to support the NHS in Aberdeen.
‘We have a huge part to play, and these are exciting times. I come from a research background working with clinicians, so being able to support where we can have a huge impact is monumental. This is just the tip of the iceberg at what can be achieved. Maxwell’s potential is huge.’