Maximising the value of HPC
HPC integrators can help scientists and researchers to access and manage their high performance computing (HPC) resources to get the most out of their computing infrastructure. Increasingly, this includes hybrid cloud environments and managed services, which also help to support the upkeep and usability of the systems.
But determining the right level of support for each organisation requires the service provider to understand the existing stack, the user application portfolio and the level of expertise of the user community. Andy Dean, sales director at OCF, commented: ‘It depends on how mature their usage of HPC is within that specific customer. So if somebody has been using HPC for a number of years, this could be the second, third, fourth iteration of the HPC environment; they tend to have a very good understanding of the workload they’ve got.’
Dean described how mature users can fall into two further categories. Users with a specific small set of applications want to maximise the performance of those key applications. ‘In that scenario, it becomes a conversation about how do we achieve their performance requirements – if they have requirements for how quickly they want to run a model, for example,’ said Dean.
Alternatively, there could be wide and varied workloads where the customer wants to try and provide a balanced infrastructure that meets a wide number of requirements, but potentially is less specialised than a system focused on just a handful of applications. ‘Or maybe they’ve got a very varied workload,’ Dean added. ‘Then we might look at how we will build an environment to try and keep all of the users happy. In that scenario, you want to deliver a balanced environment that works for the largest number of users.’
When the customer or its user community are new to HPC, then engagement focuses on understanding the application portfolio and how the users intend to make use of the HPC resources. ‘Other potential customers are very new to HPC, or really looking to expand their environment and aren’t necessarily in a position to understand precisely what all of their users are after and what they require,’ noted Dean. ‘For those kinds of customers, it can be more like an interview-type process that’s more of a consultancy engagement, rather than a technical pre-sales engagement.’
Dean states that it is important to understand exactly what the users currently have access to and how the resources are being consumed. ‘We need to build a picture of the organisation that we can build into a report that can be used to help shape what their next system might look like. It depends on how well the customer feels they understand the requirements and then we kind of go down one of the two approaches.’
Integrators no longer just provide hardware and support services but increasingly deliver fully managed services. In the future, this may also extend to application support and optimisation. Cloud providers deliver similar hardware and managed service agreements, but it remains to be seen whether they can provide the same level of expertise in HPC-specific applications, hardware and software frameworks.
‘Our business is moving further up the stack,’ said Dean. ‘We’re initially involved in projects around deploying HPC. Now when we are deploying a system, we typically use our own OCF steel stack that’s based on a number of open source technologies. But in addition to the integration side of things, and support of that, we’re also getting a lot more involved in managed services, and helping to manage those environments.
‘I can see, as time goes on, that we are being asked more specific questions about applications, end-user management. I can imagine that’s a direction where things are going, that we’re being asked to do more.’ Dean also noted that while this is possible in some centres with a more monolithic application portfolio, many HPC centres have a large set of applications that prohibits optimisation of each one individually by their integrator partner.
‘Some users have hundreds of applications, so there’s not much point in really spending a lot of time optimising for each application; you’re trying to build something that works for everyone. It’s more on the commercial side of things; we find that users – maybe they’re running two or three engineering applications, let’s say – and we work closely with them to make sure we’re picking the right hardware initially to make sure we’re getting the best out of that application. There can be other work to help with the workflow side of things as well,’ Dean concluded.
The role of cloud computing
At a time when energy prices are soaring and sustainability is becoming an increasingly complex problem for many organisations, data centre provider atNorth is supporting scientists and researchers with its HPC, GPU and AI data centres. These are based on energy-efficient hardware and renewable energy sources with additional heat recovery, reducing the cost of delivering highly complex computing services.
The Nordic data centre company says it offers environmentally responsible, power-efficient, cost-optimised data centre hosting facilities and high-performance computing services. It describes its HPC resources as sustainable, highly scalable and fully delivered as a service, enabling scientists and researchers to focus on their applications without having to worry about the underlying HPC infrastructure. atNorth also recently announced the availability of its new GPU-as-a-service (GPUaaS) solution. This new service is aimed at scientists who want to accelerate deep learning, machine learning and HPC workloads that are suitable for large-scale use of GPUs.
The company has data centres based in Iceland and Sweden that are specifically designed and optimised for HPC and AI computing. According to atNorth, these resources are delivered through managed services that can be tailored to meet customer requirements, based on their level of experience and the type of service level agreements (SLA) they require. These managed services can be scaled up or down as necessary to provide general capacity for everyday operations and the ability to cloudburst or quickly scale up operations as needed.
Guy D’Hauwers, sales director — HPC and AI, atNorth, commented: ‘The speed at which technology innovation is moving is often incalculable, and much of this is due to digitalisation and the rise of extreme data-hungry applications to fuel the transformation. Today’s data-driven businesses are reinventing the way in which they work and recognise they need a new type of partner that can help them achieve next-generation computing power, with great connectivity and infrastructure built on high precision and sustainability.
‘Our GPUaaS solution not only multiplies the HPC and AI capacity, delivering energy and cost-efficient service, but it also operates as a full tech stack solution that does the legwork for our clients, so their data scientists, engineers, developers and researchers can focus on their increasingly important day job – from building solutions and services to gathering insights.’
D’Hauwers added: ‘Many companies are adjusting their business models to secure stakeholders’ trust and safeguard long-term profitability. Digitalisation has had a massive impact on our climate, yielding an ever-increasing demand for electricity and rising carbon emissions because of its acceleration. Many businesses rely on technology and data to drive value to their customers, whilst also recognising this can come at a cost to the environment. Therefore, many businesses are migrating their data centre footprint to atNorth’s site in Iceland. Businesses must be exploring the best possible ways to walk the talk by adopting best practices when it comes to reducing the digital footprint of their IT and operations.’
GPUs are designed for high-density workloads, such as advanced calculations for AI, natural language processing, scientific simulations and risk analysis. The nature of these applications, in addition to rising costs associated with using public cloud services and increased pressure on sustainability, has tasked many organisations with the challenge of finding new alternatives to ensure continuity with high-performance applications in a cost-effective and energy-efficient way. atNorth’s new GPUaaS will offer a much larger capacity, according to D’Hauwers. Its sites already deliver a total capacity equivalent to 125,000 A100 GPUs, with plans afoot to double this in the next 12 to 18 months.
But to sustainably grow the capacity for its HPC and AI systems, atNorth has had to be very careful about how it designs its data centres, taking advantage of the climate in the Nordics and available renewable energy sources to deliver highly efficient HPC and AI infrastructure.
‘We rely on the fantastic weather of the Nordics,’ stated D’Hauwers. ‘When it comes to data centres and HPC, the Nordics are relatively cold, and so we take that benefit and make use of it for HPC and AI. Our users benefit from this infrastructure and deliver faster science and simulation projects. And as a result, they are capable of improving the time to market and doing this in a very sustainable way, driving towards carbon neutrality where possible.
‘And, because we do everything from the ground up from the data centres, not just from the HPC side, but the entire integration of it as a fully managed service to the user, this allows the user to focus on their own business, their core business,’ added D’Hauwers.
‘We use 100 per cent renewable energy,’ stated D’Hauwers. ‘In Iceland, this comes primarily from geothermal energy. But, importantly, the goal is to use much less of it. Because “the energy you don’t use is the most sustainable”.’
Rising energy prices across Europe are making cloud an even more attractive proposition than running an in-house data centre, as users can effectively outsource their IT requirements to countries with lower energy costs. D’Hauwers added: ‘So they bring [their project] to us, and they see the delta of so much less energy used, and then, on top of that, the energy is 100 per cent renewable.
‘With the current evolution of energy prices in mainland Europe, the cost of energy has been driven up substantially, with gas prices increasing and so on. The pricing [is going] through the roof.’ But D’Hauwers was keen to stress that it is not just renewable energy sources but also atNorth’s focus on efficient systems, which can reduce overall energy consumption, that is key to the company’s vision. ‘The whole stack we deliver, the data centres are efficiently built for HPC and AI, [as well as] the way the cooling is done and the recovery of heat.
‘For example, what we do in Stockholm is recover the heat and sell it to the municipality to heat tens of thousands of houses. We constantly take whatever actions are possible to make it as sustainable as possible.
‘Of course, these AI systems and big supercomputers use a lot of energy, but if you can reduce that to the minimum, there is a huge benefit for research organisations,’ said D’Hauwers.