FEATURE

Cooling becomes a hot topic

Concerns about energy-efficient computing are driving innovation in cooling technologies, as Robert Roe discovers

At ISC’14 in Leipzig at the end of June this year, more than 20 companies were demonstrating different variations of cooling technology. It has become a competitive market and manufacturers of cooling technology must innovate to stay relevant. Some have chosen to go down the path of novel technology, whereas others favour the clever, iterative improvement of existing designs.

For Geoff Lyon, CEO and CTO at CoolIT Systems, flexibility is a key focus, due to the diversity of HPC systems. Lyon said: ‘We have a broad range of well-developed products that we can mix and match to create a custom solution for the customer, by utilising the building blocks that we have developed.’

He pointed out that: ‘We have been working, since our inception, on direct-contact liquid cooling. That is, in essence, utilising a micro-channel water block right down onto the source of the heat.’ The company uses a modular rack-based solution, designed to be as flexible as possible, allowing it to be used in any standard rack and with any type of server.

Moreover, as Lyon explained, not only can CoolIT’s solution be adapted to fit the majority of deployments but it can also make use of either hot or cold water. ‘What we have attempted to do is just maximise our efficiency. We minimised the ΔT [temperature change] from the cooling liquid to the processor temperature because the processors are often happy all the way up to 70 degrees Celsius or even higher. If we have a small difference between the liquid temperature and processor temperature it allows us the flexibility of using warm water as the cooling fluid.’
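To make Lyon's point concrete, the short Python sketch below estimates how warm the coolant can be while still keeping a processor at or below its limit. It is an illustration only, not CoolIT's design method: the 150W processor power and the 0.10°C per watt thermal resistance of the cold plate are hypothetical figures chosen for the example, and only the 70 degrees Celsius limit comes from the text above.

```python
def max_coolant_temp_c(cpu_limit_c, cpu_power_w, thermal_resistance_c_per_w):
    """Warmest coolant that keeps the processor at or below its limit.

    The temperature rise from liquid to processor (the delta-T) is modelled
    as power multiplied by a lumped thermal resistance. This is a simplifying
    assumption; real cold plates are characterised by measurement.
    """
    delta_t = cpu_power_w * thermal_resistance_c_per_w
    return cpu_limit_c - delta_t


# Hypothetical example values: a 150 W processor and two cold plates.
efficient_plate = max_coolant_temp_c(70.0, 150.0, 0.10)
poorer_plate = max_coolant_temp_c(70.0, 150.0, 0.30)

print(f"Efficient cold plate: coolant can be up to {efficient_plate:.1f} C")
print(f"Poorer cold plate:    coolant can be up to {poorer_plate:.1f} C")
```

The smaller the ΔT for a given heat load, the warmer the water that can still do the job, which is what makes warm-water cooling possible.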

Calyos, a company based in Belgium, also uses direct-contact fluid cooling but has taken a different path of technological development. Its passive two-phase cooling technology, now being offered to the HPC market, was originally designed for power electronics, such as embedded or ground-fixed power converters.

Latent heat

The system makes use of the latent heat of vaporisation, with a passive capillary pump, to remove high-density heat from the CPU, GPU, or accelerator cards in any HPC servers. The use of two-phase technology inside a passive capillary pump means that the heat can be removed from the components without any need for external energy to pump the fluid inside the server.

Maxime Vuckovic, business developer at Calyos, said: ‘It starts with liquid (100 per cent) at the entry of the evaporator. Once pumped by capillary forces inside it, the liquid will vaporise (100 per cent vapour) while absorbing a lot of heat. Thanks to the latent heat of vaporisation, you have nearly no limits. If 400W CPUs did exist, we could cool them easily.’

Calyos’ solutions can use methanol as the cooling fluid, but the design also allows for the use of a refrigerant fluid.
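A rough calculation shows why latent heat is so attractive. The Python sketch below compares the mass of fluid that must pass a 400W heat source each second when the fluid fully vaporises with the mass needed when a liquid is merely warmed. The property values are approximate textbook figures supplied as assumptions (about 1,100kJ/kg for the latent heat of methanol near its boiling point, and 4,186J/kg per kelvin for the specific heat of water), and the 10K temperature rise is hypothetical; none of these numbers comes from Calyos.

```python
# Approximate fluid properties, supplied as assumptions for illustration.
METHANOL_LATENT_HEAT_J_PER_KG = 1.1e6      # latent heat of vaporisation
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0    # sensible heat capacity


def two_phase_mass_flow_kg_s(heat_w, latent_heat_j_per_kg):
    """Mass flow needed if all of the heat goes into vaporising the fluid."""
    return heat_w / latent_heat_j_per_kg


def single_phase_mass_flow_kg_s(heat_w, specific_heat_j_per_kg_k, temp_rise_k):
    """Mass flow needed if the liquid only warms up by temp_rise_k."""
    return heat_w / (specific_heat_j_per_kg_k * temp_rise_k)


heat_w = 400.0  # the hypothetical 400 W CPU mentioned in the quote
vapour = two_phase_mass_flow_kg_s(heat_w, METHANOL_LATENT_HEAT_J_PER_KG)
liquid = single_phase_mass_flow_kg_s(heat_w, WATER_SPECIFIC_HEAT_J_PER_KG_K, 10.0)

print(f"Two-phase methanol:   {vapour * 1000:.2f} g/s")
print(f"Single-phase water:   {liquid * 1000:.2f} g/s (10 K rise)")
```

Under these assumptions the two-phase loop moves well under a gram of fluid per second, a flow small enough for capillary forces to sustain without a mechanical pump.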

Vuckovic explained that the solution from Calyos is simple and reduces data centre costs by using a passive system. The solution itself consists of only four main parts: a cooling block over the CPU; the piping that allows the liquid and vapour to reach the condenser; the condenser itself; and a reservoir that is integrated inside the evaporator.

Vuckovic concluded: ‘The advantage is that we are using vaporisation. The fluid inside the system enters in liquid form; it will be sucked through the capillary pump. Once it has been sucked by the capillary pump it will fully vaporise.’

Vuckovic stressed that the passivity of the system means there is no need for pumps to move the liquid or vapour around the cooling system: ‘This means that you are able to extract the heat outside of the server without spending a single watt of power.’ He pointed out that although ‘nowadays a pump is very reliable so you can see guarantees of around 50,000 hours of lifetime, it still consumes power and it increases the chance that a system may break down.’

Take the server for a swim

In contrast to these approaches of taking the coolant to the server and processors, a radically different strategy for cooling is to take the server blades to the coolant. Green Revolution Cooling (GRC) has chosen this approach to cooling dense HPC systems. Its main product, called CarnotJet, comes in both containerised and standard forms and is a total fluid submersion cooling solution for data centre servers.

Servers are installed in special racks that contain a dielectric fluid, called ElectroSafe. The containerised version uses the same technology but pre-installed in a shipping container, which provides a very quick installation process.

Brandon Moore, sales engineer at GRC, said: ‘We have one product that is housed in an ISO shipping container, either 20, 40, or 53 feet; we also have the standard CarnotJet product by itself, that goes into pretty much any data centre. It can be installed anywhere. All that is required is a flat level surface, power, communications, a roof, and a water loop. Walls are optional.’

Moore continued: ‘Our racks resemble a standard 42U rack turned on its side, although we can do bigger sizes than that: we can do 42, 48, 55 and 60U tanks; and we can make the racks any width as well. We have actually done this for a couple of projects where we make a tank that doesn’t stick to the standard U format. We did this for the Vienna Institute of Technology – the VSC-3 project. In that project, the racks were 30 inches wide with servers stacked two by two.’

The VSC-3 system is the third iteration of the HPC cluster housed at the Vienna Institute of Technology. Eight Austrian universities will share the computing resources provided by the VSC-3 cluster, which has been heralded as the first ‘skinless supercomputer’. This involved removing the chassis and unnecessary metal parts.

Cutting construction and plumbing

Removing unnecessary parts such as fans reduces power consumption at the server level by about 10 to 20 per cent (for air-cooling down to approximately 5 per cent), which yields a 50 per cent drop in energy consumption for the entire facility. This approach also reduces capital expenditure for the infrastructure and installation costs due to the elimination of traditional cooling equipment and its associated infrastructure.

Moore said: ‘Let’s say you are building a new data centre, we eliminate the need for chillers, computer room air conditioners, raised floors – it really helps to simplify the build. Some customers have reported savings of up to 60 per cent on greenfield installations.’

Moore continued: ‘The infrastructure savings are immediate and usually greater than the cost of our system as a whole, and the energy savings are just an added bonus starting from day one.’

The containerised solutions can be extremely effective for greenfield deployments. However, the converse is also true: existing data centres with a more traditional infrastructure, such as raised floors, would benefit more from a solution that could be tailored to a more traditional HPC data centre environment.

Moore said: ‘All I need outside of my system [GRC] is an evaporative cooling tower and a water loop. Evaporative cooling towers are extremely cost effective, very reliable, and very simple; we can deploy these things very rapidly.’

Moore continued: ‘It really simplifies the amount of contractors that you need on a build as well, which in turn reduces the labour costs needed to set up our system.’

CoolIT has developed its own solution to ease the burden on data centre operators deploying new racks of servers. Lyon said: ‘The trouble is, in the computing industry, there are a lot of people that are in a real hurry. Getting plumbing organised sometimes takes time. You have got an IT group that are anxious to have their new equipment installed, but then you also have to get the facility organised to get lines for the liquid cooling brought into the data centre. The timetables don’t always match up.

‘So we ended up with something that we actually developed for our own use in the lab, or for specialised applications, which was the air heat-exchanger.

‘This uses the same manifold and the same server modules that go down inside the server but, instead of going to a liquid-liquid heat exchanger, we developed a large form-factor liquid-air heat exchanger, with accompanying fan and pumping system.’

Lyon concluded: ‘Regardless of the eventual heat dissipation strategy, the manifold and the server modules are always the same. We designed it that way, not only so it would be easier to manufacture and keep on top of for the various different flavours of servers, but also so the customer can upgrade at a later date.’

Heat transfer and efficiency

CoolIT has maximised the efficiency of its design with the use of a patented split-flow technology, which dates back to 2007. It was created as a result of CoolIT’s R&D efforts to improve liquid cooling performance while reducing pumping power requirements.

Lyon said: ‘We developed it very specifically to be a low-profile split flow; some of the design that was required to achieve that was where the innovation was.

‘The highest efficiency of heat exchanging can be delivered through the use of increasingly dense micro-channels. The micro-channels that we use today are under a hundred microns.’

Lyon explained that the design maximises heat exchanging efficiency of the cold plate, while helping to minimise the pressure drop as the liquid passes through. Lyon said: ‘You have to have adequate liquid flow going through the system in order to be effective at gathering all of that high-density heat from the processor. But at the same time, we want to minimise the pumping power that is required to attain the correct amount of flow.’

For Calyos’ Vuckovic, maximising the heat transfer coefficient is the key to efficiency. He said: ‘The heat that you can transfer is directly correlated to the heat exchange surface, so the bigger the radiator the more heat you can transfer.

‘It is also linked to the ΔT [temperature change] between the CPU and the cold source: the larger it is, the more heat you can transfer.

‘The heat transfer coefficient is dependent on the technology itself. A larger heat transfer coefficient means that you can obtain more performance from the whole system.’

Vuckovic explained that this is one of the main advantages of Calyos’ technology: ‘Since heat exchange surface and ΔT are mostly the same for every cooling technology, the only way to increase performance is to increase the heat transfer coefficient. Vaporisation is the key solution to achieve it. This means you are able to make a system more powerful, more compact, more reliable, or you can work with a lower ΔT to transfer the heat.’
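The relationship Vuckovic describes is the standard convective heat transfer relation, Q = h × A × ΔT, where Q is the heat moved, h the heat transfer coefficient, A the heat exchange surface and ΔT the temperature difference. The Python sketch below works through it with purely hypothetical values for h and A, chosen only to show the trend; they are not measurements of any vendor's product.

```python
def heat_transferred_w(h_w_per_m2_k, area_m2, delta_t_k):
    """Heat moved across a surface: Q = h * A * delta-T."""
    return h_w_per_m2_k * area_m2 * delta_t_k


def required_delta_t_k(heat_w, h_w_per_m2_k, area_m2):
    """Temperature difference needed to move heat_w: delta-T = Q / (h * A)."""
    return heat_w / (h_w_per_m2_k * area_m2)


# Hypothetical figures: the same 200 W load and the same 0.002 m^2 surface,
# with a lower and a higher heat transfer coefficient.
heat_w = 200.0
area_m2 = 0.002
for label, h in [("lower h", 5_000.0), ("higher h", 20_000.0)]:
    dt = required_delta_t_k(heat_w, h, area_m2)
    print(f"{label}: h = {h:.0f} W/m2K needs a delta-T of {dt:.1f} K")
```

With surface area and heat load held fixed, quadrupling the coefficient cuts the required ΔT to a quarter, which is the trade-off described in the quote: more power, a smaller exchanger, or a lower ΔT.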

GRC provided the cooling technology for the TSUBAME-KFC, which occupied first place in the most recent Green500 list, published in June 2014. The machine, installed in the Tokyo Institute of Technology, is the only system to break the 4,000 megaflops per watt barrier.

Brandon Moore highlighted the efficiency of GRC technology: ‘We see a mechanical cooling overhead of around three per cent. If you have 100 kilowatts of IT load, then our system will need about 3 kilowatts to cool that.’
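Moore's arithmetic can be sketched in a few lines of Python. The function names are illustrative, and the ratio computed here counts cooling power only; a full PUE figure would also include power distribution losses, lighting and other facility loads, which the quote does not cover.

```python
def cooling_power_kw(it_load_kw, overhead_fraction=0.03):
    """Power drawn by the cooling system for a given IT load."""
    return it_load_kw * overhead_fraction


def cooling_only_power_ratio(it_load_kw, overhead_fraction=0.03):
    """(IT load + cooling power) / IT load.

    This is a partial, cooling-only ratio, not a full PUE.
    """
    total_kw = it_load_kw + cooling_power_kw(it_load_kw, overhead_fraction)
    return total_kw / it_load_kw


it_load_kw = 100.0  # the example load from the quote
print(f"Cooling power: {cooling_power_kw(it_load_kw):.1f} kW")
print(f"Cooling-only ratio: {cooling_only_power_ratio(it_load_kw):.2f}")
```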

Moore said: ‘The densities that we can achieve are much higher than in an air-cooled environment, above 100 kilowatts in a single rack. We can build you a megawatt data centre in just 10 racks.

‘You can pack the hardware in as dense as you like. We have never had a customer that is able to reach the upper limits of what the system is capable of in a single rack.’
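The density claim is simple division, sketched below in Python. The 100kW per rack figure is taken from the quote; the 20kW air-cooled rack used for comparison is a hypothetical value included only to give a sense of scale.

```python
import math


def racks_needed(total_load_kw, load_per_rack_kw):
    """Smallest whole number of racks that can host the total IT load."""
    return math.ceil(total_load_kw / load_per_rack_kw)


total_load_kw = 1000.0  # a one-megawatt data centre

print("Immersion, 100 kW per rack:", racks_needed(total_load_kw, 100.0))
# Hypothetical air-cooled density, for comparison only.
print("Air-cooled, 20 kW per rack:", racks_needed(total_load_kw, 20.0))
```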

The future is cool

The market is competitive and is fostering innovation, as these examples show. But the companies see a secure future for their technologies and believe that, as the issue of energy-efficient computing continues to grow in importance, the demand for the services and solutions that they offer can only grow too. Lyon said: ‘The activity level and the momentum that we have seen in the liquid cooling space are unlike anything that we have ever seen before. We are utterly blown away by the activity that we have seen.

‘Given the fervour and excitement around liquid cooling, I expect that we will see a lot more of it in the years to come.’

Vuckovic was similarly optimistic about the role of Calyos in HPC cooling. Vuckovic said: ‘Liquid cooling has a bright future, not only to reduce the PUE [power usage effectiveness] by delivering better cooling solutions, but they are still forecasting a growth in the heat load of the CPU. This means that future HPC infrastructure will have higher heat densities.

‘I think that we are going to enter into a market that requires very high performance solutions and we can provide them.’
