Exascale energy production
The Barcelona Supercomputing Center is developing new methods for exascale programming and using this research to further develop critical applications for energy production.
BSC is the national supercomputing centre in Spain. The centre specialises in HPC and manages MareNostrum, one of the most powerful supercomputers in Europe, located in the Torre Girona chapel. In addition to managing and provisioning HPC resources, BSC also provides services to the international scientific community and to industry partners that require HPC resources.
Since its establishment in 2005, BSC has played an active role in fostering HPC in Spain and Europe as an essential tool for international competitiveness in science and engineering. The centre manages the Red Española de Supercomputación (RES), and is a hosting member of the Partnership for Advanced Computing in Europe (PRACE) initiative. The centre actively participates in the main European HPC initiatives, in close cooperation with other European supercomputing centres.
The MareNostrum supercomputer has gone through four main incarnations since it was first installed in 2004. The initial system delivered around 42.35 teraflops, which was upgraded in 2006 to 94.21 teraflops. With a further upgrade in 2012-2013, MareNostrum achieved a peak performance of 1.1 petaflops.
MareNostrum 4 began operating at the end of June 2017 and, when fully installed, will have a peak performance of 13.7 petaflops. Its calculation capacity is distributed across two blocks, one focused on general-purpose computing and the second providing access to ‘emerging technologies’.
The general-purpose block has 48 racks with 3,456 nodes. Each node has two Intel Xeon Platinum chips, each with 24 cores, giving a total of 165,888 cores and 390 terabytes of memory.
Its peak performance is 11.15 petaflops, around ten times that of MareNostrum 3, yet the system’s energy consumption has increased by only around 30 per cent, to approximately 1.3 MW.
The second element of MareNostrum 4 is formed of clusters of three different emerging technologies. These are technologies currently being developed in the US and Japan to accelerate the arrival of the new generation of pre-exascale supercomputers.
One cluster consists of IBM POWER9 processors and NVIDIA Volta GPUs, the same components that IBM and Nvidia will use for the Summit and Sierra supercomputers that the US Department of Energy has commissioned for the Oak Ridge and Lawrence Livermore National Laboratories. Its computing power is around 1.5 petaflops.
Another cluster is made up of AMD Rome processors and AMD Radeon Instinct MI50 GPUs, a processor and accelerator similar to those of the Frontier supercomputer to be installed in 2021 at ORNL. Its computing power will be 0.52 petaflops.
A third cluster is formed of 64-bit ARMv8 processors in a prototype machine, using technologies from the Japanese Post-K supercomputer. Its computing power is around 0.65 petaflops.
This combination of technologies puts BSC in a unique position where it can explore and develop programming and software frameworks for many of the current state-of-the-art HPC technology platforms installed around the world today.
Enriching energy research
There are many different research projects across the primary fields of research at BSC. One example that encompasses both the innovation in HPC software and its work with real-world data-intensive applications can be found in the Enerxico project.
This is a large collaboration of partners from research institutes and industry across Europe and Mexico.
Enerxico is a two-year project exploring the development of exascale applications for energy production, driving home the message that there is a strong need for exascale HPC and data-intensive algorithms in the energy industry.
BSC director of computer applications in science and engineering and European Enerxico project coordinator, Dr Jose Maria Cela, explained the idea behind the project: ‘We started one year ago in 2019 and the project will end in one year around July 2021. The main objective of the project is to develop some applications in different areas that are all related to the energy industry. Specifically, we are working in oil and gas, wind energy and combustion.’
‘We take critical applications in these three areas and we ensure that these applications can make use of exaflop computing systems that will be coming in the next few years,’ added Cela.
The project is split into three main areas of study based on these energy applications: wind energy, biofuels, and oil and gas simulation.
‘In the area of combustion we are simulating turbulence of combustion in engines and we are simulating the performance of the combustion when we are using biofuels instead of fuels derived from hydrocarbons.
This implies simulation of turbulent fluid dynamics in addition to all of the chemistry of the combustion. Combustion is one of the most difficult phenomena that has a high degree of complexity in simulation. Today we are not able to simulate the full physics of combustion. We cannot simulate completely the chemistry in combustion so we need to approximate the chemistry with some simplified models,’ explained Cela.
The use of exascale computing and high-fidelity simulations represents an important step towards a fuller understanding of the combustion process and emissions characteristics of these new fuels, and towards industrial guidelines for engine operation and maintenance.
Similar work is also going on in the other areas of the project. The goals for wind energy production focus on accurate wind resource assessment, farm design and short-term micro-scale wind simulations to forecast daily power production. In offshore wind farms, additional engineering problems related to mooring and anchorage mechanisms further increase the need for HPC resources. In the oil and gas sector, the efficient exploration, production, transport and refining of oil and gas requires massive data simulations using exascale computers. Exploration data processing requires fine meshes to capture small geological features in areas of several square kilometres. Exploitation relies heavily on hydrocarbon flow forecasts at reservoir scale, which need robust simulations. Finally, the refining industry will benefit from simulated experiments that can substitute for expensive laboratory exploration of new catalysts.
‘So what are we doing in the project? We are adapting approximations in the chemistry models in order to reduce, as much as possible, the error in the simulation and how it compares to physical reality,’ stated Cela. ‘This requires new methodologies to simulate the combustion and new programming of these algorithms in order to exploit the capacity of modern processors.’
‘The main characteristic of future computing systems is that the amount of parallelism is massive. An exaflop supercomputer will have hundreds of millions of cores,’ said Cela. The added complexity of all this parallelism is that scientists and researchers must reconsider how they develop codes and algorithms to expose enough parallelism to optimise performance. This can also require rewriting codes or adopting new methodologies for communication within and across nodes.
Cela continued: ‘Normally algorithms are designed by people who are not really thinking about the parallelism; they are thinking about some mathematical details or some tricks that make the algorithm easier to understand, but they do not think explicitly about how to expose the parallelism.’
‘Exploiting the parallelism in normal applications with moderate parallelism is not too difficult but when we are talking about millions of potential parallel entities, which need to communicate synchronously, this is quite complex and requires some forms of new programming methodologies,’ added Cela.
One method that Cela and his colleagues on the Enerxico project have developed is a runtime software component that enables them to remove synchronous communications to further enable parallelisation and increase performance.
‘Normally when we write parallel programs we use methods that impose artificial synchronisation. A very simple example is a loop that adds two vectors.
‘Each addition of each element of the vectors could be done in parallel, but you put a barrier at the end: until all the threads are finished, execution cannot continue past this point. It is an artificial barrier that we impose due to the way that we program.
‘This is the primary change in the software. We are moving from algorithms that have artificial synchronisation points to algorithms that have none, with further optimisations for data locality and so on,’ added Cela.
‘In order to remove these barriers, you need a mechanism at runtime that answers the question “are all my data dependencies done?” This is done automatically by a piece of software that is included with the new language. With the normal languages used in the past this runtime software does not exist, so you cannot express this asynchronous parallelism.’
‘We are not only making a change in the way that we program the codes, we are also making improvements related to the functionality of the codes. With the algorithms that we use we can approximate or simulate specific physics,’ Cela concluded.