Accelerating the search for oil

Oil companies are turning to accelerators, such as GPUs and FPGAs, to process data and speed up simulations. Tom Wilkie reports

At the recent International Supercomputing Conference in Hamburg, Philippe Ricoux from French oil company Total noted that, in the mid-1990s, only about 40 per cent of oil deposits fulfilled the company’s expectations. More than a decade on, that figure is pushing up beyond 70 per cent. The improvement, he said, was due to better modelling and simulation using high-performance computing. If HPC-based simulation avoids Total drilling a single dry well, then it would save the company around $80M or more, he added.

Three factors affect the computational workload in the oil and gas industry. Firstly, the amount of data being acquired for analysis is growing rapidly. Seismic data sets, which represent the majority of the data being acquired, are getting bigger and bigger. In the search for oil under the seabed, exploration companies are moving from having just one ship that trails a few streamers behind it, to wide-azimuth scans where many vessels trail many streamers, each of which has many sensors. The aim is to generate more data to provide superior illumination of the subsurface.

The other side of the computational workload is that geoscientists can choose how complex a model they’re going to use. One of the most popular, ray tracing, is relatively quick nowadays, but there are other algorithms that can take weeks to run. For example, the oil in the Gulf of Mexico is ‘hidden’ below salt domes which are difficult to map and the geology of the Gulf is very complex. Thus a lot of the exploration in the Gulf of Mexico uses relatively complex anisotropic physics models employing reverse time migration, a complex, 3-stage, full-wave simulation model, to ‘look’ under the salt structures and find the oil. An added complication is that salt bends and twists itself into complicated shapes where there are useful places to look for oil.

All the easy oil has already been extracted, so companies have to look in geologically complex environments. The high price of oil makes deep-water, sub-salt environments viable – including not only the Gulf of Mexico, but also offshore Brazil and West Africa. Brazil has been able to double its estimated reserves by drilling through the salt and coming on new hydrocarbon reserves. But such complex geologies present geoscientists with problems as they try to interpret the information. There is further scope for computing in this area, to help with the sophisticated interpretation of geological structures.

Unsurprisingly, these pressures are driving companies to use accelerators, such as GPUs, and ever-larger clusters. According to Guy Gueritz, Oil and Gas business development director for Bull, companies used to buy commodity servers to run, for example, the Kirchhoff seismic migration algorithm ‘which has served its purpose well. But now they are having to use particular imaging methods for the sub-salt and this has driven changes in the HPC architecture.’ Some oil companies are investing in petaflop systems costing upwards of 30 million euros. They need ‘very large systems to derive much more accurate velocity models, so they get images that are focused and interpretable,’ Gueritz said.

In the view of Oliver Pell, vice-president of Engineering at Maxeler Technologies, both data handling and compute speed are required to meet the needs of the oil industry. The major utilisation of HPC is on the exploration side – to locate the optimum position for drilling and avoid dry wells – and this generates many dozens of terabytes of data that need to be crunched many times to produce an image of the subsurface field. Because of the costs of running a drilling rig, it is important to all oil companies to turn round a job in a few days, rather than weeks of computing time.

Because oil exploration is a ‘Big Data’ problem, much of the run-time is not actually in the computation itself but rather the burden of I/O – loading all that data in and getting the results out efficiently. He said: ‘We work on crunching down the run time, by putting parts of the application into custom dataflow hardware, and also optimising parts in software, and having the two talk to each other very rapidly. We can build a custom cluster that has a mix of dataflow compute engines and conventional CPUs, where the work passes back and forward between them.’ Although the parts of the problem that are put through the dataflow engines may not be the most computationally intense, they are the ones that use the most time – due to the data handling requirements – and the customised dataflow solution can be 10 to 100 times faster than before.

Pell continued: ‘You can build a cluster that is optimised for your workloads. Instead of just balancing network, disk, and CPUs, you add the dataflow engines as another resource that you can balance with these as well. But, on an application-by-application basis, you can create custom chip configurations that map the application into a dataflow and that execute it in hardware – and you can change these every second. So, when an application starts up, it will acquire some dataflow engines, configure them to do part of that application and then run calculations on them by streaming data through the chip. There are no instructions. So it is an efficient computational paradigm. You don’t have dependencies or have to worry about caching.’ Possibly 90 per cent of the program will be executed elsewhere, on conventional CPUs, but this may represent only one per cent of the run-time.
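The arithmetic behind that split can be sketched with Amdahl's law. This is an illustrative calculation only: it assumes that the remaining 99 per cent of the run-time is the part moved to the dataflow engines, and that the 10 to 100 times figure quoted earlier applies to that part alone. The function name is hypothetical.

```python
def overall_speedup(accelerated_fraction, factor):
    """Amdahl's law: whole-job speed-up when only part of the run-time is accelerated.

    accelerated_fraction: share of the original run-time spent in the accelerated part.
    factor: how many times faster that part becomes.
    """
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)


# Assumed split: 99% of run-time accelerated, 1% left on conventional CPUs.
for factor in (10, 100):
    print(f"part {factor}x faster -> whole job {overall_speedup(0.99, factor):.1f}x faster")
```

Under these assumptions the whole job runs roughly nine times faster when the accelerated part is 10 times faster, and roughly 50 times faster when it is 100 times faster. The one per cent left on the CPUs is what holds the second figure so far below 100.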

Maxeler recently launched its MPC-X series which puts dataflow engines on the network as a shared resource, a bit like a disk. According to Pell: ‘Any CPU can execute CPU code and at some point it can decide that it wants to execute a piece of dataflow code and it can pass that to another node. But the CPU node is always running the control of that application, so you can dynamically allocate as many dataflow nodes as you require and release them.’ The use-case is that it’s basically a coprocessor for the main application. The dataflow engines themselves map onto configurable hardware chips like FPGAs. In Pell’s view, data movement is the main problem in high-performance computing, not flops. ‘There are relatively few applications that are purely limited by the computational performance of, say, the CPU. There are more that are limited by the memory, or the interconnect bandwidth, or the storage.’

Moving data is also a focus for Bull’s Gueritz: ‘Poor I/O on the node is going to affect performance more than anything else,’ he said. Reverse time migration (RTM) needs a lot of scratch space: ‘The first process is to model the wavefields going downwards through the velocity model. The values generated for each cell at each depth have to be stored temporarily and, if there is no room in memory, then they have to be stored in scratch space – a local disk. The second stage is going back from the data received at the surface – to back propagate through the velocity model. Those two values are then correlated together to produce the final result. That means that you have an I/O issue: if there is latency on your local disk, that will affect your overall performance.’
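The three stages Gueritz describes can be sketched in a few lines. This is a toy, one-dimensional illustration written for this article, not production RTM code: the finite-difference scheme, the function names, and the single surface cell acting as both source and receiver are all assumptions. Its purpose is to show why every forward snapshot has to be kept somewhere until the backward pass reaches the same time step.

```python
def step(prev, curr, c2):
    """Advance a 1D wavefield one time step (second-order finite differences).

    c2[i] is (velocity * dt / dx) ** 2 at cell i; the two edge cells stay at zero.
    """
    nxt = [0.0] * len(curr)
    for i in range(1, len(curr) - 1):
        laplacian = curr[i - 1] - 2.0 * curr[i] + curr[i + 1]
        nxt[i] = 2.0 * curr[i] - prev[i] + c2[i] * laplacian
    return nxt


def forward(c2, source, cell):
    """Stage 1: propagate the source through the velocity model, keeping every snapshot."""
    n = len(c2)
    prev, curr = [0.0] * n, [0.0] * n
    snapshots = []
    for amplitude in source:
        nxt = step(prev, curr, c2)
        nxt[cell] += amplitude          # inject the source at the surface cell
        prev, curr = curr, nxt
        snapshots.append(curr)          # this list is the memory / scratch-space cost
    return snapshots


def rtm_image(c2, source, recorded, cell):
    """Stages 2 and 3: back-propagate the recorded trace and correlate with stage 1."""
    snapshots = forward(c2, source, cell)
    n = len(c2)
    prev, curr = [0.0] * n, [0.0] * n
    image = [0.0] * n
    for t in range(len(recorded) - 1, -1, -1):   # run the recording in reverse time
        nxt = step(prev, curr, c2)
        nxt[cell] += recorded[t]
        prev, curr = curr, nxt
        for i in range(n):                       # zero-lag correlation of the two fields
            image[i] += snapshots[t][i] * curr[i]
    return image
```

Even in this toy, the stored snapshots hold one value per cell per time step. In three dimensions, with thousands of time steps, that product is what overflows memory and spills to local disk, which is where the latency Gueritz mentions comes in.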

Given the sheer amount of data that has to be read in and the constraints of the algorithms, he said, ‘a system has to be well balanced. Bull was an early supplier of hybrid systems. That gave us experience of designing and deploying these hybrid architectures and we have also had a lot of experience with the Lustre file system and of getting higher bandwidth out of the storage units to get data in and out.’ Bull has been discussing with its customers what their through-put expectations are, and how applications would run on a GPU or other form of accelerator architecture. ‘The idea is to say: well, based on the data sample that you have given us, we can do a proof of concept and tell you what the speed-up would be.’

Gueritz did not see that porting legacy code across to hybrid architectures presented too much of a problem. ‘There has always been a need to keep programs up to date. Everyone accepts that some kind of many-core or multicore approach is needed, regardless of the specific technologies that are appearing. Bull does have a lot of parallel programming expertise in-house. Application programmers are geoscientists first and parallel programmers second. Our perspective is to get applications to perform as best they can.’

The human factor was taken up also by Vincent Natoli, chief executive of Stone Ridge Technology: ‘15 years ago it was not ridiculous to hire a geophysicist, put him in a room and get him to write code. Nowadays, it has to be parallel code and the physics is more complex.’ He too saw a requirement for two different skills – the skill to map the equations to a modern computing environment is completely different from understanding how those governing equations are derived. ‘We are the guys that fit in the middle. We love the science. We have physicists and applied mathematicians who are also passionate about the computing.’ Stone Ridge has been working as a consultant for the oil and gas industry, providing expertise in this combination of science and software optimisation, for more than seven years. It has helped its customers develop production codes – including Reverse Time Migration; Kirchhoff Time Migration; and Reservoir Simulation – for CPU and GPUs.

In Natoli’s view, the oil industry has ‘a queue of stuff waiting till the hardware is ready. They already know what they want to do at exascale’. And it is clear to Natoli that in HPC ‘parallelism is going to be everything. It’s not going to get easier to map to the highly parallel computational architectures. There is going to be an increasing division of labour between the domain scientists and the people working closely with them to map it to parallel compute architectures. It’s an exciting field,’ he said.

For Philip Neri, vice president of Marketing and Product Strategy at Terraspark, the focus is also on the human factor. But his interest is in the final stages – the interpretation – after most of the highly-intensive computing has been done. ‘Where the pinch comes is with the geologist or geophysicist, who has to interpret this information on an ever tighter schedule,’ he said. With exploration licences granted only for about three years, this leaves little time to get the boat and other survey equipment into place and then do the drilling to validate the results. ‘The man in the middle is this interpreter, who has the avalanche of data and is often working on a deskside computer. By the nature of this industry, there is a need for more accurate work on more complex geology. There has not been an increase in compute power for this relatively small group of geoscientists. These people are pressured by the drilling schedule. There is more stress in that critical link than there is at the processing end.’

Most of the software is geared to the visualisation of the data, he continued. There are good data cards and expensive displays. The software offers ‘lots of possibilities to colour data and zoom it, but the computer is not being used as a computer to calculate things. What is lacking is help for the geologists to gain more insight quicker into the data rather than looking at it with their eyes and trying to infer insight into the geological model.’ Terraspark has been concentrating, therefore, on accelerating the interpretation process and relieving the operator of a lot of menial tasks to free up time for analysis and comparison.

Their approach is to divide the workflow into four steps: ‘We try and remove areas of no data from the interpretation – salt bodies or areas of no reflection. We have automated methods to “shrinkwrap” a salt dome very quickly to take it out of any further activity. Secondly, our signature product automatically extracts all the faults.’ The software enhances the discontinuities within the data column so that the process of building fault planes is easier. ‘It now takes a couple of days of work to perform what would previously have taken a month or two,’ he said.

The next step is to identify the major horizons or geological levels. The final stage – which Terraspark has patented – is to transform the data into its paleo-depositional environment. ‘It’s very hard to see geological features when everything has been disrupted by the history of tectonic movement since they were deposited. We basically unfold everything and bring it back to the status it was in when it was deposited. This is very compute intensive; removing hundreds of millions of years of folding and mountain creation.’

Neri does not see a need for expensive compute clusters at this stage but looks to GPUs to provide the computational power. It means of course that the code has to be ported to GPUs but ‘we did not find the rewrite to be too onerous. We are making the compute-intensive parts run on the GPU, not the whole code.’ He reports that the shift can accelerate performance by a factor of 10 to 15, generally.

In oil and gas, GPUs are here to stay, according to Stone Ridge’s Natoli: ‘It is remarkable what Nvidia has done in just two or three years – they have made themselves an essential part of the compute technology in a major industry.’

Unblocking the connections

No matter how sophisticated your applications, they are useless if your data is not in the appropriate form. So David Butler of Limit Point Systems has been looking at data exchange methods, leading to the development of the company’s Data Sheaf System.

He cites the problem of how to make different numerical representations on different meshes interoperate, for example moving properties – porosity, density, velocity, temperature – from one mesh to another. A user may start with a geological structure mesh and want to feed into a geomechanics mesh, but these can use different representations and different boundaries. In the current workflow, such ‘data munging’ is not an automatic process but takes a good deal of work on the user’s part. In the past, typical implementations have been on workstations, so the task of parallelising them – converting that data munging application – is big and complex.
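To make the mesh-to-mesh problem concrete, here is a minimal sketch of the simplest possible case: a cell-averaged property moved between two one-dimensional meshes whose cell boundaries do not line up. It is an illustration written for this article, not a description of how the Data Sheaf System works. The function name and the overlap-weighted averaging rule are assumptions.

```python
def remap(src_edges, src_values, dst_edges):
    """Move cell-averaged values from one 1D mesh to another by overlap-weighted averaging.

    src_edges / dst_edges: ascending cell boundaries (one more entry than cells).
    src_values: one value per source cell, for example porosity.
    Returns one value per destination cell, or None where no source cell overlaps.
    """
    src_cells = list(zip(src_edges, src_edges[1:]))
    result = []
    for lo, hi in zip(dst_edges, dst_edges[1:]):
        weighted_sum = 0.0
        covered = 0.0
        for (a, b), value in zip(src_cells, src_values):
            overlap = min(hi, b) - max(lo, a)   # length shared by the two cells
            if overlap > 0:
                weighted_sum += value * overlap
                covered += overlap
        result.append(weighted_sum / covered if covered > 0 else None)
    return result


# Three source cells of 10 m each, remapped onto two cells of 15 m each.
porosity = remap([0, 10, 20, 30], [0.10, 0.20, 0.30], [0, 15, 30])
```

Each destination cell is computed independently of the others, so even this toy version has obvious parallelism. Real meshes are three-dimensional and unstructured, and may carry values on nodes rather than cells, which is where the work Butler describes comes from.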

Lots of man years have gone into parallelising the kernels – the core elements of a program – he explained, but less attention has been given to the data munging problem to enable the data to be used as input. But an application can exploit that parallelism only because there is parallelism somewhere in the problem. ‘Our mathematical datamodel allows us to express the parallelism and we can map that onto the hardware,’ he said.

‘The culture of HPC is that the heroes are the guys that write the central kernels,’ he added, and ‘their emphasis is on the computer performance. By and large they get the glory. The second order connection problems – such as moving data from one application to another – are a bit like a sewage system: no one wants to think about them until they back up. We see an opportunity there for our data management technology.’