Digging for black gold
As the earth’s natural resources continue to be plundered, there are fears that the pace at which oil and gas can be extracted will soon slow. Those fears have been with us for more than a decade – but, thanks in part to HPC, the pace of extraction has yet to slow at all.
Dr Raymond McGarry, seismic research team lead at Acceleware, says: ‘HPC is absolutely indispensable within the oil and gas industry today, particularly on the upstream side, with compute- and data-intensive applications such as seismic processing and reservoir simulation.’
Supermicro’s Tau Leng believes that customers in oil and gas are major drivers in HPC development. ‘Money talks,’ says Leng. ‘Oil and gas has the highest refresh rate across all of the sectors that HPC addresses. Every year, they keep upgrading the technology, and the lifecycle of a server might only be three or four years. HPC is playing a part in maintaining the pace of oil and gas extraction by helping to find resources quicker.’
Darren Foltinek, RTM product manager at Acceleware, adds: ‘The entire history of geophysics has been a matter of approximating the physics so that a useful image can be produced on the computers of the day. As computing power has grown, these physics approximations have become more and more accurate, but there are still vastly simplifying assumptions being made in order to keep compute costs reasonable.’
McGarry adds: ‘The impact of modern HPC solutions hasn’t simply been about making the same old applications run faster. The greater impact is in making completely new things possible – which is timely, given the advanced state of exploitation of “easily-accessible” hydrocarbon deposits. New oil and gas discoveries are increasingly confined to regions with relatively complex geology that require more advanced imaging techniques – for example, where hydrocarbons are contained under a body of salt, as is the case in much of the Gulf of Mexico, offshore West Africa and Brazil.
‘Imaging below such salt bodies requires computational solutions that adhere much more to the fundamental physics of seismic wave propagation than the older techniques in which gross over-simplifications were required to make imaging possible at all on the computational infrastructure of the day. A good example of a current state-of-the-art imaging method is reverse time migration (RTM). To image a single seismic shot (of which there may be tens of thousands within a survey), RTM requires two or three full simulations of the propagation of a seismic wavefield through the Earth. This is a huge computational problem that simply would not be possible without modern HPC technology.
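The two wavefield simulations per shot that McGarry describes, plus the imaging step that combines them, can be sketched in miniature. The following is an illustrative 1D acoustic toy, not production RTM: the grid size, velocity model, source wavelet and single source-receiver pair are all assumptions, and real implementations work in 3D across a cluster. It does, however, show the structure of the method: forward-propagate the source, back-propagate the recorded data, and cross-correlate the two wavefields at every image point.

```python
# Toy 1D acoustic reverse time migration (RTM) sketch. All sizes are
# illustrative; production RTM is 3D and runs one such job per shot.
import numpy as np

nx, nt = 200, 500              # grid points, time steps (assumed)
dx, dt = 10.0, 0.001           # spacing (m), time step (s) (assumed)
v = np.full(nx, 2000.0)        # velocity model (m/s), with a
v[120:] = 2500.0               # faster layer acting as a reflector
c2 = (v * dt / dx) ** 2        # squared Courant number per cell

def ricker(t, f=25.0):
    """Ricker wavelet, a common synthetic seismic source."""
    a = (np.pi * f * (t - 1.0 / f)) ** 2
    return (1 - 2 * a) * np.exp(-a)

def propagate(inject):
    """Second-order finite-difference solve of the 1D wave equation,
    storing a snapshot of the wavefield at every time step."""
    p_prev, p = np.zeros(nx), np.zeros(nx)
    snaps = []
    for it in range(nt):
        lap = np.zeros(nx)
        lap[1:-1] = p[2:] - 2 * p[1:-1] + p[:-2]
        p_next = 2 * p - p_prev + c2 * lap
        inject(p_next, it)     # add source (or receiver data) energy
        p_prev, p = p, p_next
        snaps.append(p.copy())
    return snaps

src_pos, rec_pos = 20, 30      # assumed source and receiver cells

def inject_source(p, it):
    p[src_pos] += ricker(it * dt)

# Simulation 1: forward-model the source wavefield, recording a trace.
fwd = propagate(inject_source)
trace = np.array([s[rec_pos] for s in fwd])

def inject_receiver(p, it):
    p[rec_pos] += trace[nt - 1 - it]   # inject recorded data time-reversed

# Simulation 2: back-propagate the recorded data.
bwd = propagate(inject_receiver)

# Imaging condition: zero-lag cross-correlation of the forward wavefield
# with the backward wavefield at the same physical time, at every cell.
image = sum(f * b for f, b in zip(fwd, reversed(bwd)))
print(image.shape)             # one image value per grid cell
```

Note that both full sets of snapshots (`fwd` and `bwd`) must coexist for the imaging condition, which is exactly the data-management burden described in the next paragraph.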
‘Apart from the extreme computational burden, the wide azimuth nature of the seismic surveys required to adequately image around and beneath salt bodies leads to huge data volumes, easily running into many terabytes. Simply dealing with this volume of data is a challenge in itself. RTM adds an additional level of complexity to the data management problem by requiring that the data volume produced during one wavefield simulation be available in another.
‘Meeting the computational needs of current imaging projects requires specialist teams with expertise in many different areas, including geophysics and mathematics, as well as a deep understanding of how to squeeze maximum performance out of hardware that may have to be tailored to suit particular applications. The Acceleware model for meeting these needs is to provide a library of functions which bridge the gap between the massively multi-core hardware and the top-level application. By using our library functions, our clients can immediately take advantage of the very latest hardware developments while concentrating their own efforts on the high-level application in which they are ultimately interested. An appealing benefit to our clients is that they can also consider their applications as being future-proofed against changing hardware trends.
‘RTM simply would not be practical without modern HPC. However, by implementing our solution on heterogeneous clusters based on GPUs and multi-core CPUs, Acceleware has made the technology available to a wide user-base, rather than it being the preserve of the major oil and gas or service companies. In this sense, the newer compute architectures are levelling the playing field by dramatically reducing the cost of HPC.’
Global Geophysical is primarily a seismic acquisition company. Global’s Bill Menger explains: ‘We send out trucks and boats around the world to vibrate the earth or sea and listen to the echoes. We do a lot of speculative work, whereby we will scan a whole area, and then sell the data to whoever is interested. We call it speculative, because we don’t know whether anyone will buy it. It’s quite expensive to do that so, in order to make the data more attractive, we process it and actually show some results with full 3D images. It’s this processing of data that requires HPC.
‘We collect two classes of data: microseismic and active seismic. Active seismic is where we use field crews to vibrate the earth and collect the results; microseismic involves placing an array of sensors around a well where fracking is taking place, and listening for tiny cracks that occur in the rock and cause little pops. The method provides an accurate map of the rock and cavities.’
The processing of data collected in this way makes HPC a sensible option. ‘It is common practice in this industry to have clusters of between 100 and 500 nodes,’ continues Menger. ‘The algorithms we use are well suited to being split out among the nodes. There are a number of different major algorithms that need a cluster, one of which is called Kirchhoff prestack depth migration. This tracks the ray paths of the sound waves from the sources back to the sensors, and the migration process collapses all of these to create a focused image. These jobs will run on between 60 and 600 execution threads across any number of servers, using MPI to communicate between those servers and threads. A typical job might run from one to five days.
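The split-and-reduce pattern Menger describes can be sketched with Python’s standard library: each worker migrates one shot independently, and the partial images are summed into the final image, which is what the MPI version does across hundreds of servers. Everything here is an illustrative assumption – the constant-velocity ‘physics’, the 1D depth grid and the synthetic shots are toys, and a thread pool stands in for MPI ranks.

```python
# Toy shot-parallel Kirchhoff-style migration. A thread pool stands in
# for MPI ranks; the physics (constant velocity, 1D depth image under
# the midpoint) is deliberately oversimplified for illustration.
from concurrent.futures import ThreadPoolExecutor
import math

VELOCITY = 2000.0                          # assumed constant velocity, m/s
DEPTHS = [d * 50.0 for d in range(40)]     # 1D image grid, 0-1.95 km

def migrate_shot(shot):
    """Migrate one shot: for each depth point, look up the trace sample
    at the source-to-point-to-receiver traveltime."""
    src_x, rec_x = shot["src"], shot["rec"]
    trace, dt = shot["trace"], shot["dt"]
    partial = []
    for z in DEPTHS:
        t = (math.hypot(src_x, z) + math.hypot(rec_x, z)) / VELOCITY
        idx = int(round(t / dt))
        partial.append(trace[idx] if idx < len(trace) else 0.0)
    return partial

def run_job(shots, workers=8):
    """Scatter shots across workers, then reduce (sum) partial images."""
    final = [0.0] * len(DEPTHS)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(migrate_shot, shots):   # scatter
            final = [a + b for a, b in zip(final, partial)]  # reduce
    return final

# Synthetic shots: each trace carries a pulse arriving at around 0.5 s.
shots = [
    {"src": -100.0 * i, "rec": 100.0 * i, "dt": 0.001,
     "trace": [1.0 if 480 <= n <= 520 else 0.0 for n in range(2000)]}
    for i in range(1, 5)
]
image = run_job(shots)
print(sum(1 for v in image if v > 0), "image cells received energy")
```

Because each shot is independent, the scatter step scales to as many workers as there are shots; only the final reduction requires communication, which is why this class of algorithm splits so cleanly across a cluster.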
‘We also use RTM. It’s a very compute-intensive process, though, that requires about a terabyte of disk space per node. Ideally, this needs to be solid state disk, as pushing out a new wave every 4ms is quite I/O intensive. It also needs a lot of memory in order to store the entire output image.
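A rough back-of-envelope calculation shows how RTM wavefield storage reaches the terabyte-per-node scale Menger quotes. The grid dimensions and record length below are assumed figures chosen purely for illustration; only the 4ms snapshot interval comes from the text.

```python
# Back-of-envelope for RTM snapshot storage on one node.
# Grid size and record length are illustrative assumptions;
# the 4 ms snapshot interval is from the article.
nx, ny, nz = 800, 800, 400           # assumed subdomain grid points
bytes_per_sample = 4                 # 32-bit float per grid point
snapshot_bytes = nx * ny * nz * bytes_per_sample

record_s = 4.0                       # assumed record length, seconds
interval_s = 0.004                   # one snapshot every 4 ms
n_snapshots = round(record_s / interval_s)

total_tb = snapshot_bytes * n_snapshots / 1e12
print(f"{n_snapshots} snapshots of {snapshot_bytes / 1e9:.1f} GB "
      f"≈ {total_tb:.1f} TB per shot")
```

Even with these modest assumed dimensions, a single shot generates on the order of a terabyte of wavefield snapshots, which is why fast local storage such as SSD, and plenty of memory for the output image, matter so much.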
‘Global Geophysical has five data centres right now across the Americas, including one in Houston with 300 nodes feeding into a core fabric from Gnodal. Our previous core fabric was proving to be inadequate, leading to occasional lock-ups of the system. For a replacement, I looked at several of the leading players and, having decided to stick with Ethernet, we decided on Gnodal because of two particular features that appealed. The first was the very low latency, and the second was dynamic routing and load balancing.
‘Data management presents a huge challenge. I could have 50 nodes all wanting to write to one disk file. So, every node has to contend with every other node for I/O access. Having a fair, balanced network is very important.
‘A further challenge is that MPI transactions need to be very quick and very predictable, as well as feature low jitter, low latency and high bandwidth.
‘The installation of the Gnodal fabric meant it could feed and draw from our data storage much quicker than ever before, so we’ve ended up upgrading our storage significantly.
‘The only other HPC user that comes close to oil and gas in terms of compute intensity and data size is Nasa. We use so much memory and so much disk space that the cloud is simply not an option.’
Hardware specialist Supermicro, which has several customers in the oil and gas space, builds its HPC solutions from the ground up – from the motherboard to the server configuration.
Dr Tau Leng, VP and general manager of HPC at Supermicro, says his company is well set-up to deal with the demands of the oil and gas industry. ‘We have a very broad range of products,’ he says. ‘And we specialise in providing application-optimised solutions, whether that be for seismic analysis or reservoir simulation. Each of these requires a very different solution.
‘Seismic analysis is all about processing power, with very little communication required between nodes, so it scales very nicely to thousands of nodes. Reservoir simulation, by contrast, requires very high-speed interconnects, so we develop systems that have Infiniband built in.’
Reservoir simulation is the process by which existing wells whose oil production is slowing are assessed for ongoing productivity. The process enables the production company to take data from points underground, reconstruct the underground in a 3D picture to help establish where there is still oil, and run simulations that will inform future extraction plans.
One of Supermicro’s recent projects has been with CGGVeritas, a geophysical company delivering technologies and services to the global oil and gas industry. Supermicro worked with Green Revolution Cooling to deliver an HPC solution at CGGVeritas’ centre in Houston.
Supermicro’s 1U dual-GPU SuperServer is being used in conjunction with GRC’s CarnotJet fluid submersion cooling system. Together, they create a high-density, high-capacity computing solution that has reduced data centre power needs by 40 per cent. ‘We found that CGGVeritas is a company that is always open to new technology,’ says Leng. ‘We used a GPU-based solution for the computation. This is a high-density solution, and also draws a lot of power, which is why the submerged solution was particularly suitable. Some cost-benefit analysis has been done on this, which suggests that if the power is above 25kW, then a submerged solution is often better.’
PBS Works is an HPC workload management suite used extensively by the oil and gas industry. It includes a number of tools, among them PBS Professional, which optimises HPC resource usage, and PBS Analytics, a web-based portal that visualises historical usage data by jobs, applications, users, projects and other metrics, so the user can capture trends for capacity planning and what-if scenarios.
‘Oil and gas users generally have large clusters,’ says Rick Watkins, account manager at Altair, which produces PBS Works. ‘And clusters need management! Both seismic processing and reservoir simulation are compute-intensive, so oil and gas has always been an early adopter of HPC technology. It’s also about ensuring that any task is using the right resources. We have alternative resources now, such as GPUs and other accelerators, which oil and gas codes are starting to utilise. So, it’s just simple business sense to have cluster management software in place of a team of people allocating jobs to resources.
‘The clusters used in the oil and gas industry are very large and that size dictates that functions such as health monitoring are essential.
‘HPC technology has improved research in oil and gas. Jobs that used to take two or three weeks to complete can now be done in two or three days. Processors are better, codes are better optimised, and interconnects are faster. All of these component parts of an HPC setup have had to improve to keep up with the size of cluster demanded by oil and gas.
‘We have integrated PBS Works with a number of specialised third-party simulation packages in the oil and gas industry, such as Schlumberger’s Eclipse, which means that users of that software can access the functionality of the PBS Works suite directly from the Eclipse interface. On the HPC side, PBS Professional is able to schedule jobs executing that software according to the number of licences available.’
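The licence-aware scheduling idea can be illustrated with a toy dispatcher: a job is launched only when both free nodes and free licence tokens are available. This is not PBS Professional code – the job names, licence counts and FIFO policy are made up for the example – it simply sketches the constraint such a scheduler enforces.

```python
# Toy licence-aware dispatcher. Illustrative only: real workload managers
# such as PBS Professional track many more resources and policies.
from collections import deque

class LicencePool:
    """Counts licence tokens for one software package."""
    def __init__(self, total):
        self.total, self.in_use = total, 0

    def acquire(self, n):
        if self.in_use + n > self.total:
            return False               # not enough free licences
        self.in_use += n
        return True

    def release(self, n):
        self.in_use -= n

def dispatch(queue, pool, free_nodes):
    """Launch queued jobs (FIFO) while nodes AND licences are available."""
    running = []
    while queue and free_nodes:
        job = queue[0]
        if job["nodes"] > free_nodes or not pool.acquire(job["licences"]):
            break                      # head of queue is blocked; stop
        queue.popleft()
        free_nodes -= job["nodes"]
        running.append(job["name"])
    return running, free_nodes

pool = LicencePool(total=8)            # e.g. 8 simulator licences site-wide
queue = deque([
    {"name": "simA", "nodes": 4, "licences": 4},
    {"name": "simB", "nodes": 2, "licences": 4},
    {"name": "simC", "nodes": 1, "licences": 4},
])
started, free = dispatch(queue, pool, free_nodes=16)
print(started, free)   # simA and simB start; simC waits for licences
```

The point is that nodes alone are not the scarce resource: a job can be blocked purely by licence availability, which is why the scheduler has to account for both.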
The storage challenge
Panasas has been involved in supplying the oil and gas industry with storage solutions for several years. Barbara Murphy, chief marketing officer, says: ‘Panasas has been operating in the energy sector since our inception. Our scalable, high performance system suits the size and complexity of the data sets they are dealing with.
‘We’ve worked with third-party seismic software vendors to help them parallelise their applications for the workload, and that’s really helped us gain market share. It remains our largest growth sector, as the energy industry has moved from a long period of being in “extraction” mode to once again return to “discovery” mode. With the price of oil and gas continuing to climb, it’s now worth the investment in complex extraction. We have installations in over 50 countries, in environments as diverse as deserts, the Arctic circle, mountains and so on. These are often in very remote, inclement areas with minimal IT facilities. The design of our units makes them particularly easy to service.
‘Oil and gas is one of the most mature industries when it comes to using HPC for its scientific workload. From that point of view, the market was early to adopt parallel file systems and scale-out architectures to manage both the complexity of the simulations they are running and the scale of the storage required for the data they’re creating.’
Geoffrey Noer, senior director of product marketing, adds: ‘Within seismic processing, there is a continuing push for ever-finer detail. As that happens, the drive is towards larger and larger compute clusters to process the data, as well as the need for faster and faster storage. The faster you can process the data, the faster you can make a decision about where to drill. This is why, in seismic processing deployments, you often see thousands of compute nodes, rather than the tens or hundreds many other HPC applications demand.’
As the years go by, data accumulates, and this creates a challenge. Murphy continues: ‘All seismic data is valuable, and it is so expensive to collect that no data is ever thrown away. Some of our customers are still referring to data that they extracted 20 years ago. The earth’s structure won’t have changed in that time, but the tools to pull out and analyse that data do evolve. Oil and gas, therefore, is an industry that has a massive scale-out problem; we don’t talk about terabytes here, we talk about petabytes – and it’s tens of petabytes of new storage every year.
‘Seismic processing is compute intensive, network intensive and storage intensive. So, moving data around becomes a real problem. Our customers are looking to bring the problem to the data, rather than the other way round, to save the costly effort of having to move the data in the first place.’
Noer believes that the application of HPC within oil and gas is growing beyond the data management side. ‘Until now, many oil and gas companies have had a dedicated HPC department, entirely separate from their day-to-day IT department,’ he says. ‘However, we are now seeing the adoption of HPC technologies within the IT departments too. We see this trend continuing even beyond energy companies.’
DDN has years of experience in providing the storage behind HPC, and more recently has been supplying such products to the oil and gas market. ‘The real data-intensive part of oil and gas is seismic processing,’ says DDN’s James Coomer. ‘The data capture itself takes place in odd locations such as on ships in the middle of oceans or on vehicles in deserts. Storage has a major part to play here in two roles: first, in the ingestion rates – that is, collecting the highest resolution data possible from the sensors, which can be from 1GB/s upwards; and second, the processing of that data into, for example, a 3D map.
‘Our storage can cope with the high data rates during the ingestion period, and also in the processing part, when thousands of nodes may be accessing the storage at the same time. In oil and gas, data rates in this latter part of the process can be typically 6GB/s and maybe much higher. It is easy to migrate from a traditional NAS storage system to one of our parallel file systems, so users need make no changes within the application. This is important for industries such as oil and gas, which can now take advantage of the faster data rates offered by parallel file systems without impacting the surrounding systems.
‘The industry is becoming ever more complex; the acquisition and exploration process is becoming more precise and using more complicated algorithms. So both the compute side and the amount of data being ingested are always going up.’
Acceleware’s McGarry concludes: ‘The current HPC generation has, for the first time, given us the ability to base production seismic imaging software on realistic physics. In the coming years we will see ever more complex physics being simulated, for example elastic RTM will supplement the current acoustic-based version to account for elastic deformation of the Earth due to the seismic disturbance. Full Waveform Inversion, which has long been the Holy Grail in terms of building structural Earth models, will become increasingly common. And these developments will require significantly more computational power.’
Total has been a customer of SGI in France for more than 15 years. Marc Simon, technical director at SGI, says: ‘We supply them both the HPC and the storage – the two are very tightly connected. They need to manage both the processes and the big data that the processes create.
‘Our approach with them has always been to get a full understanding of the customer’s workflow, their approach to R&D, and how they use HPC. We have a team of people based at Total’s operation in the south of France, who not only address HPC projects, but also other aspects of Total’s requirements, such as data management or visualisation.
‘Last year, we won the latest contract on offer from Total, which will enable them to take the next step in terms of HPC usage, using new algorithms, and to cope with bigger data. We supplied them with an integrated cluster based on our Ice X architecture. It has 100,000 cores, more than 5400 terabytes of memory, and 8 petabytes of disk space. Like any customer, they wanted the solution to demonstrate a high level of performance, but it also needed to be energy-efficient. We achieved this through the improved density of our products, and also through a cooling system that was able to accept warm-water cooling.
‘For the most part, Total uses the system for seismic processing. The new algorithms, together with the power of HPC, enable them to see what is under the earth much more clearly, and therefore determine whether oil or gas is present.
‘In the future, Total will also use the set-up for reservoir simulation, which is a technique that helps oil and gas companies determine the most efficient method of extraction.
‘SGI is a partner in terms of the support we can offer Total, ensuring that what they want to do is able to run efficiently and reliably on our system. Around half of our dedicated team works on system administration, while the other half concentrates on R&D support and development. This means helping the process move from a pure engineer’s algorithm to a smooth application of computer science on an HPC system.’
‘By working with Total at the R&D stage, we are aware of the algorithms they are working on, and can evaluate early on how to run these algorithms on emerging systems and processors. In turn, this helps Total select an appropriate configuration the next time they upgrade their system. We have teams looking at Nvidia GPUs and Intel’s latest chips all the time to assess their suitability to these new algorithms as they emerge.’
Marc Simon, technical director, SGI