Computing applied to the earth sciences will save human lives, according to Felix Grant
As you read this, a new US president will be in his first weeks at the head of an administration informed by respected earth scientists including John Holdren, Jane Lubchenco, and Steven Chu. The words ‘earth science’ usually evoke those disciplines concerned with the lithosphere (particularly geology, seismology, and vulcanology), but public concern is rising about the effects of human interaction with the other three spheres as well – and all sectors of the earth sciences are intensive consumers of computing resources.
Computational science began with water. Societies dependent upon fertile flood plains surrounded by arid regions needed advance knowledge of when their rivers would ebb and flood; and from that arose everything from algebra to astronomy. Today, from acute surges in the Thames to chronic vulnerability in Bangladesh, from one-off disasters such as the 2004 tsunami to the global rise in sea level, flooding remains a primary concern. A tsunami is not only a hydrospheric phenomenon, it is also in the class of seismic events. Sea levels are rising due to many causes, but one significant influence is melting of old ice as a result of changing climate – which also affects the crust beneath it, and the flow of Coriolis currents. Scientific computing in these areas embraces collection, assembly, and analysis of huge, complex data sets. There are other data associated with the human impacts of earth science events: mortality, economic dislocation, rescue, recovery, and medical demand.
Though tsunamis are well known and recorded, Boxing Day 2004 had an unprecedented impact on global psychology, producing a new impetus for research. There were widespread demands to know why, when suitable technologies existed in principle, no warning system had been in place for a region known to be vulnerable.
Prediction of a tsunami, once an underlying seabed trigger has occurred, is a matter of rapidly (and preferably automatically) analysing either seismographic or hydrographic (or both) data. Seismic analysis is the key to long-term understanding which may allow preventative planning and effective protection of human populations. It is the only realistic approach where the epicentre is close to the point of impact. Hydrographic methods provide the most ready and economic means of raising an alarm when the trigger has already occurred at greater distance. Both approaches are data analytic.
The ‘Tsunameter’ system (from Envirtech of Genova, Italy) is an example of the mix. Two different types of observatory module, Poseidon and Vulcan, work independently, or in concert, to generate a real-time stream of data fed via surface buoy and satellite communications to an onshore control centre for analysis. Poseidon, primarily concerned with fluid phenomena, is dropped into deep water at distances upwards of a thousand kilometres offshore and gives warning of a wave which has already formed. The more sophisticated Vulcan sits closer to the coast, supplies data streams over greater data bandwidth, and concentrates on precursor indicators.
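To make the hydrographic idea concrete, here is a minimal sketch of how an onshore centre might raise an alarm from a stream of sea-level readings. It is not the Tsunameter's actual algorithm, which the article does not describe. The function name, the de-tided input series, the window length and the 3 cm threshold are all assumptions chosen for illustration.

```python
from statistics import mean


def detect_surge(levels_m, window=5, threshold_m=0.03, consecutive=2):
    """Return the index of the sample that triggers an alarm, or None.

    levels_m is a hypothetical series of de-tided sea-level readings in
    metres. The alarm fires when `consecutive` successive readings each
    differ from the mean of the preceding `window` readings by more than
    `threshold_m`. All parameter values are illustrative assumptions.
    """
    run = 0
    for i in range(window, len(levels_m)):
        baseline = mean(levels_m[i - window:i])
        if abs(levels_m[i] - baseline) > threshold_m:
            run += 1
            if run >= consecutive:
                return i
        else:
            run = 0  # a quiet reading resets the count
    return None


# Invented readings: background noise, then an abrupt rise.
quiet = [0.0, 0.01, -0.01, 0.0, 0.01, 0.0, -0.01, 0.0]
event = quiet + [0.08, 0.12, 0.05]
alarm_index = detect_surge(event)
```

Requiring two consecutive exceedances rather than one is a simple guard against a single noisy sample raising a false alarm. A real system would also have to model tides and instrument drift.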
Whether underwater or on land, predicting, or even just understanding, seismological probabilities is both vital and computationally intensive. As an IGOS report noted almost five years ago, ‘volcanic eruptions, earthquakes, landslides and subsidence [are among] ...the main natural causes of damage to human settlements and infrastructures’. Attention to such natural hazards is widespread and cooperative, embracing national government organisations like the US Geological Survey, the China Earthquake Administration, and numerous academic centres, through shared initiatives such as the Pacific Disaster Center based in Maui, Hawaii.
Illustration from Cronin et al on seismo-lineament analysis showing identified seismogenic sites (top) and stereographic projections of mean vector orientations, showing 90 per cent confidence interval uncertainties (bottom).
Another interesting approach is retrospective study of past events through data deduced from consequent anomalous geologies. Tsunami research back through the Holocene and beyond, for example, or tracing historical human-induced changes in physical geography, or effects of tide and current over time on the fossil record makes data available to a range of other disciplines. Analysis of data from geological cores shows the circular nature of human impacts on and from the biosphere, as exemplified by a study showing that major vegetational shifts coincided with human mastery of fire.
Physical sampling has now been supplemented (even supplanted) by orbital observation. From large projects like NASA’s earth observation satellites (EOS) programme to those such as Italy’s COSMO-SkyMed, this is leading to dramatic increases in aggregate data volume. Even Iran (victim of the Bam earthquake, and existing user of Russian facilities) is pursuing in-house satellite launching capacity with declared earth-observation intent. Wooster notes that TERRA (NASA’s primary earth observation satellite) delivered more data in its first six months than the whole preceding 50-year earth observation programme combined, and continues to provide new output in the region of a terabyte per day.
The size of the data sets means constantly escalating use of developments in scientific computing, and points up general changes in the nature of computerised analysis. A considerable portion of what would once have been the statistician’s pre-process computational workload is now done by the satellites themselves, which assemble signals enviably clean by past standards. Much of the data is disseminated over the internet.
Analyses are conducted here, there, and everywhere, by consumers of the basic data to produce digested data products, and often returned to the net where they swell the available resources for second-level analysis, and so on. The distributed nature lets everyone play a part. Cyprus and the Slovak Republic, for example, neither of them highly resourced nations on a world scale, both participate in DEISA earth sciences projects, with the Cyprus Institute’s Energy, Environment and Water Research Centre partnering Germany’s Institute for Atmospheric Physics (University of Mainz) and Max Planck Institute for Chemistry. This sort of rapid, circular, organic evolutionary analytic model is commonplace in many areas of science, but is fundamental for the earth sciences. But the hasty layered development of heterogeneous resources has side effects, including concerns about data consistency.
There is a US federal information processing standard, the USGS Spatial Data Transfer Standard or SDTS, which defines how geodata should be transferred from one computer to another. It is designed to be an extensible open system which embraces data and their relationships, preserving the integrity of both regardless of platform differences. It hasn’t as yet, however, been universally adopted. Even if adopted without exception at all stages, it would not cure conflicts in data accumulated before its introduction.
This consistency issue is exemplified by BMT ARGOSS, a Dutch company specialising in supply of processed marine data products to a wide range of government and corporate customers. ARGOSS buys in raw data from a number of sources including the US Government National Oceanic and Atmospheric Administration (NOAA) and the European Space Agency (ESA). Intensive high resolution numerical modelling converts environmental data into oceanographic, bathymetric, and atmospheric services. But data from different sources for the same set of phenomena vary, so cross-checked quality controls are needed with follow-up research to maintain reliability.
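The kind of cross-check described above can be sketched in a few lines. This is only an illustration of the principle, not a description of the company's systems. The function name, the dictionary layout, the 15 per cent tolerance and the sample values are all hypothetical.

```python
def flag_disagreements(source_a, source_b, rel_tol=0.15):
    """Return the keys at which two data sources disagree.

    source_a and source_b are hypothetical dicts mapping an observation
    key (for example a timestamp) to a measured value such as wave
    height in metres. A key is flagged when the two values differ by
    more than rel_tol relative to the larger of the two magnitudes.
    Keys present in only one source are ignored in this sketch.
    """
    flagged = []
    for key in sorted(source_a.keys() & source_b.keys()):
        a, b = source_a[key], source_b[key]
        scale = max(abs(a), abs(b))
        if scale > 0 and abs(a - b) / scale > rel_tol:
            flagged.append(key)
    return flagged


# Invented readings from two providers for the same phenomenon.
provider_one = {"t0": 2.0, "t1": 3.0, "t2": 1.0}
provider_two = {"t0": 2.1, "t1": 4.0, "t3": 5.0}
suspect = flag_disagreements(provider_one, provider_two)
```

Flagged observations would then go forward for the follow-up research mentioned in the text, rather than being silently averaged or discarded.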
The company’s PV-WAVE based systems have to incorporate this requirement as well as the core function of model generation. The consistency and reliability of remote-sensing data are dependent upon getting the design and implementation right in the first place, and ESA’s European Space Research and Technology Centre (ESTEC), also using PV-WAVE functions, has a purpose-designed data analysis package which is used throughout the life of a project from prelaunch testing onwards.
Programmable numeric computing is more prominent than statistical software in analysis of earth sciences phenomena; the reverse being the case for the human impacts. Matlab, for instance, is quoted in studies of event detection and velocity pulse occurrence probabilities, waveform inversion, near surface sediment responses, systematic location errors and event type discrimination. SPSS, on the other hand, appears in work on community and economic recovery as well as post-disaster medical responses ranging from crush-related renal failure to post traumatic stress disorders in affected populations.
The human consequences of disasters are most visible in the immediate death rates and infrastructure destruction, but the cumulative long-term effects absorb more social and economic resources. These effects are also more amenable to computationally derived ameliorative planning. Research into medical, search and rescue effectiveness in the immediate aftermath of catastrophic events (see, for example, comments by Ashkenazi on the Sichuan earthquake in China last year) shows that the actions of bystanders are more significant in saving life than even the best planned professional interventions.
Costa Rica earthquake planning, January 2009, from the European Space Agency website.
Subsequently, however, the balance shifts rapidly. As time passes, good planning and well designed infrastructures soon become the best indicators for survival and recovery. In Sichuan, the numbers of seriously injured and homeless were three times and 140 times the fatality count, respectively. That both a major disease outbreak and a secondary mass death rate from hypothermia were avoided is down to large-scale prepared outside intervention.
There has recently been increasing interest in modelling to maximise the effectiveness of medium-term medical and reconstructive intervention. The flooding of New Orleans by Hurricane Katrina in 2005 shocked the USA and gave a particular boost to this, when events seen as endemic to the third world showed that they could strike the most advanced societies. Complex input/output table models take into account bidirectional propagation of economic processes and the impact of degraded production on response and reconstruction. Bayesian analyses of exposure risk allow buffering of the degree to which vital databases may be compromised. Attention has been turning to simulation and gaming approaches with, in the words of one paper, ‘a library of interoperable component models that can be assembled together in different combinations to represent a range of scenarios’.
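The input/output table models mentioned above rest on a simple idea: each sector's output is partly consumed by other sectors, so a change in one propagates to all. The sketch below is a textbook Leontief-style calculation, not the model used in the cited studies. The two sectors, the coefficient table and the demand figures are invented for illustration.

```python
def required_output(coefficients, final_demand, iterations=200):
    """Estimate the gross output each sector needs to meet final demand.

    coefficients[i][j] is the amount of sector i's product consumed in
    making one unit of sector j's product. The function repeatedly
    applies x = demand + A x, which converges when the sectors consume
    less than they produce overall. The iteration count is an assumed
    value that is ample for the small example below.
    """
    n = len(final_demand)
    output = list(final_demand)
    for _ in range(iterations):
        output = [
            final_demand[i]
            + sum(coefficients[i][j] * output[j] for j in range(n))
            for i in range(n)
        ]
    return output


# Hypothetical two-sector economy (say, construction and energy).
table = [[0.2, 0.3],
         [0.4, 0.1]]
demand = [100.0, 50.0]
gross = required_output(table, demand)
```

With these invented figures the two sectors must produce about 175 and 133 units to deliver 100 and 50 units of final demand. Reducing either sector's capacity after a disaster therefore constrains the other as well.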
The increasing interdependence of such work is gradually producing a global network of mutual interest which is easy to see but difficult to quantify. Investment by technologically advanced societies in remote sensing, databases, warning systems and so on is amortised indirectly by return of analyses and methods from developing ones. An analysis-driven system which works in a developing-world disaster may depend on data derived in a high-technology economy, and can be applied back to other catastrophic events in industrialised regions.
The technologies which result can then, as prices drop, enhance future developing-world response.
Global interdependence with respect to causes is even more fundamental, but human impacts in this direction are more contentious as they open questions of deep shifts in economic behaviour. Much of the argument here is based on high-performance computer models for prediction, combined with examination of the past for corroborative evidence. The Colombaroli study, mentioned above, suggests that the comparatively thin human population (probably only 10-15 million globally) at the Mesolithic/Neolithic boundary (roughly 4000 HE) managed to unseat a complete Mediterranean broad-leafed evergreen forest ecology dominated by holm oak and replace it with a new one based on an open textured mix of shrub and deciduous woodland.
Related work shows a subsequent dramatic rise in the European beech population and, along the Mediterranean in Lebanon’s Bekaa Valley, deforestation associated with major disruptions to sedimentation. Over the last 2,500 years, deforestation, destruction of raised peat bogs, and land reclamation have all produced changes in local sedimentation, sea current and climate. The effects in the areas concerned are demonstrable, and untangling their relation through erosion and sedimentation patterns to larger trends is painstaking but progressing. Analysis of morphological data from both the western and eastern margins of the Atlantic, for instance, illuminates the feedback processes by which sedimentation changes seabed profile at landmass edges, validating some predictive models and sidelining others.
All of this, even before the size of remotely sensed data sets is taken into account, is completely dependent upon computing power. Even with quite small inputs, the interactions involved in a bi-directional and cyclic view of human interaction with the planet rapidly become too complex for unaided analysis. One geographer commented, in discussion during preparation of this article, that ‘it’s not too much of an exaggeration to say that there was no earth science before the arrival of computers in universities; the best we could do was earth speculation... and even then, it would have hit a plateau without the arrival of the internet as a global mechanism for data repositories... now, the various flavours of high performance computing, from multiple desktop cores and kernels upward, are vital to continuing progress... earth science and earth science computing are the same thing’.
The references cited in this article can be accessed through the Scientific Computing World website. Please go to www.scientific-computing.com/features/referencesfeb09.php
BMT ARGOSS, marine environmental data products, email@example.com
Envirtech S.p.A., Tsunameter, www.envirtech.org/feedback.htm
ESA, European Space Agency, www.esa.int/esaCP/index.html
MathWorks, Matlab, Simulink, firstname.lastname@example.org
National Oceanic and Atmospheric Administration, Satellite Climate Research Group; international satellite based observations archive, www.noaa.gov
SPSS Inc, SPSS software, email@example.com
Visual Numerics, PV-Wave; Numerical and visual data analysis tools, http://www.vni.com/contact/index.php#Informationrequest