The UK e-Science Grid
Readers of Scientific Computing World will be aware of many aspects of the Grid and e-Science activities. Recent editions of the magazine have included articles on Bioinformatics and the Grid, Astrophysics and data issues, together with news articles on Grid developments and technologies in the UK and internationally. We shall, therefore, not spend a great deal of time introducing e-Science and Grid concepts but focus on the status of developments in the UK e-Science Programme and introduce a set of 'industrial' projects that give some indications of how the Grid is likely to be developed to meet future commercial and industrial needs.
It is useful, however, to set out the background to the particular developments we will discuss. In November 2000, the Secretary of State in the UK announced additional funding of £98m for an e-Science initiative. This provided for £74m of application-orientated e-Science research, £15m for generic research in a Core Programme and £9m for additional high performance computing. This funding was augmented by a further £20m investment in the Core Programme by the Department of Trade and Industry (DTI). It is the efforts of the Core Programme that will be the focus of this article.
As the reader will recognise, current trends in scientific research are toward multi-scale applications, involving multi-disciplinary teams, often geographically dispersed. Support of these trends requires tools for co-ordination and collaboration and an infrastructure capable of providing secure, adaptable, fast communications - The Grid. The choice of the name 'Grid' to describe this infrastructure resonates with the idea of a future in which computing resources, compute cycles and storage, as well as expensive scientific facilities and software, can be accessed on demand like the electric power utilities of today. Indeed, IBM already speaks of the Grid as the fifth utility. One of the elements of the Core Programme's remit is to ensure that the Grid infrastructure has the services required to support the UK e-Science applications. This means understanding those requirements, ensuring there is a group within the e-Science community who are in dialogue with the broader international community and with industry, recognising where there are gaps in Grid developments, and ensuring that there are activities within the UK working in those areas. The international discussion of protocols and standards is held at the Global Grid Forum, the most recent meeting of which attracted some 1000 participants.
The other element of the Core Programme remit is to engage industry. This is important for a number of reasons. Firstly, it is a matter of education of UK plc - ensuring that UK industry and commerce understand what Grid technologies are about and how they might benefit from them. Secondly, as noted above, it helps to ensure that the directions taken by the research community are in line with industrial developments. There would be little use in the UK e-Science community creating a marvellous Grid framework that had nothing to do with the standards used in industry.
Supporting e-Science applications in the UK
The Core Programme has set up a network of e-Science centres across the UK to support existing e-Science applications, engage industry in projects, and create a nucleus of people who know how to build a Grid.
On 21 April, the Chancellor of the Exchequer, Gordon Brown, opened a National e-Science Centre, located in Edinburgh, but jointly managed with the University of Glasgow. The National Centre hosts an e-Science Institute which holds workshops and hosts international visitors. Eight regional centres have also been set up to support e-Science activities across the country.
A Grid Support Centre has also been set up to operate a telephone and email help desk, and a Grid Network Team exists to help application developers understand their network needs. However, if these individuals are to help others, there is an urgent need for that nucleus of people to gain experience in the detail of running a Grid. Therefore a primary role for each of the e-Science Centres is to donate a specific amount of computing and storage resource for use in the construction of a national e-Science Grid. The donated resources range from supercomputers and commodity clusters to databases and other repositories. Gaining experience in the use of digital certificates for single sign-on and authentication across the Grid is a key purpose of this work, and sorting out problems with the security, firewalls and policy issues of each of these sites is a difficult task. Building a grid between multiple sites that accommodate different resources, owned by different groups or individuals, each with their own policies, raises not only a number of technical issues but, indeed, many social barriers that have to be traversed.
There is a single Certificate Authority for e-Science projects in the UK, hosted by the Grid Support Centre. This allows any person working in an e-Science project to obtain a digital authentication certificate to use resources in that project. A careful policy for the issuing of certificates is in place, and this policy forms the basis of trust between different Certificate Authorities that allows international collaboration.
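The trust arrangement described above can be illustrated with a minimal sketch. It assumes a much-simplified model in which a site accepts a certificate only if it was issued by a Certificate Authority the site has agreed to trust and has not expired. The class, the function and the authority names are hypothetical, chosen for illustration only; real certificate validation also involves cryptographic signature checks, which this sketch omits.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, simplified stand-in for a digital certificate."""
    subject: str     # the person the certificate identifies
    issuer: str      # name of the Certificate Authority that issued it
    not_after: date  # last day on which the certificate is valid


def is_accepted(cert, trusted_issuers, today):
    """Accept a certificate only if a trusted authority issued it and it is still valid.

    This checks names and dates only; it performs no signature verification.
    """
    return cert.issuer in trusted_issuers and today <= cert.not_after


# A site that trusts its national authority and one overseas authority (names are illustrative).
trusted = {"UK e-Science CA", "Partner CA"}
researcher = Certificate("A. Researcher", "UK e-Science CA", date(2003, 12, 31))
stranger = Certificate("B. Stranger", "Unknown CA", date(2003, 12, 31))

print(is_accepted(researcher, trusted, date(2003, 6, 1)))
print(is_accepted(stranger, trusted, date(2003, 6, 1)))
```

In this model, international collaboration amounts to each site adding the other country's authority to its trusted set, which is why an agreed issuing policy matters: a site is only as careful as the least careful authority it trusts.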
Each of the centres has also been provided with £1m (£3m for the National Centre) in order to develop industry projects. In the next section we shall discuss a few such projects.
The applications that are being developed on the Grid benefit from Grid technologies in different ways. For many it is a matter of being able to access and control remote resources - instruments, compute resources, visualisation or data resources. For others it is a matter of being able to collaborate with remote colleagues or specialists. Indeed, in some cases the Grid has provided a mechanism for new methodologies of scientific investigation - the ability to combine real-time experimental data with simulation data and have a distributed team visualise the results; the ability to collect data from remote sensors and integrate that into simulations or analyses in, for example, agricultural or environmental settings, or in a medical application. In this section we describe four of the projects funded by the Core Programme, three of which are funded through the e-Science centres with substantial industrial collaboration.
The Core Programme funded a project, e-Star (www.estar.ac.uk). In this project the teams at Liverpool JMU, University of Exeter, and the University of Liverpool developed a prototype system to illustrate the concept of a network of remote, robotic telescopes connected via appropriate middleware. This system enables distributed, dynamically scheduled astronomical observations to be carried out and the data interpreted by intelligent software systems. It also allows the integration of astronomical databases and other curated information. The idea behind the system is that a user can request an image of a certain piece of the sky. This request is then sent out to the intelligent agent that 'quizzes' database resources and the robotic telescopes to ascertain whether the data is available or when it could be made available. The user can then choose to have the image taken by the telescope at a particular time or to use an image from a database. If the user chooses to use a remote telescope, he or she can control the telescope remotely through Grid middleware developed for that task, and therefore does not need to be physically at the telescope in its remote location. Once the user has the image, the system automatically analyses the image against historical data held in the US. If there are any anomalies shown in the data, further information is gained from databases in France and the US to provide the user with references to the anomaly and information regarding publications that have previously been made. The steps from receiving the image, to having a history of information about it and the publication list, are completed in a matter of seconds. Without the Grid technology this would take a matter of weeks or even months.
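The first step of that workflow, in which an agent 'quizzes' databases and telescopes about a requested piece of sky, can be sketched as follows. This is an illustrative sketch only and not the e-Star software: every class, function and resource name is hypothetical, and sky regions are reduced to simple labels.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    source: str                # name of the archive or telescope making the offer
    kind: str                  # "archive" or "telescope"
    available_in_hours: float  # 0.0 means the image already exists


class Archive:
    """Hypothetical image database that knows which sky fields it already holds."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = set(fields)

    def query(self, field):
        if field in self.fields:
            return Offer(self.name, "archive", 0.0)
        return None


class Telescope:
    """Hypothetical robotic telescope with a fixed queue delay before it can observe."""

    def __init__(self, name, visible_fields, queue_hours):
        self.name = name
        self.visible_fields = set(visible_fields)
        self.queue_hours = queue_hours

    def query(self, field):
        if field in self.visible_fields:
            return Offer(self.name, "telescope", self.queue_hours)
        return None


def collect_offers(field, resources):
    """Ask every resource about the field and return the offers, soonest first."""
    offers = [offer for offer in (r.query(field) for r in resources) if offer is not None]
    return sorted(offers, key=lambda offer: offer.available_in_hours)


resources = [
    Telescope("scope-1", ["M31", "M42"], queue_hours=6.0),
    Archive("archive-A", ["M31"]),
]
for offer in collect_offers("M31", resources):
    print(offer.source, offer.kind, offer.available_in_hours)
```

The agent only gathers and orders the options; as in the description above, the choice between an archived image and a new observation is left to the user.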
The implications of the Grid for health and medical research are enormous, and a number of projects have been funded by the Core Programme as well as by the Medical Research Council. We shall focus just on one centre project in this article. To cover all of them would require an article of its own.
As noted above, a key motivator for the Grid is the need to collaborate with individuals who are not at the same location. One project that addresses this need is a Cambridge e-Science Centre project on Telemedicine. The partners in the project are the West Anglia Cancer Network (WACN), the Department of Radiology (University of Cambridge and Addenbrooke's Hospital), the Cambridge eScience Centre (CeSC), Macmillan Cancer Relief and Siemens Medical Solutions.
Cancer services across the National Health Service (NHS) are developing to meet the challenges as set out in the recent NHS Cancer Plan. To meet those challenges, clinical networks require effective and timely communication of information, including diagnostic images from radiological and pathological investigations. In the West Anglia Cancer Network, clinicians are currently travelling large distances to provide remote clinical services, and to meet other specialists to discuss patient diagnosis and treatment. In this project, the team is exploring the use of AccessGrid technology to allow remote collaborations, providing video conferencing between multiple sites, access to remote microscopes and patient data, thus avoiding the need for travel.
In the AEC (Architecture/Engineering/Construction) industry, large projects are tackled by consortia of companies and individuals, who work collaboratively for the duration of the project. In any given project the consortium may involve design teams, product suppliers, contractors and inspection teams who are likely to be geographically distributed. The Cardiff Centre, together with the Civil Engineering Division at Cardiff and BIWTech, is creating a Grid-based system that will allow:
- Interactive, collaborative planning and management;
- Tracking of products and supplies, including availability, delivery and costing; and
- Integration of 3D geometrical product models into a CAD environment for architects and design teams.
The system will bring together different pieces of software that have been designed to tackle these issues individually, together with a product and supplies database, and integrate them into a shareable, secure environment.
There will be two kinds of software deliverables from this project. The first relates to the specific applications, PlanWeaver and the Product Supplier Catalogue Database, that are being developed by the industrial partner BIWTech, and is a Grid-enablement of those applications. The second is the more generic infrastructure providing services and collaboration tools that will be available to similar projects. This is typical of the projects being funded in this way, with the industrial partner helping to build the generic tools, but also positioning to readily exploit those tools.
Future of the Grid
It is evident that there is a great deal of interest from both the research communities and industry in the development of Grid technologies. Some 40 or more companies have collaborated in e-Science applications in the UK. These include companies working on the application side as well as the technology drivers.
It is often argued, however, that the Grid has been oversold and that there is too much hype surrounding it. But we would agree with Fran Berman, Director of NPACI and SDSC, UCSD, when she said, in her keynote speech at the last Global Grid Forum, that it is not the Grid that has been oversold, but the difficulty of developing the requisite Grid infrastructure that has been underestimated. The dream of 'the fifth utility' is attainable, but we have barely scratched the surface of many of the aspects of building it. Areas such as dependability and fault tolerance, programming models and environments, and authorisation and accounting need further investigation and development.
Irving Wladawsky-Berger of IBM identified the industrial early adopters of Grid technology as coming from the pharmaceutical, engineering and petrochemical sectors. This has been borne out to some extent in the UK programme by the industrial collaborators in the e-Science projects (AstraZeneca, GSK, Merck, Pfizer, Rolls-Royce, BAE Systems, Schlumberger). He also predicted that we would see Grid middleware being adopted by mainstream commerce and industry in the 2003/2004 timeframe. Before the Grid can become the fifth utility, however, many social and legal issues will need to be addressed alongside the technology issues. Some are clear, such as IPR and liability, but many others are likely to appear as the technology moves into the mainstream.
The OST funding of the programme has provided both a unique opportunity and a challenge for the UK: to steal the lead in key areas of science and technology that will be crucial to keeping our place worldwide, both in research and commercially. We believe the accomplishments of the first year of the programme have us on track to meet this challenge, but there is still much to be done.
Tony Hey is Director, and Anne Trefethen is Deputy Director, of the UK e-Science Core Programme
EPSRC, Polaris House, North Star Avenue, Swindon SN2 1ET, UK