A major upgrade is underway to double the storage available in one of the UK’s leading environmental science supercomputers. The upgraded system will support the global analysis of the next generation of climate models and provide a venue for UK academia and industry to exploit Earth observation data.
Called JASMIN, the system gives the UK and European climate and earth-system science communities access to very large sets of environmental data, which are typically too big for them to download to their own computers, and the ability to process them rapidly, cutting the time it takes to test new ideas and get results from weeks or months to days or hours.
The upgrade will double the available storage to more than 44 Petabytes, equivalent to over 10 billion photos. It will also add around 40 per cent to the processing capability, with 11,500 cores on 600 nodes. This means that the 1,700 registered users of JASMIN can process and analyse big datasets simultaneously and in very little time.
JASMIN users research topics ranging from earthquake detection and oceanography to air pollution and climate science.
The JASMIN infrastructure is hosted at the STFC Rutherford Appleton Laboratory, where it is managed by RAL Space's Centre for Environmental Data Analysis (CEDA).
Dr Victoria Bennett, Head of CEDA, said: ‘We are excited to be expanding JASMIN to manage the increasingly large datasets from satellites, climate models and other sources. For example, the current Sentinel Earth observation satellites alone are producing 10 Terabytes of data every day, and this will grow as more are launched as part of the European Commission's Copernicus programme. This upgrade will allow us to build on the successes we've already seen in enabling our users in the science community to efficiently process and analyse these massive datasets.’
Funded with a multi-million-pound investment from the Natural Environment Research Council (NERC), the upgraded system will also continue to provide the ‘UK environmental data commons’ - an online collaborative space bringing together data, services and expertise - underpinning much of academic environmental science.
NERC associate director for National Capability and Capital, Dr Liz Fellman, said, ‘The JASMIN supercomputer is central to delivering NERC science across its portfolio and provides a globally unique and increasingly powerful capability for the UK's environmental science community, enabling significant improvement of predictive environmental science to benefit the UK and beyond. NERC welcomes this major upgrade to a world-class facility.’
Professor Pier Luigi Vidale from the University of Reading has been using JASMIN since 2012 to store and analyse high-resolution global climate model data, and said of the upgrade: ‘The project we’re currently leading involves 21 institutions across Europe and will output more than 4 Petabytes of data. The JASMIN upgrade will allow us to store all data and to do most of the analysis online, thus dramatically speeding up the extraction of science at unprecedented resolution and enabling scientific publication at a far higher rate. We would not have embarked on the project without the enhanced JASMIN.’
The JASMIN upgrade involves the integration of computing equipment from many suppliers, a specialised new network, the development and deployment of new software, and the migration of Petabytes of archived data from ageing hardware, due for retirement, to new systems.
The entire process will take many months, from the integration of the first new equipment in March until the last of the old storage is retired. Completion is expected by the end of 2018. The system integration is being led by STFC Scientific Computing Department (SCD), and the software and data management by CEDA.
Jonathan Churchill, JASMIN Systems Architect and Manager for SCD, is part of the team that designed and is now installing the upgrade that will be exploited by the ever-expanding JASMIN science communities. Churchill said: ‘Not only have we dramatically scaled out JASMIN storage, compute and networking, but the new storage and networking technologies will improve the user “experience” and provide capabilities that we have never been able to make available to users before. The compute upgrade will provide not only much-needed extra batch computing cores but also the deep, on-demand cloud computing capacity and flexibility that releases new analysis environments to our science communities.’