APPLICATIONS NEWS

Physics big data benefits from IBM's Elastic Storage

20 August 2014



Robert Roe reports on Elastic Storage, a scalable, high-performance data and file management solution developed by IBM in collaboration with researchers operating the Deutsches Elektronen-Synchrotron (DESY) in Germany.

IBM Research has announced Elastic Storage, a scalable, high-performance data and file management solution based on GPFS technology, which has been implemented through a collaboration with researchers operating the Deutsches Elektronen-Synchrotron (DESY), a national research centre in Germany. The aim of the project is to speed up the handling and storage of massive volumes of x-ray data.

The teams have designed a big data and analytics architecture capable of delivering 20 gigabytes per second at peak performance. Its purpose is to provide data storage for DESY’s 1.7-mile-long PETRA III accelerator/storage ring.

PETRA III accelerates electrons, or their anti-matter counterparts, positrons, to nearly the speed of light. The particles are then sent through a tight magnetic slalom course, created by a series of magnets called 'undulators', to generate intense synchrotron radiation at x-ray wavelengths. The machine offers scientists outstanding experimental opportunities, with x-rays of exceptionally high brilliance for very small samples or for samples that require tightly collimated, very short-wavelength x-rays.

The facility is used for materials science research into the atomic structure of samples ranging from semiconductors and catalysts, through to viruses and living cells.

DESY is based in Hamburg and Zeuthen, Germany, and is home to 3,000 scientists from more than 40 countries.

Dr Volker Gülzow, Head of DESY IT, explained: ‘A typical detector generates a data stream of about 5 gigabits per second, which is about the data volume of a complete CD-ROM per second. At PETRA III we do not have just one detector, but 14 beamlines that are currently being extended to 24. All this data must be stored and handled reliably.’
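To put the quoted figures side by side, the following is a rough back-of-envelope sketch rather than anything published by DESY or IBM. It assumes decimal units, a nominal 700 MB CD-ROM, and, purely for illustration, one detector per beamline with every detector streaming at the typical rate simultaneously. The function names are hypothetical.

```python
# Back-of-envelope check of the data rates quoted in the article.
# Assumptions (not from the article): decimal units, a 700 MB CD-ROM,
# one detector per beamline, all detectors streaming at the same time.

GBIT_PER_DETECTOR = 5      # quoted: "about 5 Gigabit per second"
PEAK_STORAGE_GB_S = 20     # quoted: 20 gigabytes per second at peak
CD_ROM_MB = 700            # assumption: nominal CD-ROM capacity


def detector_rate_mb_s(gbit_per_s):
    """Convert a rate in gigabits per second to megabytes per second."""
    return gbit_per_s * 1000 / 8


def aggregate_rate_gb_s(n_detectors, gbit_per_s=GBIT_PER_DETECTOR):
    """Combined rate in gigabytes per second for n identical detectors."""
    return n_detectors * gbit_per_s / 8


if __name__ == "__main__":
    single = detector_rate_mb_s(GBIT_PER_DETECTOR)
    print(f"One detector: {single:.0f} MB/s "
          f"({single / CD_ROM_MB:.2f} CD-ROMs per second)")
    for n in (14, 24):
        total = aggregate_rate_gb_s(n)
        share = total / PEAK_STORAGE_GB_S
        print(f"{n} detectors: {total:.2f} GB/s "
              f"({share:.0%} of the quoted 20 GB/s peak)")
```

Under these assumptions a single detector produces 625 MB per second, which is roughly one CD-ROM, and 24 detectors running together would produce about 15 GB per second, inside the 20 GB per second peak quoted for the architecture. Real beamlines are unlikely to all run at the typical rate at once, so this should be read as an illustrative upper bound.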

‘IBM’s software-defined storage technologies can provide DESY the scalability, speed and agility it requires to morph into a real-time analytics service provider,’ said Jamie Thomas, General Manager Storage and Software Defined Systems at IBM. ‘IBM can take the experience gained at DESY and transfer it to other fields of data-intensive science such as astronomy, climate research and geophysics to design storage architectures for analysis of data generated by distributed detectors and sensors.’

IBM Research in Zurich, Switzerland, and the IBM Storage Development team in Mainz, Germany, will provide the technical expertise and evaluate a broad range of features that are being developed for the Elastic Storage system.

A key challenge in this process is storing and handling huge volumes of x-ray data. This is partly due to the scope of the project: more than 2,000 scientists use the PETRA III accelerator each year to examine the internal structure of a variety of materials at atomic resolution.

DESY is addressing this challenge with the help of IBM Research and IBM's software-defined storage technology, code-named Elastic Storage. Based on the General Parallel File System (GPFS), which IBM began developing as early as 1993, it is designed to scale easily to manage the deluge of data flowing every second from PETRA III.

Elastic Storage can provide scientists with high-speed access to increasing volumes of research data by placing critical data close to everyone and everything that needs it, no matter where they are in the world. This architecture will allow DESY to develop an open ecosystem for research and offer analysis-as-a-service and cloud solutions to its users worldwide.

The scalability of IBM Elastic Storage will be used to support DESY and a number of international partners that are currently building the European XFEL, an x-ray laser and research light source that will generate much more data. ‘We expect about 100 petabytes per year from the European XFEL,’ said Gülzow. That is comparable to the yearly data volume produced at the world's largest particle accelerator, the Large Hadron Collider (LHC) at the CERN research centre in Geneva.
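For a sense of scale, the 100 petabytes per year quoted for the European XFEL can be converted into an average sustained rate. This is a minimal sketch assuming decimal units (1 PB = 1,000,000 GB), a 365-day year and, unrealistically, data arriving evenly around the clock. The function name is hypothetical.

```python
# Convert a yearly data volume into an average sustained rate.
# Assumptions (not from the article): decimal units, a 365-day year,
# and data arriving evenly all year round.

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds


def average_rate_gb_s(petabytes_per_year):
    """Average rate in gigabytes per second for a given yearly volume."""
    gigabytes = petabytes_per_year * 1_000_000  # 1 PB = 1,000,000 GB
    return gigabytes / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = average_rate_gb_s(100)
    print(f"100 PB per year is about {rate:.2f} GB/s on average")
```

On these assumptions the average works out at a little over 3 GB per second. Because an instrument of this kind does not take data continuously, the rate during operation would be correspondingly higher than this average.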
