Read capability improved by 900 per cent at Purdue University
Purdue University has deployed DataDirect Networks’ (DDN) SFA high-performance storage to accelerate the delivery of research results for up to 1,000 researchers working on several hundred concurrent projects. To drive highly innovative, multidisciplinary research, Purdue has developed one of the nation’s largest campus cyberinfrastructures for research.
Three of the university’s high-performance computing (HPC) systems are currently featured on the Top500 list, including the United States’ largest academic distributed computing grid and largest collection of science and medical online hubs. The university has also implemented a robust data repository, called the Data Depot, which takes advantage of DDN-driven, enterprise-class storage equipped with the company’s Storage Fusion Xcelerator™ (SFX).
‘The challenge of managing varied research needs is accommodating both very large parallel I/O jobs and millions of small, random read requests without imposing performance penalties on anyone,’ said Mike Shuey, research infrastructure architect at Purdue University. ‘With DDN’s scalable storage platform and SFX technology, we can sustain the highest levels of performance for all researchers by supporting all types of workloads at the same time.’
To meet its multidisciplinary research demands, Purdue sought a powerful yet flexible storage solution that could keep pace with traditional big data volumes generated by top research areas, including computational nanotechnologies, aeronautical and astronautical engineering, mechanical engineering, genomics and structural biology. Additionally, the university needed to keep pace with emerging requirements that were causing an exponential surge in data volume, velocity and variety. For example, Purdue’s College of Agriculture recently teamed with the School of Mechanical Engineering to use sensor-equipped unmanned aircraft to collect critical data from acres of fields. New research outside the typical HPC realm also needed to be accommodated in the data repository, such as new projects from the College of Liberal Arts and Department of Sociology.
To best address its diverse set of stakeholders, Purdue deployed a pair of DDN SFA12KX storage systems with SFX and 6.4 PB of raw capacity for the university’s GPFS parallel file system. To ensure predictable, fast access to the Data Depot, Purdue also deployed DDN SFX software to extend the storage cache with solid-state memory. As a result, the system loads the right data into flash storage at the right time to maximise cache hit rates and deliver a fast response.
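As a rough illustration of why cache hit rates matter for read response, the Python sketch below models a small flash tier in front of disk. It is a toy least-recently-used cache, not DDN's SFX algorithm, whose internals are not described here; the class name, the function name and the latency figures are all hypothetical.

```python
from collections import OrderedDict


class FlashReadCache:
    """Toy LRU read cache standing in for a flash tier in front of disk.

    Illustrative only: this is not how SFX decides what to keep in flash.
    """

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block id -> placeholder, oldest first
        self.hits = 0
        self.misses = 0

    def _insert(self, block_id):
        # Add (or refresh) a block, evicting the least recently used one
        # if the cache is over capacity.
        self.blocks[block_id] = True
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)

    def preload(self, block_ids):
        # Warm the cache ahead of time; preloading is not counted as a read.
        for block_id in block_ids:
            self._insert(block_id)

    def read(self, block_id):
        # Return which tier served the read and update the counters.
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)
            self.hits += 1
            return "flash"
        self.misses += 1
        self._insert(block_id)
        return "disk"

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def mean_read_latency_ms(hit_rate, flash_ms, disk_ms):
    # Weighted average of the two tiers' latencies.
    return hit_rate * flash_ms + (1.0 - hit_rate) * disk_ms


if __name__ == "__main__":
    cache = FlashReadCache(capacity_blocks=2)
    cache.preload([1, 2])
    for block in (1, 3, 2):
        print(block, cache.read(block))
    print("hit rate:", cache.hit_rate())
    # Hypothetical latencies: 0.1 ms for flash, 10 ms for disk.
    print("mean latency (ms):", mean_read_latency_ms(0.9, 0.1, 10.0))
```

With the assumed latencies, each additional point of hit rate moves the average read time closer to the flash figure, which is the effect pre-loading is meant to achieve.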
By pre-loading data into solid-state storage, Purdue has been able to realise the performance benefits of flash storage for handling big data sets at a price point that’s closer to lower-cost, high-density hard disk drives.