The Genome Analysis Centre (TGAC), a research institute based in the UK, has dramatically reduced the time it takes to perform large genome assembly, with the installation of a new supercomputing platform based on two SGI UV 300 systems.
This installation represents the world’s largest SGI UV 300 installation for life sciences, including one of the largest Intel SSD for PCIe deployments worldwide. The new technology will also be used to aid the development of novel analysis techniques for data integration, by taking advantage of the larger, faster memory-per-core specifications of the system and its accelerated I/O capabilities from the NVMe SSDs.
Dr Tim Stitt, head of scientific computing at TGAC, stated that the new SGI system with NVMe SSDs ‘will undoubtedly be a leader in the field of genomic analysis.’
‘With the unique shared-memory technology from SGI and Intel’s leading processor and non-volatile memory storage solutions, this system will set the new yardstick for large-scale data-intensive bioinformatics computations. The combination of processor performance, memory capacity and one of the largest deployments of Intel SSD storage worldwide makes this a truly powerful computing platform for the life sciences,’ stated Stitt.
The new TGAC platform comprises two SGI UV 300 systems, totalling 24 terabytes (TB) of shared memory, 512 Intel Xeon Processor E7 v3 cores and 64TB of Intel P3700 SSDs with NVMe storage technology. Each SGI UV 300 flash memory solution features 12TB of shared memory with 7th generation SGI NUMAlink ASIC technology, scaling up to 64TB of globally addressable memory as a single system.
Jorge Titinger, president and CEO, SGI, commented on the importance of HPC to genomics industries: ‘The complexity of genomic data and workloads today requires high performance computing (HPC) to provide new insights for researchers and allow them to derive conclusions from the massive data jigsaw puzzle.
‘By tightly coupling the Intel P3700 SSDs with NVMe storage technology with the latest generation SGI UV 300 system, SGI allows customers like TGAC to achieve extraordinary bandwidth and IOPS.’
SGI reported that the two SGI UV 300 supercomputers, with 24TB of shared memory in total and paired with flash storage, can increase processing speeds for heavy scientific research workloads by up to 80 per cent.
Genomics’ role in food security
This upgrade is central to TGAC’s continued research to increase global food security by analysing wheat genotype and phenotype data generated by the Seeds of Discovery programme.
The Seeds of Discovery (SeeD) project strives to develop a complete understanding of maize and wheat genetics to make them more attractive to breeders. The SeeD project hopes to shed light on the genetics of seeds to ‘unlock the dormant genetic potential of maize and wheat’ by providing breeders with a toolkit that enables a more targeted use of grains in the development of high-yield, climate-ready and resource-efficient food.
SeeD is one of four MasAgro (Modernización Sustentable de la Agricultura Tradicional) projects funded by Mexico’s Ministry of Agriculture, Livestock, Rural Development, Fisheries and Food (SAGARPA).
TGAC will use the SGI HPC technology to enable faster analysis of complex genomes that require both large memory and fast processing capabilities, providing a powerful boost to its research projects. This research will include sequencing and assembling multiple lines of wheat with the Institute’s ‘w2rap’ assembly software, developed by the Algorithm Development team led by Bernardo Clavijo.
Ketan Paranjape, the general manager of Intel’s Life Sciences team, added: ‘Knowledge of plant and animal genomes can lead to breakthroughs in drug discovery, food safety, and more, helping us to better manage climate change, feed a growing population, and mitigate the impact of newly emergent diseases.
‘With the SGI UV 300 system, Intel Xeon Processor E7 v3 product family and Intel DC P3700 SSDs with NVMe, TGAC can now assemble large plant and animal genomes in record times that, until a few months ago, were impossible.’
TGAC is funded by the Biotechnology and Biological Sciences Research Council (BBSRC) and operates a national capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.
The Biotechnology and Biological Sciences Research Council (BBSRC) invests in bioscience research and training on behalf of the UK public. Funded by Government, BBSRC invested over £509M in world-class bioscience in 2014-15 and leads funding of wheat research in the UK, with more than £100M invested in UK wheat research in the last ten years.