Irish HPC at the fore in Dublin for PRACEdays15
In the second of our reports from the PRACEdays15 conference, held in Dublin last week, Robert Roe looks at the role of HPC in the host country, Ireland.
Ireland sees HPC as a key tool that can invigorate industry and attract new business to the country, but it must make efficient use of its resources and partner with outside organisations such as Prace if its industry is to compete on a global scale.
PRACEdays15, which was held last week in Dublin, not only showcased European HPC research facilitated by the Partnership for Advanced Computing in Europe (Prace) but it also offered a platform where the host nation, Ireland, could demonstrate the benefits that its own industries can realise by using HPC.
HPC in a small country
Jean-Christophe Desplat, Director of the Irish Centre for High-End Computing (ICHEC), stressed the importance of international cooperation to further European supercomputing and its role in economic development. Nonetheless, he warned that ‘computer modelling is still struggling to be accepted as a mandatory and cost-effective methodology in some countries.’ The attitude, at times, appeared to be that HPC was all very well for large companies such as Boeing in the USA, but that organisations such as Prace and his own ICHEC had to make the technology relevant to smaller companies with smaller budgets.
He accepted that there was still a lot to be done to achieve this, both in Ireland and in Europe. The uneven state of preparedness across Europe for the application of HPC to economic development reflected past investments and history. He welcomed the activities of Prace as a way of helping to rectify this, and particularly stressed the importance of the ‘P’ in Prace’s title: it is a Partnership for Advanced Computing in Europe. Although Ireland is one of the smaller European countries, it is able to make a positive contribution, he maintained: ‘Excellence does not know borders. Budget, computer size, and head-count are not indicators of excellence. Just because your system is not in the Top500 does not mean you do not have excellent people in these countries or organisations.’
ICHEC was founded in 2005 as Ireland's national high performance computing centre. Its mission is to provide HPC resources, support, education and training for researchers in Irish institutions and also to support Irish industries and contribute to the development of the Irish economy.
Some of the fruits of the work supported by ICHEC in partnership with Prace were discussed in a special session during the Prace event, entitled ‘HPC in industry in Ireland’.
Tullow Oil is a multinational oil and gas exploration company founded in Tullow, some 35 miles south of Dublin in Ireland, although its corporate headquarters are now in London. The company relies on seismic imaging to locate oil deposits buried deep in rock under the sea and on land. In its simplest form, the technique involves propagating sound waves through a substrate and recording the time it takes for each wave to bounce back into geophones (specialised microphones), providing an indication of the materials the wave passed through.
Sean Delaney, a computational physicist at Tullow Oil, told a special session of PRACEdays15 on HPC in Ireland that a lot goes into the infrastructure behind seismic imaging, in addition to the complex algorithms and physics that must be accommodated in simulating the data produced. Delaney said: ‘Each boat basically has a small HPC centre on board, constantly monitoring things and making sure that the data is coming in as expected, in addition to performing some initial processing and analysis.’
This process generates large amounts of data, which may be analysed multiple times depending on the resolution needed and the complexity of the underlying geology, so small improvements in the data-analysis process can translate into gains in revenue.
One way to do this is to increase the computational resources, but this brings diminishing returns as costs rise, reducing the overall return on investment. In reality, the growth in demand for computational resources almost always outstrips increases in budgets, so the job falls to computational physicists like Delaney to improve the performance of the software. Software engineers can accomplish this through further parallelisation of the code, or by streamlining it to make better use of existing resources.
Delaney said: ‘We have lots and lots of data, and computers just can’t get fast enough as far as the oil industry is concerned. The more horsepower we can get our hands on, the more we can use at any given time. Small things matter; small improvements in resolution really do make a difference.’
The team at Tullow used a variety of methods to speed up its algorithms, implementing them with OpenMP, which allowed the team to force vectorisation and generally improve the parallelisation of the code. Overall this led to a six-fold speed-up over the old code, with more than 80 per cent of the operations now vectorised.
More efficient genomics
NSilico provides easy-to-use data management and analytics software for the life sciences and healthcare industries, and it too is interested in speeding up the processing of large amounts of data but, in NSilico’s case, the focus is on genomic sequencing data. Its flagship program, Simplicity, is a cloud-based system for the automatic annotation, analysis, and visualisation of genetic data. The company is based in the city of Cork, in the south of Ireland.
Last year, NSilico took part in the Prace SHAPE programme, which encourages small and medium-sized enterprises to use HPC to help develop their business. In that project, NSilico partnered with CINES in France and with ICHEC to develop a technique for the rapid alignment of short DNA sequences.
Brendan Lawlor, a software architect at NSilico, highlighted the falling cost of genomic analysis, but pointed out that this is creating difficulties: the amount of data generated is rising exponentially, putting a large burden on the technology, both software and hardware, that is available for genomic sequencing today.
Lawlor said: ‘While Moore was formulating his law decades ago, Amdahl described mathematically what I think we can all appreciate with our intuition -- that it does not matter how many cores you have in your CPU, and how many boxes you have in your network, if the software that you are running can only do one thing at a time.’
In Lawlor’s view, the steady rise in processor performance under Moore’s Law has shielded software developers from having to improve their algorithms or think carefully about software design. He said: ‘Now the difficulty is -- and I am speaking as a software engineer -- the software development community is a child of Moore’s law and not Amdahl’s, so effectively we have been getting a free ride for decades as our software gets faster without us having to do anything.’
He explained that software developers have lagged behind and now have to learn to make the most of the hardware available to them.
NSilico uses the Smith-Waterman algorithm to determine similar regions between two strings of nucleotide or protein sequences. Instead of looking at the total sequence, the Smith-Waterman algorithm compares segments of all possible lengths and optimises the similarity measure.
Lawlor stated that ‘Smith-Waterman is a data-dependent algorithm, largely because some cells in the matrix may rely on the results of others.’ So far, NSilico has streamlined the code, improving its performance by rewriting its Smith-Waterman implementation in only 1,000 lines of Scala and thus reducing opportunities for inefficiency. However, the team has so far scaled the code across only three cores, so it is just beginning to investigate how the code will scale across larger systems.
With its programme to encourage the wider use of HPC among smaller industrial companies, Prace is fostering economic development through cooperation and partnership in Europe. As the barriers to the use of HPC are lowered, more companies will be able to take advantage of this powerful technology, but they need the right infrastructure in place to develop algorithms and to educate new users in the skills needed for highly parallelised computing.
This is the second in a series of reports from PRACEdays15, held in Dublin last week. On a similar theme, Tom Wilkie writes about how supercomputing for small companies can be made simple. The reasons why parallel programs need new maths are explained by Tom Wilkie in a further report from the conference. In addition, European supercomputing policy was set out at PRACEdays15, prompting the question of whether a European company will build Europe's first exascale computer. In the final report, Robert Roe compares and contrasts the support given for HPC in Japan and in Europe.