Parallel programs need new maths
The drive to create exascale computers may force researchers and engineers to reverse the way in which they approach the task of writing software. They will have to start from parallelism and only then think of the mathematics, Mark Parsons, director of the UK’s Edinburgh Parallel Computing Centre (EPCC), told the PRACEdays15 conference in Dublin at the end of May.
He warned that two major Open Source software packages, OpenFoam and OpenIFS, would not scale to the massively parallel architectures of the future. ‘It will cost many man years to port the codes to exascale and massive parallelism by 2025,’ he said.
The EU-funded Cresta project, to review software and encourage co-design for exascale, had revealed that OpenIFS ‘is in no way appropriate for exascale,’ he continued. The situation for OpenFoam, which is one of the most widely used CFD codes, is worse, however. ‘It is not even a petascale code. It’s never going to get to exascale in its current form.’ In fact, Parsons said, the Cresta workers ‘gave up on’ OpenFoam.
OpenIFS is, technically, not open source despite the name, but perpetual licences for institutions, though not individuals, are available free of charge. The project is led by the European Centre for Medium-Range Weather Forecasts (ECMWF) to provide an easy-to-use, exportable version of the IFS system in use at ECMWF for operational weather forecasting. The community supporting the code had taken Cresta’s point to heart and were starting to look at how they could restructure the software.
Eric Chaput from Airbus underlined the importance of open-source software by saying that in future Airbus would be using open-source software as the basis for its engineering simulation work. It would not be relying on commercial software from the independent software vendors (ISVs) because of the cost of licences. Commercial software was too expensive, even for a company the size of Airbus, he explained: because most licensing models were ‘per user’, Airbus would have to pay for more licences than it was willing to fund.
In any case, according to Lee Margetts, lecturer in computational mechanics at the University of Manchester, the ISVs see their market as desktops and workstations, not HPC, let alone exascale. Reporting on the results of a survey of 250 companies conducted late last year for NAFEMS, the international organisation set up to foster finite element analysis, he told the meeting that ISVs were moving towards supporting accelerators such as Nvidia GPUs and the Intel Xeon Phi coprocessor. However, the idea of rewriting their code to port to FPGAs or ARM processors – or to cater for features important to exascale, such as fault tolerance and energy-awareness – was, for many ISVs, ‘not on their roadmap. Some vendors don’t know what it is,’ he said.
In this, the ISVs are simply following the market, according to Margetts: the survey revealed that the standard workstation and laptop are where most engineers do most of their work. The ISVs are not going to deliver exascale software, he continued, and hence exascale engineering software needs to be open source.
In Parsons’ view: ‘Hardware is leaving software behind, which is leaving algorithms behind. Algorithms have changed only incrementally over the past 20 years. Software parallelism is a core challenge. We need to see codes moving forward.’
Insatiable appetite for exascale
Despite these issues, it became very clear that there is an almost insatiable appetite for computing power and that some commercial users and many not-for-profit researchers are eagerly awaiting the advent of exascale machines. According to Eric Chaput’s presentation to a satellite meeting reviewing European exascale projects, the ultimate goal of Airbus is to simulate an entire aircraft on computer. Chaput is senior manager of flight-physics methods and tools at Airbus, and his remarks prompted one audience member to observe that even exascale would not be enough for Airbus, but rather it needed zettascale or beyond.
The requirement for ever more computing resources in Europe was powerfully reiterated by Sylvie Joussaume in the course of the panel discussion that concluded the conference. This time, the emphasis was on research for the public good rather than commercial benefit. Dr Joussaume, the chair of Prace’s Scientific Steering Committee in 2015, is a senior researcher within the French CNRS and an expert in climate modelling. She stressed that European climate researchers needed access to the next generation of the most powerful machines if they were to maintain their expertise in the subject.
The minds of many delegates had been concentrated by the US Department of Energy’s announcement that three next-generation systems are to be built as part of the Collaboration of Oak Ridge, Argonne, and Lawrence Livermore (Coral) by two consortia, headed by IBM and Intel respectively, at a cost of more than $425 million. This Coral procurement is intended to develop supercomputers that will leapfrog the international competition and open up the way to exascale machines.
However, Europe is a loose collaboration of individual nation states and the European Commission is not permitted a budget that would allow it to place heavily subsidised contracts with a European supercomputer vendor in the same way as the US Department of Energy appears to be free to do for US high-tech companies. Nonetheless, the head of the e-infrastructure unit at the Commission, Augusto Burgueño Arjona, told the meeting that a high-performance computing strategy was an essential building block to meet the aims of Europe’s Digital Single Market. There was a need to develop infrastructure for innovation combining Cloud, HPC, and big data and, he assured his audience, Prace was seen as a fundamental part of that.
Prace – the next five years
The European Commission’s policy will face its first significant test later this summer. Technically, Prace, the Partnership for Advanced Computing in Europe, is coming to the end of its first phase. It has allowed researchers access to some of the most advanced computing resources in the world, even if they happen to live and work in countries that have only very limited national computing facilities. Currently, four hosting countries – Spain, France, Germany, and Italy – give researchers from other European member countries run time on their premier national facilities.
Although everyone agrees that Prace has been a scientific success, some of the hosting nations feel that the current funding arrangements have been a little unfair. For the next phase, it was expected that the European Commission would provide funds to reimburse the hosting nations for at least some of the operational expenditure for Prace jobs running on the national machines. But it appears that this may not happen, so it may have to be arranged at the level of the Prace membership rather than the Commission. Funding the next phase of the project will therefore be tricky, especially if the arrangements are to be transparent to all.
Sanzio Bassini, chair of the Prace council, was asked about the transition to Prace 2.0. He acknowledged that ‘one of the most pressing issues is the participation on all members of the association to the operational costs of the infrastructure, and the sustainability of the model moving forwards.’
He raised several options, such as basing contributions to the Prace infrastructure on each country’s GDP. Another option would be to subsidise these HPC resources by selling a certain portion of the computational cycles to industry.
Alternatively, some combination of these strategies could be employed in tandem, reducing the burden on the participating countries that provide the computational muscle to Prace.
A European approach to buying supercomputers?
Bassini remarked that to ‘maintain Europe as a world class contributor in science’ governments and organisations like Prace ‘must ensure that they can offer access to leading HPC systems.’ He assured delegates that: ‘Prace aims to offer at least one system in each architectural class.’
Currently, Prace is dependent on HPC infrastructure bought and owned by EU member states. However, there was a growing recognition, Burgueño Arjona suggested, that the member-state/individual approach might not be enough to create a European approach that was both effective and economic.
The Commission would be monitoring the HPC market and R&D landscape in Europe in the course of this year and would report to the Council of Ministers and the European Parliament by the end of 2015 on the steps that should be taken after that. In addition, the European Strategy Forum on Research Infrastructures (ESFRI) was being invited to look at the issue and to propose ways of better coordinating the investments being made by individual member states.
Research and innovation in Europe needed world-class computing capability, Burgueño Arjona assured his audience. The European Commission needed to be careful not to be seen to be favouring one commercial company over another (something that does not appear to trouble the USA in its procurement of the Coral project), but Burgueño Arjona did point out that some 700 million euros would be available through the EU’s Horizon 2020 research programme for public-private partnerships with the commercially led European Technology Platform for HPC (ETP4HPC).
Japan supports HPC for industry
Despite very different political structures, both Europe and Japan have come to very similar conclusions about how best to improve access to high-performance computing (HPC): HPC resources must be shared and not monopolised by the individual owners of the computer systems themselves. And just as Europe has set up Prace, so Japan has created the Research Organisation for Information Science and Technology (RIST) to coordinate that process of sharing.
However, in one significant area the two regions differ strongly in their policy, according to Masahiro Seki, president of RIST. In Japan, compute cycles on the country’s foremost HPC systems are offered to industry to conduct commercially sensitive work without the industrial partners having to publish the results of the project openly, whereas the pan-European access system requires publication.
Both Prace and RIST offer computational cycles to industry, but RIST offers a much better deal to those concerned about protecting sensitive information – a key selling point for many industrial users. In certain circumstances, industrial users of RIST do not need to publish the entirety of their results; instead they can keep their own IP safe from public view.
A second point of difference is the degree of support for users that has been put in place by RIST. Seki said: ‘RIST provides 17 scientific consultants, with seven consultants in Kobe, six in Tokyo. The Tokyo office is set up for the support of industrial users.’ By providing consultants, RIST offers expertise to groups of users that may understand their chosen field very well but may not have the skills and knowledge required to operate HPC systems. Because RIST provides knowledge in areas such as code optimisation, industrial users can concentrate on the specific challenges that face them without having to worry about the HPC systems themselves.
HPC’s benefits to industry in small countries
The host nation, Ireland, used PRACEdays15 to demonstrate the benefits that its own industries can realise by using HPC. Jean-Christophe Desplat, Director of the Irish Centre for High-End Computing (ICHEC), stressed the importance of international cooperation to further European supercomputing and its role in economic development.
Nonetheless, he warned that ‘computer modelling is still struggling to be accepted as a mandatory and cost-effective methodology in some countries.’ The attitude appeared to be, sometimes, that it was all very well for large companies such as Boeing in the USA, but organisations such as Prace and his own ICHEC had to make HPC more relevant to smaller companies with smaller budgets.
He particularly stressed the importance of the ‘P’ in Prace’s title: as a Partnership for Advanced Computing in Europe. Although Ireland is one of the smaller European countries, it is able to make a positive contribution, he maintained: ‘Excellence does not know borders. Budget, computer size, and head-count are not indicators of excellence. Just because your system is not in the Top500, does not mean you do not have excellent people in these countries or organisations.’
Among the projects that ICHEC has worked on in partnership with Prace was one from Tullow Oil, a multinational oil and gas exploration company founded in Tullow, some 35 miles south of Dublin in Ireland, but now with its corporate headquarters in London. It relies on seismic imaging to locate oil supplies buried deep in rocks under the sea and on land. Sean Delaney, a computational physicist at Tullow Oil, told a special session of PRACEdays15 on ‘HPC in Ireland’ that a lot goes into the infrastructure behind seismic imaging in addition to the complex algorithms and physics. Delaney said: ‘Each boat has basically a small HPC centre on board, constantly monitoring things and making sure that the data is coming in as expected in addition to performing some initial processing and analysis.’
This process generates large amounts of data, and the growth in demand for computational resources almost always outstrips increases in budgets, so the job of improving the performance of the software falls to computational physicists like Delaney. Delaney said: ‘We have lots and lots of data, and computers just can’t get fast enough as far as the oil industry is concerned. The more horse power we can get our hands on, the more we can use at any given time. Small things matter, small improvements in resolution really do make a difference.’ The team at Tullow has introduced vectorisation and improved the parallelisation of the code generally, leading to a six-fold speed-up.
NSilico provides data management and analytics software for the life sciences and healthcare industries. Its flagship program, Simplicity, is a cloud-based system for the automatic annotation, analysis, and visualisation of genetic data. Based in the city of Cork, in the south of Ireland, NSilico last year took part in the Prace SHAPE programme, which encourages small and medium companies to use HPC to help develop their business. It partnered with CINES in France and ICHEC to develop a technique for rapid alignment of short DNA sequences.
NSilico uses the Smith-Waterman algorithm to determine similar regions between two strings of nucleotide or protein sequences. According to Brendan Lawlor, a software architect from NSilico: ‘Smith-Waterman is a data-dependent algorithm largely because some cells in the matrix may rely on the results of others.’ NSilico has streamlined the code, rewriting the Smith-Waterman algorithm component into only 1,000 lines of Scala code and thus reducing opportunities for inefficiencies. However, the team has so far only scaled the code across three cores, so they are just beginning investigations into how to scale across larger systems.
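The data dependency Lawlor describes can be seen in a minimal textbook sketch of Smith-Waterman scoring. This is not NSilico's Scala implementation: the function name, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -1) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Textbook formulation with a linear gap penalty; the scoring values
    are illustrative defaults, not taken from any particular tool.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Each cell depends on its upper-left, upper, and left neighbours,
            # which is the data dependency that complicates parallelisation.
            diagonal = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap
            left = h[i][j - 1] + gap
            h[i][j] = max(0, diagonal, up, left)
            best = max(best, h[i][j])
    return best


print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Because a cell needs only its three neighbours above and to the left, cells on the same anti-diagonal do not depend on one another. Parallel implementations commonly exploit this, although the article does not say which approach NSilico takes.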
Supercomputing for small companies made simpler?
The most fragile part of the HPC ecosystem is the network of small firms, spin-out companies and open source software service companies. Often they are geographically dispersed and lack critical mass – so there could be a role for Prace as an ‘incubator’ for such companies and for the ‘early adopters’ of HPC among small to medium enterprises (SMEs), Lee Margetts, lecturer in computational mechanics at the University of Manchester, told the meeting.
A special session of the meeting was devoted to demonstrating how these smaller companies in the HPC ecosystem could encourage and support the wider use of high-performance computing both in industry and also in the public sector.
In a hugely dynamic and enthusiastic presentation, Stefano Cozzini, CEO of eXact Lab in northern Italy, offered some ideas and practical examples. His company exists to ease access to high performance computing for SMEs and for the public sector as well.
It started as an HPC consultancy, but after three years in business, Cozzini and his co-founders realised that consultancy ‘does not scale’ and that they needed a scalable solution to provide their services. This led to the eXact computing environment, targeting SMEs whose HPC needs still have to be properly identified. These could range from CFD to rendering for media companies, he said.
One fruit of their labours is XeRis – a cloud platform for advanced analysis of seismic hazards with a web-based interface. Among the users have been a Swiss nuclear power plant and the Government of Trieste in Italy, which is using the service to assess the seismic safety of schools in the region. (As the destruction of Assisi due to two devastating earthquakes that shook Umbria in September 1997 testifies, earthquakes are a real risk in Italy.)
But Cozzini admitted that it was a challenge to find enough people who were trained in high-performance computing and to establish networks of small and medium-sized companies that could provide the expertise needed. So, the University of Trieste is running a Master’s programme in high-performance computing; and in early 2012, a group of Slovenian and Italian companies came together to form the High Performance and Cloud Computing Cross-Border Competence Consortium (HPC5) to provide businesses, researchers, and universities with advanced HPC and cloud computing services.
Manuel Arenaz, from the Spanish company Appentra Solutions, reminded his audience that ‘programming supercomputers is hard’. His company has therefore developed a software tool, called ‘Parallware’, to find coarse-grain parallelism in sequential source code automatically, without intervention by the programmer. Speaking by video link as he was unable to attend the session in person, he presented Parallware as another way of addressing the HPC ‘talent gap’ because it should allow engineers and scientists to focus on their science and engineering and decouple that task from the details of the underlying parallel hardware.
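To make ‘coarse-grain parallelism’ concrete, here is a small hand-written sketch of the kind of transformation involved: a sequential loop whose iterations are independent is split into whole units of work that can run concurrently. It does not show Parallware or its output; the function names and the stand-in workload are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_case(x):
    # Hypothetical stand-in for an expensive unit of work that does not
    # depend on any other case.
    return sum((x + k) ** 2 for k in range(1000))


def run_sequential(cases):
    # Original form: one case after another.
    return [simulate_case(x) for x in cases]


def run_coarse_grained(cases, workers=4):
    # Coarse-grain form: each whole case is handed to a worker.
    # Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_case, cases))


print(run_coarse_grained(range(8)) == run_sequential(range(8)))
```

Threads are used only to keep the sketch self-contained. In CPython, CPU-bound work would normally go to separate processes, or to compiled code, to gain real speed-up. The point is that the decomposition is safe only because the iterations share no data, which is the property a tool has to prove before parallelising a loop.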
About the authors
Dr Tom Wilkie is the editor for Scientific Computing World.
You can contact him at firstname.lastname@example.org.
Find us on Twitter at @SCWmagazine.
Robert Roe is a technical writer for Scientific Computing World and Fibre Systems.
You can contact him at email@example.com or on +44 (0) 1223 275 464.