The future of HPC in Australia
Australia, with a population smaller than Beijing’s but with the 12th-largest economy in the world, has traditionally punched above its weight in the international research stakes. Australia’s challenge now is to translate this research capacity and economic power more strongly into innovation and wealth creation, in a world in which, increasingly, ‘to outcompute is to outcompete’.
Challenges also arise from Australia’s tenure of a resource-rich, island continent and a climate that experiences severe weather events including cyclones, floods, and bushfires. The realisation of national benefits and economic outcomes, and the management of environmental risks, demand continental-scale simulation and data-management capabilities, which, in turn, require investments in world-class research infrastructure and skills.
High-performance computing is consequently a crucial element in advancing research, economic competitiveness, and our national wellbeing, with the unique challenges of the Australian environment providing a significant and particular focus for the current development of a national e-infrastructure strategy.
Australia’s HPC landscape: history and current status
The use of supercomputers in Australia can be traced back to the late 1960s. It was led by two national agencies, the Bureau of Meteorology (BoM) and the national science agency, CSIRO, and driven by the need for numerical weather prediction and computational support for research. Since then, these two agencies have operated separate and shared facilities, which have included IBM, CDC, Cray, Fujitsu, and NEC systems, and most recently an Oracle Constellation cluster.
Australia’s university community lagged in its uptake of HPC until 1987, when the Australian National University (ANU), in Canberra, established its own supercomputing facility, commencing with Facom vector systems. The absence of a national strategy for research HPC was addressed in the late 1990s, leading to the establishment, at ANU, of the Australian Partnership for Advanced Computing (APAC), which would later evolve into the National Computational Infrastructure (NCI). Since 2000, the computational capability at APAC/NCI has risen more than 1,000-fold, from a one-teraflop HP AlphaServer, through SGI and Sun/Oracle systems, to the current Fujitsu Primergy system, Raijin, commissioned in 2012, which has 57,472 Intel (Sandy Bridge) cores and a performance of 1,200 teraflops.
In 2014, there are three petascale HPC systems in Australia: Raijin at NCI (www.nci.org.au); a 35,000-core, 1,500-teraflop Cray XC30, Magnus, at the Pawsey Supercomputing Centre (www.ivec.org) in Perth, Western Australia (WA); and a 65,536-core, 840-teraflop IBM Blue Gene/Q at the Victorian Life Sciences Computation Initiative (VLSCI, www.vlsci.org.au) in Melbourne. These are augmented by tier 2 facilities at CSIRO, including a substantial GPU cluster, a dedicated system for radio astronomy research at the Pawsey Centre, and smaller systems in some national agencies and university consortia. In 2015–16, the petascale systems will be joined by a new operational meteorological services facility for BoM.
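As a rough cross-check of the figures quoted above, the short Python sketch below totals the stated performance of the three systems and derives a per-core figure for each. The core counts and teraflop values are taken directly from the text; the dictionary layout and variable names are illustrative only, and because the article does not say whether the teraflop figures are peak or sustained, the per-core numbers should be read as indicative rather than as benchmarks.

```python
# Core counts and teraflop figures as quoted in the article (2014).
# Whether these are peak or sustained figures is not stated, so the
# derived per-core values are indicative only.
systems = {
    "Raijin (NCI)": {"cores": 57_472, "tflops": 1_200},
    "Magnus (Pawsey)": {"cores": 35_000, "tflops": 1_500},
    "Blue Gene/Q (VLSCI)": {"cores": 65_536, "tflops": 840},
}

# Combined quoted performance of the three systems, in teraflops.
total_tflops = sum(s["tflops"] for s in systems.values())

# Gigaflops per core: 1 teraflop = 1,000 gigaflops.
gflops_per_core = {
    name: s["tflops"] * 1_000 / s["cores"] for name, s in systems.items()
}

for name, value in gflops_per_core.items():
    print(f"{name}: {value:.1f} gigaflops per core")
print(f"Combined: {total_tflops} teraflops")
```

The spread in per-core figures reflects the different processor designs involved, which is one reason core counts alone are a poor guide to a system's capability.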
NCI and Pawsey are the national systems, funded by Australia’s National Research Infrastructure Strategy, while VLSCI is a specialist life sciences facility, established for Victorian researchers and funded by that State Government. The investments in NCI and Pawsey are correspondingly prioritised, towards climate/weather science and the environment for NCI, and the geosciences and the Square Kilometre Array project for Pawsey. While each national facility also supports the gamut of scientific and technological research, this investment prioritisation is likely to continue, and be amplified by expectations of increasing usage by industry.
The current distinctive environment
Perhaps what most distinguishes NCI and Pawsey among international petascale facilities is the funding model, in which all recurrent operating costs must be met by co-investment – a requirement of the government’s funding programme. This has shaped the partnerships that support these facilities, the focus of activities, and the access models, which combine partner shares and merit-based access, and also reflect investment priorities. Today, institutional co-investment is seen as synonymous with the value of government infrastructure investments. NCI and Pawsey respectively generate annual co-investment incomes of around $11M and $8M, from partnerships that include national science agencies, research-intensive universities, and in the case of Pawsey, the Western Australian State Government.
Co-investment at such levels engenders strong governance, and expectations that operations are of research-production standard, and services are comprehensive and integrated, in alignment with the data intensity of the investment priorities, and the data richness of partner institutions. Accordingly, the NCI and Pawsey systems, in addition to their role in computational simulation, are integral parts of high-performance data processing and analysis pipelines – a situation that differs markedly from most supercomputer centres that focus primarily on simulation and modelling. For example, the processing of data collected from precursor telescopes for the Square Kilometre Array (SKA) project will utilise 25 per cent of the Pawsey infrastructure on a dedicated 300+ teraflops system. The situation at NCI is similar, in that climate simulations are combined with observational data for analysis using both the supercomputer and a specialist OpenStack cloud of supercomputer specification. In contrast, the VLSCI facility is highly focused on simulation.
Confronting our future
Natural disasters and economic outcomes
Australia occupies a continental landmass only slightly smaller than the continental USA, with approximately 10 per cent of its population living within three kilometres of the coast. Australia’s latitude and climate make it prone to severe weather events such as tropical cyclones, floods and bushfires. Such events occur frequently during the summer season. Indeed, they happened concurrently in the summer of 2010–11, placing the resources of BoM under considerable pressure due to the combination of severe flooding in the eastern states and bushfires in Western Australia, compounded by the devastation of Tropical Cyclone Yasi – a storm more powerful than Hurricane Katrina. The imperative that BoM handle extreme situations with better localised forecasts and more reliable weather prediction is driving upgrades to its model suite to operate at higher resolution, and a substantial upgrade to petascale HPC capabilities by 2015–16.
The social and economic benefits that can be realised are significant, not only through the better management of severe weather risk, but also from the capacity to undertake seasonal forecasting. Exemplifying the former is the potential mitigation, through more skilful weather forecasting, of losses of between 50 million and 100 million dollars per day, due to port closures and lost production, when cyclone warnings are in place in Australia’s northwest.
Australia’s weather and climate modelling suite, ACCESS, is a coupled model capable of operating over time scales that range from hours, for severe weather events, through days for weather forecasting and months for seasonal prediction, to decades and centuries for climate variability and change. A critical role is played by the NCI facility, which serves as the development platform for the next generations of ACCESS, driven by a collaboration between BoM, CSIRO, and the academic community. Work is concurrently underway, through a collaborative project involving NCI, BoM, and Fujitsu, to optimise the scaling and I/O performance of ACCESS and to prepare it for the many-core processors of the future.
Highly accurate predictions of cyclone paths will have significant economic benefits for industry and human safety, which will be further enhanced when coupled with flood inundation modelling based on land topography. This latter part of the equation is already in place through ANUGA, an open source package developed by ANU and Geoscience Australia (GA) for modelling the impact of hydrological disasters such as dam breaks, riverine flooding, storm surges, and tsunamis. These capabilities, when further coupled with major earth observation collections (see Fig. 3), such as GA’s Data Cube, which holds decades of images of Australia taken by NASA’s Landsat satellites, will provide the ability to model flood runoff and the recharging of aquifers, with economic benefits for agriculture.
Supporting the SKA project
The Square Kilometre Array (SKA), one of the largest scientific endeavours in history, is a $2.3 billion international project to build a next-generation radio telescope in South Africa and Australia that will help scientists answer fundamental questions about the origins of the universe, such as how the first stars and galaxies were formed. It will be 50 times more sensitive and able to survey 10,000 times faster than today’s most advanced telescopes, and is expected to generate one exabyte of data per day.
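To put the quoted data volume in perspective, the following Python sketch converts one exabyte per day into a sustained data rate. It assumes a decimal exabyte (10^18 bytes) and a perfectly even flow across the day, neither of which is stated in the text; the function and constant names are illustrative.

```python
# Assumption: a decimal exabyte (10**18 bytes); the article does not
# say whether a decimal or binary unit is meant.
EXABYTE = 10**18
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_bytes_per_s(bytes_per_day: float) -> float:
    """Average rate in bytes per second, assuming an even flow all day."""
    return bytes_per_day / SECONDS_PER_DAY


rate = sustained_rate_bytes_per_s(EXABYTE)
terabytes_per_s = rate / 1e12      # decimal terabytes per second
terabits_per_s = rate * 8 / 1e12   # decimal terabits per second

print(f"{terabytes_per_s:.1f} TB/s, or about {terabits_per_s:.0f} Tbit/s")
```

Under these assumptions the average works out to more than eleven terabytes every second, which gives a sense of why the processing and networking requirements of the project are so demanding.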
The Pawsey Supercomputing Centre is one of the 20 members of the SKA Science Data Processing Consortium, responsible for designing the hardware and software to analyse, process and visualise the data produced by the SKA. The Pawsey Centre is under consideration as the primary site for processing and storing the data from the Australian component of the SKA. The data produced will be too large to store for any reasonable period of time, and so must be managed in real time, necessitating immense processing power (see Big Data needs networks, page 14).
Two precursor projects to the SKA, the Australian Square Kilometre Array Pathfinder (ASKAP) and the Murchison Widefield Array (MWA), were launched in late 2012 and serve as important technological demonstrators. The MWA telescope has been in full operation since 2013, and data from the first antennas of the ASKAP project are already being processed by the real-time Cray supercomputer at the Pawsey Supercomputing Centre.
Managing the diversity of HPC requirements
The workloads of Australia’s petascale systems are perhaps atypical of facilities elsewhere, in that they need to marry computational capability with data-intensive capacity, and to reconcile the requirements of merit-based access for university researchers with the R&D and service requirements of national agencies. The relative lack of tier 2 systems in Australia imposes workloads on the tier 1 systems that otherwise would be handled in alternative ways. Accordingly, the tier 1 facilities serve the gamut of research – pure, strategic, applied, and industry – providing the platform from which Australian researchers maintain international competitiveness, and the science agencies undertake research that delivers national benefits.
There are also significant legacy workloads, that is, applications that are no longer of supercomputer class. Accordingly, there is a need to migrate poorly scaling tasks onto a more suitable platform, both to make more effective use of the supercomputer and to give researchers greater opportunities through access to more advanced tools and methods that are better suited to it. From a system point of view, the cloud now provides a significant opportunity. It is also possible that ultimately the cloud may present a threat, but for as long as tier 1 HPC facilities are valued and serving as crucial platforms for research collaboration, the threat is some distance away.
Addressing the skills shortage
The value of national investments in HPC depends as much on soft infrastructure (skills, software capability etc.) as it does on hardware. At this time, none of the current tier 1 facilities in Australia is equipped with accelerator technology, reflecting the primary usage drivers, but also a skills gap and insufficient capacity with which to migrate the user base towards methods that exploit accelerators. The next generation of procurements must inevitably include an accelerator component if performance gains are to be realised at reasonable cost for both procurement and operations. It will be a ‘Goldilocks’ decision, however, since too small a fraction will see Australia’s competitive position weakened, while too great a fraction may be a waste if the user base is unable to take advantage of the capability. Australia is thus at a crossroads, in that the potential of future procurements may not be realised in the absence of a much heightened investment in skills and software development capability.
While there are benefits arising from a requirement for co-investment, its focus on recurrent operations has limited the building of a significant software development capability of the type required to carry the country into the high-petascale and exascale eras. While there is significant work underway to reengineer major community codes to scale more highly, and to be prepared for many-core architectures, the needs outstrip both the financial resources and the in-country skills base to undertake the work. There has been investment in software infrastructure that exploits accelerator technologies in specific areas, such as the data processing applications used for radio astronomy, but these focused efforts do not translate easily to the wider research community across multiple science domains.
Australia will need to prioritise an investment in a national computational sciences capability, of the kind found in major national laboratories and centres, and one that is closely linked with the university sector – to build skills at undergraduate and postgraduate level, with research studies being intimately linked with the work of science agencies and industry.
Australian investments in HPC over the past decade or more have had a substantial transformative effect on national research and innovation. University researchers in all fields are increasingly dependent on HPC access for the international competitiveness of their work, as are the national science agencies in working towards nationally beneficial outcomes in priorities such as the environment, resources, future technologies, and health. Industry uptake is rising. Indeed, one start-up from university IP, which was sold recently to a US multinational for $76M, would not have gained a foothold had it not been for national investments in HPC.
With research becoming increasingly data-driven and data-intensive, the convergence of computation and data in comprehensive and strongly integrated environments seems to be the right model for Australia, given its relative ability to support internationally competitive facilities. In the currently tight economic environment, the effectiveness of, and outcomes from, the current investments will be closely examined, with future investments likely to factor in the potential to contribute to economically beneficial and measurable outcomes. For this to be realised, the investments in hardware and soft infrastructure will need to be balanced, and implemented efficiently in order to realise economies of scale.
Australia’s national high-end, integrated facilities have evolved to become a valued collaborative research platform, with academic communities, science agencies, and industry now so dependent on these services that there is ‘no way back’. It is essential, therefore, to translate the provision of such services from the status of a ‘four-year research grant’ into the mainstream, in order to secure the skills base, the access, and the quality of services. It is these, and related matters, that are presently occupying the attention of policy makers and research institutions alike as Australia develops its blueprint for its national e-infrastructure through to 2020.