During the opening press briefing for SC13, the international supercomputing conference in Denver, Colorado, professor Bill Gropp, the conference chair, remarked that supercomputing was going through a period of technological upheaval comparable to the introduction of CMOS a quarter of a century earlier. Several new processor technologies were becoming available, he said, but unlike the situation of a quarter of a century earlier: ‘We don’t have a CMOS – we have candidates, but nothing that has the maturity that you could bet your company on.’ Although the focus of his remarks was on new processors, technological – and commercial – changes are making their presence felt in the interconnects sector as well.
At the beginning of November, Cray snapped up the intellectual property relating to high-performance networking created by Gnodal, an interconnects company that had just gone into administration. Cray also recruited most of the company’s staff into its own R&D team. Yet a year earlier, in 2012, Cray had sold its interconnect hardware programme and associated patents to Intel, which also acquired QLogic’s InfiniBand business that year.
Meanwhile, at SC13 in Denver – later in November – Mellanox, the leading independent provider of InfiniBand interconnects, was proudly showing off the MetroX system, which it had announced at the end of October, to provide connectivity not within a compute cluster but for long-haul distances as great as 80km, from one data centre to another.
Also on show in Denver was a new supercomputer from nCore, running on RapidIO interconnects, while the Australian company Exablaze was debuting at the show, exhibiting its own ultra-low latency switches and network interface cards.
Because supercomputers are essentially clusters of commodity hardware, they make intense demands on the interconnects joining them up, and on the software that runs them. Network-intensive applications, such as networked storage and cluster computing, need an infrastructure with high bandwidth and low latency. As high-performance computing moves towards Exascale and as data centres become bigger and have to handle ever-larger fluxes of data, there is a market ‘pull’ that is shaping the future of interconnects. Equally, new technologies are appearing – or are being brought in from other areas of application – that are providing a technological ‘push’.
‘It’s all about moving data,’ says Barry Bolding, Cray’s vice president for storage and data management. In an interview at SC13, he said the company’s strategy was to focus on the high end in three areas: computing, storage, and analytics. ‘If we’re driving storage and analytics, then we’re moving data. Supercomputing has changed,’ he continued. ‘It’s not all about Flops but about data – moving data. This is the trend at SC13 – that is what I have heard a lot about.’ For the company’s XC and XE supercomputers, Cray had historically designed and used proprietary interconnects – Aries and Gemini – which could scale up efficiently with the size of the cluster. The company was selling ‘a tremendous amount’ of Cray interconnects in the market.
Cray decided to stop developing its own hardware for interconnects. It therefore sold its hardware programme to Intel because Cray recognised that Intel, as a pre-eminent chip manufacturer, has the fabrication capacity – and therefore the ability to produce hardware – that Cray itself lacked. However, Cray is still looking for ways to be innovative and differentiate itself in networking and interconnects. It therefore acquired Gnodal’s intellectual property, as encapsulated in patents but also as held in the skills and brains of the staff, but Gnodal’s switches and hardware were not included in the deal. Bolding stressed that, for Cray: ‘We have never changed our plan to differentiate on interconnects. We never got out of the network business.’ The company will focus on software, and the acquisition of Gnodal’s expertise is part of that strategy of differentiation by innovation, he said: ‘You need people who understand how data moves, to be able to innovate.’
He also pointed out that the company was involved with the OpenFabrics Alliance (OFA), which is critical to the future development of interconnects. The OFA develops, tests, licenses and distributes OpenFabrics Software (OFS) – multi-platform, high-performance, low-latency and energy-efficient open-source RDMA (remote direct memory access) software – and has recently released its latest roadmap.
In for the long haul
Mellanox too is interested in moving data – and over long distances. The company has positioned itself as the leading independent, ‘end-to-end’ supplier of high-performance interconnects. It offers everything from adapters, switches, and software to the physical cables and connectors themselves. But in response to the growth in physical size of data centres and to the need to transport data between centres, it announced, shortly before the Denver show, the expansion of its MetroX long-haul interconnects.
This time the focus is not just moving data within a data centre but across longer distances. MetroX systems enable high-performance, high-volume data sharing between distant sites at distances of between 1km and 80km. Over distances of 1km, its T6000 system offers a throughput of 640Gb/s, with a port density of 16 x FDR10 long haul, and a latency of 200ns (plus 5 microseconds/km over fibre). For longer distances of 80km, the throughput is 40Gb/s, with one port, and a latency of 700ns (plus the 5 microseconds/km over the fibre).
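To put those figures in context, the short sketch below adds the quoted fibre delay to the quoted system latency for each configuration, and divides the 1km throughput by its port count. It assumes the latencies are one-way figures that simply add, which the announcement does not spell out; the function and constant names are illustrative only and do not come from Mellanox.

```python
# Figures quoted above for MetroX; the additive one-way model is an assumption.
FIBRE_DELAY_NS_PER_KM = 5_000  # 5 microseconds per km over fibre


def one_way_latency_ns(system_latency_ns, distance_km):
    """System latency plus fibre propagation delay, in nanoseconds."""
    return system_latency_ns + FIBRE_DELAY_NS_PER_KM * distance_km


def per_port_throughput_gbps(total_gbps, ports):
    """Aggregate throughput shared evenly across the long-haul ports."""
    return total_gbps / ports


short_haul_ns = one_way_latency_ns(200, 1)   # 1km configuration
long_haul_ns = one_way_latency_ns(700, 80)   # 80km configuration

print(f"1km:  {short_haul_ns / 1000:.1f} microseconds")   # 5.2
print(f"80km: {long_haul_ns / 1000:.1f} microseconds")    # 400.7
print(f"Per port at 1km: {per_port_throughput_gbps(640, 16):.0f}Gb/s")  # 40
```

On these assumptions the fibre itself, not the equipment, dominates the delay at both distances, and each long-haul port at 1km carries the same 40Gb/s as the single port offered at 80km.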
Within computing centres, the number of installed Mellanox FDR InfiniBand systems tripled between November 2012 and November 2013. That period followed the company’s announcement, just over a year ago, of its Connect-IB scalable server and storage adapter, offering what it claimed was the first 100Gb/s interconnect adapter (using dual-port FDR 56Gb/s InfiniBand technology).
Although it is best known perhaps for its InfiniBand solutions, Mellanox is now also offering Ethernet, having announced its 56 Gigabit Ethernet product line in May this year. The 56GbE solution consists of Mellanox’s ConnectX-3 and ConnectX-3 Pro NICs, SwitchX-2 based SX1024, and SX1036 switches, QSFP+ cables, and acceleration and management software. It claims 40 percent more bandwidth than competing 40GbE solutions.
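The 40 percent figure is consistent with a straight comparison of nominal line rates, as the sketch below shows. This is an assumption about how the claim was derived, not something Mellanox has stated, and the function name is illustrative.

```python
def percent_more_bandwidth(new_rate_gbps, old_rate_gbps):
    """Percentage by which one nominal line rate exceeds another."""
    return (new_rate_gbps - old_rate_gbps) * 100 / old_rate_gbps


# 56GbE compared with 40GbE, using nominal rates only
print(percent_more_bandwidth(56, 40))  # 40.0
```

Nominal rates ignore encoding and protocol overheads, so the gain seen by an application may differ.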
On the choice between Ethernet and InfiniBand, Mellanox’s Brian Sparks said: ‘For us, it doesn’t matter much. In the end, it’s about what the applications need.’ If scalability and low latency are the concerns, then InfiniBand is the natural choice, he continued. He pointed out that InfiniBand also offers overall system efficiencies of up to 97 per cent, whereas Ethernet was around the 50 per cent mark.
In keeping with that focus on what a particular application needs, he pointed out that the company’s MetroX system offered not just distance but any RDMA interconnect that the user wanted. This means that users do not need to switch from one system for short distances to a different one for the longer haul: ‘You can stay native to InfiniBand without coming out,’ he said.
But in addition to mainstream players in high-performance computing, companies with backgrounds in other applications or other technologies are making their presence felt in the interconnect scene. One of the most demanding applications, in terms of moving data with ultra-low latencies, is high-frequency trading on the finance markets. Exablaze, the Australian company which was exhibiting for the first time in Denver, has its roots in that sector rather than in high-performance computing.
At the beginning of November, just before SC13, Exablaze claimed that its ExaNICX4 10 Gigabit Ethernet network interface card was the fastest available, capable of sending a 60-byte packet on the round trip from an application to network and back in less than a microsecond – about half the previous time taken. In Denver, the company was exhibiting both the ExaNICX4 card and its EXALINK50 50-port switch, for which it claimed a maximum data rate of 10Gb/s, a typical port-to-port latency of 3.45ns, and a typical fibre-to-fibre latency of 4.3ns.
In mid-November, nCore announced that it had sold the first of its Brown Dwarf supercomputers to the US Pacific Northwest National Laboratory (PNNL). The interconnects in Brown Dwarf use the RapidIO compute fabric, providing 20 Gb/s of point-to-point bi-directional RDMA between system nodes and 320 Gb/s of bandwidth for each four-node blade. RapidIO tends to be favoured in the communications and embedded systems market and one of the advantages claimed for the system is its energy efficiency.
So it is perhaps no surprise that the use of RapidIO interconnects in Brown Dwarf goes hand in hand with the fact that the system does not use conventional processors but rather Texas Instruments’ multicore system on a chip, called Keystone, which uses both ARM processors and TI’s own DSP cores. The announcement of the sale of Brown Dwarf to PNNL came shortly after the RapidIO Trade Association published the first draft of its reference design for the server, data centre and supercomputing markets. The association’s membership includes telecom and storage OEMs as well as FPGA, processor, and switch companies, reflecting that there are many application areas other than supercomputing that have an interest in this interconnect technology.
Interconnects at Exascale
Novel interconnects form an integral part of one European project to move supercomputing technology towards Exascale. The EU’s DEEP project combines a standard InfiniBand cluster using Intel Xeon nodes (cluster nodes) with an innovative, highly scalable ‘booster’ constructed of Intel Xeon Phi co-processors (booster nodes) and the Extoll high-performance 3D torus network. Both interconnects are coupled by Intel core booster interface nodes.
Extoll was spun out of the University of Heidelberg in 2011 and has been using FPGAs and ASICs to integrate the host-interface, network-interface controller, and network functions within a single chip. Its Tourmalet ASIC is in production and is expected to be available by the end of 2013. The intention is that the final hardware components of DEEP will use these Extoll ASICs.
Making light work of distance
At the IEEE’s International Electron Devices Meeting in Washington DC in December, IBM’s Dr Solomon Assefa announced a major advance in the ability to use light instead of electrical signals to transmit information for future computing. The company’s silicon integrated nanophotonics circuit integrates a photodetector and modulator fabricated side-by-side with silicon transistors on a single silicon chip using, for the first time, sub-100nm semiconductor technology.
The outcome of more than a decade of research, it will allow IBM ‘to move silicon nanophotonics technology into a real-world manufacturing environment that will have impact across a range of applications,’ according to Dr John Kelly, senior vice president and director of IBM research. The company claims that it has solved the key challenges of transferring silicon nanophotonics into the commercial foundry. By adding a few processing modules into a high-performance 90nm CMOS fabrication line, a variety of silicon nanophotonics components such as wavelength division multiplexers (WDM), modulators, and detectors can be integrated side-by-side with CMOS electrical circuitry. As a result, single-chip optical communications transceivers can be manufactured in a conventional semiconductor foundry, providing significant cost reduction over traditional approaches.
Mellanox too has shown an interest in silicon photonics. In the summer, it acquired Kotura, a developer of silicon photonics optical interconnect technology. According to Mellanox’s Brian Sparks, the interest in longer-distance transmission and the thirst for higher speeds, no matter what the distance, mean that optical fibre cables are being used more and more for interconnects. The Kotura purchase was one of two strategic acquisitions that demonstrated Mellanox’s interest in optics and silicon photonics as future technologies for interconnects. The company also bought the Danish company IPtronics A/S, which designs optical interconnect components. IPtronics’ current location in Roskilde, Denmark, is to serve as Mellanox’s first R&D centre in Europe, the company announced, while Kotura’s base in Monterey Park, California, will be its first R&D centre in the United States.
Reflecting this interest in optical transmission, fibre optic cable companies were very much in evidence at SC13 in Denver. Steffen Koehler, senior production line manager at Finisar, sees high-performance computing as a big growth area for the company. Although Finisar was displaying a broad portfolio of 10Gb/s, 14Gb/s and 25Gb/s technologies, supporting not only InfiniBand EDR and 100Gb/s Ethernet but also Fibre Channel, SAS, and PCIe, Koehler stressed the synergies between the products: ‘What is unique in our approach is that the building blocks are the same. We are vertically integrated. We have our own employees putting our products together. We have built on our VCSEL strength,’ he continued, by creating a 56Gb/s VCSEL, ‘which is the world’s fastest laser’. That VCSEL was at the heart of what he described as the company’s ‘big message’ at SC13, a 25Gb/s board-mounted optical assembly, which will be in full production in 2014, offering a low-power and high-density optical interface for supercomputing interconnects. He also offered an interesting perspective on the InfiniBand/Ethernet issue, saying that the requirements of InfiniBand QDR for speed had blazed the trail while Ethernet had brought the volume in demand: thus Ethernet had benefited in terms of speed from InfiniBand, while economies of scale developed as a result of Ethernet had benefited InfiniBand.
Samtec was demonstrating its FireFly ‘micro flyover system’. It is designed for placement on-board or on-package, providing interchangeable active optic engines or cheaper passive copper cable. Its name arises because the system is intended to allow cables to be routed in such a way that the data can ‘fly’ over the board, with the cables going over other components rather than having to be routed around them, so simplifying board layout and material choice. The optical version uses an 850nm VCSEL array and multi-mode fibres. FireFly is offered in 10Gb/s and 14Gb/s versions, but the 28Gb/s option has not yet been released.
Molex too was exhibiting its equipment to support high-speed data transmission, particularly its zQSFP+ intended to support next-generation 100 Gb/s Ethernet and 100 Gb/s InfiniBand enhanced data rate (EDR). The zQSFP+ transmits up to 25 Gb/s per serial lane and its SMT connector’s preferential coupling design is backward compatible with QSFP+ form factor modules and cable plugs. System components include SMT connectors and 1-by-n EMI cages and stacked 2-by-n connectors which are designed to accept advanced heat-sink systems that provide a high level of heat dissipation for next-generation system-power levels. In addition, Molex was offering a longer distance option with the zQSFP+ 100 Gb/s 2km long-reach silicon photonics active optical cable (AOC). Single-mode, long-reach silicon photonics AOCs will transmit up to 2km with significantly lower power consumption and cost than other long-reach optical solutions. Molex’s AOC pigtails allow for connections to single-mode cabling up to 2km, and for upgrades when next-generation bandwidth cables are introduced.
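The link between the 25 Gb/s per-lane figure and the 100 Gb/s standards is simple lane aggregation, sketched below. The lane count of four is an assumption based on the quad-lane QSFP form factor; it is not stated in Molex’s description, and the names used are illustrative.

```python
QSFP_LANES = 4  # assumption: quad-lane form factor, as the 'Q' in QSFP implies


def aggregate_rate_gbps(lanes, lane_rate_gbps):
    """Nominal aggregate rate of a module that stripes data across lanes."""
    return lanes * lane_rate_gbps


print(aggregate_rate_gbps(QSFP_LANES, 25))  # 100
```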
One simple but important issue in all this is to ensure interoperability and compliance among all the vendors. One example of how this can be achieved is the ‘plug fest’ organised every two years by the InfiniBand Trade Association (IBTA). Shortly before SC13, it released its latest Combined Cable and Device Integrators’ List, a compilation of results from the IBTA Plugfest 23, listing all products that passed the requirements of interoperability and compliance. One significant pointer to a faster future was the fact that cables supporting the Enhanced Data Rate (EDR) 100Gb/s InfiniBand standard were tested for compliance for the first time.
But even this may not be forward-looking enough, as Luxtera’s Brian Welch pointed out: ‘If you want to create a system, you want the infrastructure to last 10 years, not just the three-year cycle of server replacement. So you have to install with 400Gb/s in mind. Communications that are 30m to 40m long are difficult to change – so plan for the long term. Single-mode fibre is the future. It’s a long-term investment in infrastructure and silicon photonics is the best natural fit for the next generation of systems.’