Cuda 5

Nvidia has released Cuda 5, a new version of its parallel programming model for GPUs. It can be downloaded for free from the Nvidia Developer Zone website.

With more than 1.5 million downloads, supporting more than 180 leading engineering, scientific and commercial applications, the Cuda programming model is the most popular way for developers to take advantage of GPU-accelerated computing. Cuda 5 introduces features that make the development of GPU-accelerated applications faster and easier: support for dynamic parallelism, GPU-callable libraries, Nvidia GPUDirect technology support for RDMA (remote direct memory access), and the Nvidia Nsight Eclipse Edition integrated development environment (IDE).

Key features include:

  • Dynamic Parallelism, whereby GPU threads can dynamically spawn new threads, allowing the GPU to adapt to the data. By minimising the back and forth with the CPU, dynamic parallelism greatly simplifies parallel programming, and it enables GPU acceleration of a broader set of popular algorithms, such as those used in adaptive mesh refinement and computational fluid dynamics applications.
  • GPU-callable libraries, starting with a new Cuda BLAS library that allows developers to use dynamic parallelism in their own GPU-callable libraries. They can design plug-in APIs that allow other developers to extend the functionality of their kernels, and implement call-backs on the GPU to customise the functionality of third-party GPU-callable libraries. The “object linking” capability provides an efficient process for developing large GPU applications.
  • GPUDirect support for RDMA, which enables direct communication between GPUs and other PCI Express devices, and supports direct memory access between network interface cards and the GPU. It also significantly reduces MPI send/receive latency between GPU nodes in a cluster and improves performance.
  • Nvidia Nsight Eclipse Edition enables programmers to develop, debug, and profile GPU applications within the familiar Eclipse-based IDE on Linux and Mac OS X platforms. An integrated Cuda editor and Cuda samples speed the generation of Cuda code, and automatic code refactoring enables easy porting of CPU loops to Cuda kernels. An integrated expert analysis system provides automated performance analysis and step-by-step guidance to fix performance bottlenecks in the code, while syntax highlighting makes it easy to differentiate GPU code from CPU code.
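The dynamic parallelism feature described above can be sketched in a few lines of Cuda C. This is a minimal illustration, not production code: it assumes a compute capability 3.5 or later GPU and compilation with relocatable device code enabled (for example, `nvcc -arch=sm_35 -rdc=true -lcudadevrt`).

```cuda
#include <cstdio>

// Child kernel: launched from the GPU, not from the host.
__global__ void child_kernel(int parent_block)
{
    printf("child of block %d, thread %d\n", parent_block, threadIdx.x);
}

// Parent kernel: one thread per block spawns a child grid directly,
// with no round trip to the CPU. In a real application the child
// grid's size could depend on data the parent has just computed --
// that is the point of dynamic parallelism.
__global__ void parent_kernel()
{
    if (threadIdx.x == 0) {
        child_kernel<<<1, 4>>>(blockIdx.x);  // device-side launch
        cudaDeviceSynchronize();             // wait for the child grid
    }
}

int main()
{
    parent_kernel<<<2, 32>>>();
    cudaDeviceSynchronize();  // host waits for the whole tree of work
    return 0;
}
```

Before Cuda 5, only the host could launch kernels, so adaptive algorithms had to return control to the CPU at every refinement step; here the decision to launch more work stays on the GPU.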
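The GPU call-back pattern mentioned in the second bullet can also be sketched directly. The names below (`transform_fn`, `apply`, `square`) are illustrative, not part of any Nvidia library API: a “library” kernel accepts a device function pointer, so a third party can customise its behaviour without recompiling the library itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Signature for a user-supplied device call-back (illustrative name).
typedef float (*transform_fn)(float);

// The user's call-back, compiled as device code.
__device__ float square(float x) { return x * x; }

// Host code cannot take the address of a __device__ function, so the
// pointer is published through a __device__ symbol and copied back.
__device__ transform_fn d_square_ptr = square;

// The "library" kernel: applies whatever call-back it is handed.
__global__ void apply(const float *in, float *out, int n, transform_fn fn)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = fn(in[i]);
}

int main()
{
    const int n = 8;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in, *d_out;
    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    // Fetch the device-side function pointer into host-visible storage.
    transform_fn h_fn;
    cudaMemcpyFromSymbol(&h_fn, d_square_ptr, sizeof(h_fn));

    apply<<<1, n>>>(d_in, d_out, n, h_fn);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) printf("%g ", h_out[i]);
    printf("\n");

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The related “object linking” capability works at build time: compiling with `nvcc -rdc=true` (or `-dc`) produces device object files that can be linked together, so a library's device code and a user's call-backs can live in separate compilation units.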

To help developers maximise the potential of parallel computing with Cuda technology, Nvidia has launched a free online resource centre for Cuda programmers. The site offers access to all Cuda developer documentation and technologies, including tools, code samples, libraries, APIs, and tuning and programming guides.

