Nvidia has announced CUDA 6, the latest version of its parallel computing platform and programming model, designed to let developers accelerate applications on GPUs, in part by replacing existing CPU-based libraries with GPU-accelerated equivalents.
Unified Memory is a new feature introduced with CUDA 6 that aims to simplify programming by letting applications access CPU and GPU memory through a single pointer, without explicitly copying data from one to the other. Nvidia claims this makes it easier to add GPU acceleration in a range of programming languages.
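In practice, Unified Memory is exposed through a managed allocation call so that one allocation is visible to both host and device. The following is a minimal sketch, assuming the `cudaMallocManaged` API that CUDA 6 introduced for this purpose; kernel and variable names are illustrative:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: scales each element of the array in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1024;
    float *data;
    // One allocation, usable from both CPU and GPU code:
    // no explicit cudaMemcpy in either direction.
    cudaMallocManaged(&data, n * sizeof(float));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;   // written by the CPU
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f); // read/written by the GPU
    cudaDeviceSynchronize();  // wait for the GPU before the CPU reads back
    printf("data[0] = %f\n", data[0]);
    cudaFree(data);
    return 0;
}
```

Previously the same program would have needed separate host and device allocations plus explicit `cudaMemcpy` calls in both directions around the kernel launch.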
Drop-in libraries have also been added, which accelerate applications' BLAS and FFTW calculations by up to 8x simply by replacing the existing CPU libraries with the GPU-accelerated equivalents.
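The point of the drop-in model is that application source code calling the standard BLAS interface stays untouched; only the library it links against changes. A sketch, assuming the conventional Fortran-style `dgemm_` entry point (the exact link or preload step depends on how the drop-in library is packaged on your system):

```c
#include <stdio.h>

/* Standard Fortran-style BLAS entry point; the source is identical
 * whether a CPU BLAS or the GPU drop-in resolves this symbol. */
extern void dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *a, const int *lda,
                   const double *b, const int *ldb,
                   const double *beta, double *c, const int *ldc);

int main(void) {
    int n = 2;
    double a[4] = {1, 0, 0, 1};   /* 2x2 identity, column-major */
    double b[4] = {1, 2, 3, 4};
    double c[4] = {0, 0, 0, 0};
    double alpha = 1.0, beta = 0.0;
    /* C = alpha * A * B + beta * C */
    dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);
    printf("%g %g %g %g\n", c[0], c[1], c[2], c[3]);
    return 0;
}
```

Switching this program to the GPU is then a build-time change (linking the drop-in library instead of the CPU BLAS) rather than a source-code change.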
Multi-GPU scaling is also now available: redesigned BLAS and FFT GPU libraries automatically scale performance across up to eight GPUs in a single node, delivering more than nine teraflops of double-precision performance per node and supporting larger workloads than before (up to 512GB). Multi-GPU compatibility has also been added to the new BLAS drop-in library.
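For code that calls the GPU library directly, the multi-GPU BLAS is exposed through the cuBLAS-XT interface, where the application selects the devices and the library tiles the matrices across them. A sketch, assuming the `cublasXt*` API names from Nvidia's cuBLAS documentation of this era; matrix contents and device IDs are illustrative:

```cuda
#include <cstdlib>
#include <cublasXt.h>

int main() {
    const size_t n = 4096;
    // Plain host allocations: cuBLAS-XT stages the tiles to the GPUs itself.
    double *A = (double *)malloc(n * n * sizeof(double));
    double *B = (double *)malloc(n * n * sizeof(double));
    double *C = (double *)malloc(n * n * sizeof(double));
    /* ... fill A and B ... */

    cublasXtHandle_t handle;
    cublasXtCreate(&handle);

    // Ask the library to spread the GEMM across two GPUs;
    // the tiling and scheduling are handled internally.
    int devices[2] = {0, 1};
    cublasXtDeviceSelect(handle, 2, devices);

    const double alpha = 1.0, beta = 0.0;
    // C = alpha * A * B + beta * C, distributed over the selected GPUs.
    cublasXtDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                  n, n, n, &alpha, A, n, B, n, &beta, C, n);

    cublasXtDestroy(handle);
    free(A); free(B); free(C);
    return 0;
}
```

The same call scales to more devices by extending the list passed to `cublasXtDeviceSelect`, with no other changes to the math code.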
In addition to the new features, the CUDA 6 platform offers a full suite of programming tools, GPU-accelerated math libraries, documentation and programming guides.
Version 6 of the CUDA Toolkit is expected to be available in early 2014.