NVIDIA CUDA Toolkit 3.2 promises up to 300 percent performance boost

Posted on Thursday, November 18 2010 @ 12:58 CET by Thomas De Maesschalck

NVIDIA announced the CUDA Toolkit 3.2:

Today NVIDIA announced the availability of the CUDA Toolkit 3.2 production release, which provides significant performance increases, new math libraries and advanced cluster management features for developers creating next-generation GPU-accelerated applications.

The CUDA Toolkit includes all the tools, libraries and documentation developers need to build CUDA C/C++ applications, and is the foundation for many other GPU computing language solutions. New features and significant performance enhancements in version 3.2 include:

Up to 300-percent performance improvement in CUDA BLAS (CUBLAS) library routines, delivering 8 times faster performance than the latest Intel MKL (Math Kernel Library)
CUDA FFT (CUFFT) library optimizations delivering 2-20 times faster performance than the latest MKL
New CURAND library for random number generation that runs 10-20 times faster than the latest MKL
New CUSPARSE library of sparse matrix routines that delivers 6-30 times faster performance than the latest MKL
A host of additional improvements to GPU debugging and performance analysis tools

In addition, the new CUDA Toolkit 3.2 release includes H.264 encode/decode, new Tesla Compute Cluster (TCC) integration, cluster management features, and support for the new 6GB NVIDIA Tesla and Quadro GPU products.
